ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development ›› 2017, Vol. 54 ›› Issue (6): 1316-1325. doi: 10.7544/issn1000-1239.2017.20170095

Special topic: 2017 Frontier Technologies in Computer Architecture (I)

• System Architecture •



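  1. 1(School of Computer Science and Technology, Wuhan University of Technology, Wuhan 430070); 2(Hubei Key Laboratory of Transportation Internet of Things (Wuhan University of Technology), Wuhan 430070); 3(Department of Electrical and Computer Engineering, University of Florida, Gainesville, FL, USA 32611)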
  • Published: 2017-06-01

Real-Time Panoramic Video Stitching Based on GPU Acceleration Using Local ORB Feature Extraction

Du Chengyao1, Yuan Jingling1,2, Chen Mincheng1, Li Tao3   

  1. 1(School of Computer Science and Technology, Wuhan University of Technology, Wuhan 430070); 2(Hubei Key Laboratory of Transportation Internet of Things (Wuhan University of Technology), Wuhan 430070); 3(Department of Electrical and Computer Engineering, University of Florida, Gainesville, FL, USA 32611)
  • Online: 2017-06-01

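Abstract (translated from the Chinese original): A panoramic video records the full surrounding scene from a single viewpoint. With the development of virtual reality (VR) and live video streaming, panoramic video capture devices have attracted wide attention. Producing panoramic video, however, demands strong CPU and GPU processing power, and traditional panoramic products often rely on bulky equipment and post-processing, which leads to high power consumption, low stability, no real-time capability, and risks to information security. To address these problems, this paper first proposes the L-ORB feature extraction algorithm, which optimizes the feature detection regions of the segmented video images and simplifies the ORB algorithm's support for scale and rotation invariance. Feature points are then matched with the multi-probe locality-sensitive hashing (Multi-Probe LSH) algorithm, and mismatches are eliminated with an improved progressive sample consensus (PROSAC) algorithm, which yields the stitching mapping between frames; a multi-band blending algorithm then removes the seams between the videos. In addition, using the Nvidia Jetson TX1 heterogeneous embedded system, which integrates an ARM A57 CPU and a Maxwell GPU, and drawing on its teraflops-level floating-point performance and built-in video capture, storage, and wireless transmission modules, the authors implement a real-time panoramic stitching system for multi-camera video that accelerates the stitching algorithm with block-, thread-, and stream-level GPU parallelism. Experimental results show that the algorithm delivers good performance gains in every stage of image stitching, including feature extraction and feature matching: it runs at 11 times the speed of the traditional ORB algorithm and 639 times the speed of the traditional SIFT algorithm. The system performs 29 times better than a traditional embedded system, while its power consumption is as low as 10 W.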
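
One way to read the "local" part of L-ORB is that features are detected only in the parts of each frame that overlap a neighbouring camera's view. The sketch below illustrates that reading only; it is not the authors' implementation. The function names, the 20% overlap fraction, and the 1920x1080 frame size are all hypothetical values chosen for illustration.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height)


def overlap_strips(width: int, height: int, overlap: float) -> List[Rect]:
    """Return the left and right vertical strips of a frame.

    Assumption (not from the paper): each camera overlaps its left and
    right neighbours by the same fraction `overlap` of the frame width.
    """
    if not 0.0 < overlap <= 0.5:
        raise ValueError("overlap must be in (0, 0.5]")
    strip = int(round(width * overlap))
    return [(0, 0, strip, height), (width - strip, 0, strip, height)]


def detection_area_fraction(width: int, height: int, overlap: float) -> float:
    """Fraction of the frame's pixels that a detector would have to scan."""
    strips = overlap_strips(width, height, overlap)
    return sum(w * h for _, _, w, h in strips) / (width * height)


# Hypothetical 1080p frame with an assumed 20% overlap on each side.
strips = overlap_strips(1920, 1080, 0.2)
fraction = detection_area_fraction(1920, 1080, 0.2)
print(strips, fraction)
```

Under these assumed values the detector scans two 384-pixel-wide strips, which is 40% of the frame, so the saving in detection work scales directly with how narrow the overlap is.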

Keywords (translated from the Chinese original): panoramic video, image stitching, heterogeneous computing, embedded GPU, ORB

Abstract: Panoramic video records the full scene from a single point of view. With the development of VR and live video broadcasting, panoramic video capture devices are attracting widespread attention. However, producing panoramic video requires strong CPU and GPU processing capability. Traditional panoramic products depend on large equipment or post-processing, which results in high power consumption, low stability, poor real-time performance, and risks to information security. This paper proposes an L-ORB feature detection algorithm, which optimizes the feature detection regions of the video images and simplifies the ORB algorithm's support for scale and rotation invariance. The feature points are then matched with the multi-probe LSH algorithm, and progressive sample consensus (PROSAC) is used to eliminate false matches. Finally, we obtain the mapping relation for image stitching and use a multi-band fusion algorithm to eliminate the seams between the videos. In addition, we use the Nvidia Jetson TX1 heterogeneous embedded system, which integrates an ARM A57 CPU and a Maxwell GPU, leveraging its teraflops-level floating-point computing power and built-in video capture, storage, and wireless transmission modules to build a real-time panoramic stitching system for multi-camera video. The system uses block, thread, and stream parallelism on the GPU to accelerate the image stitching algorithm. The experimental results show that the proposed algorithm improves performance in every stage of image stitching, including feature extraction and feature matching: it runs at 11 times the speed of the traditional ORB algorithm and 639 times the speed of the traditional SIFT algorithm. The system achieves 59 times the performance of a traditional embedded system, while its power consumption is as low as 10 W.
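
The matching step named in the abstract can be illustrated with a toy example. The sketch below is a simplified stand-in, not the paper's implementation: it hashes 256-bit binary descriptors (the size of an ORB descriptor) by sampling a few bit positions, and "multi-probe" here means only that buckets whose key differs in one bit are also searched. The key size, distance threshold, single hash table, and random test data are all assumptions made for illustration.

```python
import random
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

BITS = 256      # an ORB descriptor is 256 bits (32 bytes)
KEY_BITS = 12   # assumed key size; not taken from the paper


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two descriptors stored as ints."""
    return bin(a ^ b).count("1")


def make_key(desc: int, positions: List[int]) -> int:
    """Bit-sampling hash: pack the bits of desc at the sampled positions."""
    key = 0
    for i, p in enumerate(positions):
        key |= ((desc >> p) & 1) << i
    return key


def build_table(descs: List[int], positions: List[int]) -> Dict[int, List[int]]:
    """Map each hash key to the indices of the descriptors that produce it."""
    table: Dict[int, List[int]] = defaultdict(list)
    for idx, d in enumerate(descs):
        table[make_key(d, positions)].append(idx)
    return table


def probe_keys(key: int, n_bits: int) -> List[int]:
    """The exact bucket, then every bucket whose key differs in one bit."""
    return [key] + [key ^ (1 << i) for i in range(n_bits)]


def match(query: int, descs: List[int], table: Dict[int, List[int]],
          positions: List[int], max_dist: int = 64) -> Optional[Tuple[int, int]]:
    """Return (index, distance) of the closest candidate found, or None."""
    best: Optional[Tuple[int, int]] = None
    for k in probe_keys(make_key(query, positions), len(positions)):
        for idx in table.get(k, []):
            d = hamming(query, descs[idx])
            if d <= max_dist and (best is None or d < best[1]):
                best = (idx, d)
    return best


# Synthetic data: 200 random descriptors and a noisy copy of number 17.
rng = random.Random(0)
train = [rng.getrandbits(BITS) for _ in range(200)]
positions = rng.sample(range(BITS), KEY_BITS)
table = build_table(train, positions)

# Flip one sampled bit (changes the hash key) and two unsampled bits.
unsampled = [b for b in range(BITS) if b not in positions][:2]
query = train[17] ^ (1 << positions[0]) ^ (1 << unsampled[0]) ^ (1 << unsampled[1])

exact_bucket = table.get(make_key(query, positions), [])
result = match(query, train, table, positions)
print(17 in exact_bucket, result)
```

Because one of the flipped bits is a sampled position, the query lands in a different bucket from descriptor 17, so a lookup in the exact bucket alone misses it; probing the one-bit neighbours recovers the match. In a full pipeline the surviving matches would then be filtered by a robust estimator such as PROSAC before the stitching transform is computed.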

Key words: panoramic video, image stitching, heterogeneous computing, embedded GPU, oriented FAST and rotated BRIEF (ORB)