ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development ›› 2017, Vol. 54 ›› Issue (1): 71-79. doi: 10.7544/issn1000-1239.2017.20150707


A Heterogeneous Multimodal Object Recognition Strategy of the Massive Network Data Flow (201905Retraction)

Wen Mengfei1,4, Liu Weirong1, Hu Chao2,3   

  1 (School of Information Science and Engineering, Central South University, Changsha 410083); 2 (Information and Network Center, Central South University, Changsha 410083); 3 (Key Laboratory of Medical Information Research (Central South University), College of Hunan Province, Changsha 410083); 4 (Hunan Provincial Research Institute of Education, Changsha 410005)
  • Online:2017-01-01

Abstract: Object recognition in massive network media data is an active research topic. This paper proposes an object recognition strategy for massive network media data flows that adopts a heterogeneous multimodal structure and exploits temporal coherence. First, because network media data contain both video and audio, a heterogeneous multimodal structure is constructed that combines a convolutional neural network (CNN) and a restricted Boltzmann machine (RBM): the RBM processes the audio information and the CNN processes the video information. This structure exploits the respective strengths of the two deep learning models. Next, a shared feature representation is generated using canonical correlation analysis (CCA). The temporal coherence of video frames is then used to further improve recognition accuracy. Three kinds of experiments are conducted to validate the proposed strategy. The first compares the proposed strategy with single-modality algorithms. The second reports results on a composite database. The third uses videos extracted from real websites to compare the proposed strategy with other algorithms. The experiments demonstrate the effectiveness of the proposed heterogeneous multimodal strategy.
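The fusion and temporal steps named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. Random arrays stand in for the CNN video features and RBM audio features. The helper names `cca_fit` and `smooth_over_time` are hypothetical. Sliding-window averaging of per-frame class probabilities is only an assumed stand-in for the temporal coherence step, which the abstract does not specify.

```python
import numpy as np


def cca_fit(X, Y, n_components, reg=1e-6):
    """Fit linear CCA on paired views X (n, p) and Y (n, q).

    Returns the view means, the two projection matrices, and the
    canonical correlations of the retained components.
    """
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my
    n = X.shape[0]
    # Regularized covariance estimates and the cross-covariance.
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)

    def inv_sqrt(S):
        # Inverse square root of a symmetric positive-definite matrix.
        w, V = np.linalg.eigh(S)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    Kx, Ky = inv_sqrt(Sxx), inv_sqrt(Syy)
    # Singular values of the whitened cross-covariance are the
    # canonical correlations.
    U, s, Vt = np.linalg.svd(Kx @ Sxy @ Ky)
    Wx = Kx @ U[:, :n_components]
    Wy = Ky @ Vt.T[:, :n_components]
    return mx, my, Wx, Wy, s[:n_components]


def smooth_over_time(frame_probs, window=5):
    """Average per-frame class probabilities over a centered window."""
    T = frame_probs.shape[0]
    half = window // 2
    out = np.empty_like(frame_probs)
    for t in range(T):
        lo, hi = max(0, t - half), min(T, t + half + 1)
        out[t] = frame_probs[lo:hi].mean(axis=0)
    return out


# Synthetic stand-ins (assumption): both modalities share a 2-D latent signal.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 2))
video_feats = latent @ rng.normal(size=(2, 8)) + 0.1 * rng.normal(size=(500, 8))
audio_feats = latent @ rng.normal(size=(2, 6)) + 0.1 * rng.normal(size=(500, 6))

mx, my, Wx, Wy, corrs = cca_fit(video_feats, audio_feats, n_components=2)

# Shared representation: concatenate the two projected views.
shared = np.hstack([(video_feats - mx) @ Wx, (audio_feats - my) @ Wy])

# Hypothetical per-frame classifier outputs for 20 frames and 3 classes.
frame_probs = rng.dirichlet(np.ones(3), size=20)
smoothed = smooth_over_time(frame_probs, window=5)
```

Concatenating the projected views is one common way to form a joint representation after CCA; averaging the two projections is another. The window length trades responsiveness to scene changes against robustness to single-frame errors.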

Key words: object recognition, deep learning, convolutional neural network (CNN), restricted Boltzmann machine (RBM), canonical correlation analysis (CCA)
