Advanced Search
    Cheng Xiaoyang, Zhan Yongzhao, Mao Qirong, Zhan Zhicai. Video Semantic Analysis Based on Topographic Sparse Pre-Training CNN[J]. Journal of Computer Research and Development, 2018, 55(12): 2703-2714. DOI: 10.7544/issn1000-1239.2018.20170579
    Citation: Cheng Xiaoyang, Zhan Yongzhao, Mao Qirong, Zhan Zhicai. Video Semantic Analysis Based on Topographic Sparse Pre-Training CNN[J]. Journal of Computer Research and Development, 2018, 55(12): 2703-2714. DOI: 10.7544/issn1000-1239.2018.20170579

    Video Semantic Analysis Based on Topographic Sparse Pre-Training CNN

    • Video feature learning by deep neural network has become a hot research topic in video semantic analysis such as video object detection, motion recognition and video event detection. The topographic information of the video image plays an important role in describing the relationship between image and content. At the same time, it is helpful to improve the discriminability of the video feature expression by considering the characteristics of the video sequence with optimization. In this paper, an approach based on pre-training convolutional neural network with new topographic sparse encoder is proposed for video feature learning. This method has two stages: semi-supervised video image feature learning and supervised video sequence features optimization learning. In the semi-supervised video image feature learning stage, a new topographic sparse encoder is presented and used to pre-train neural networks, so that the characteristic expression of the video image can reflect the topographic information of the image, and a logistic regression is used to fine-tune the networks parameters using video concept label for video image feature learning. In the supervised video sequence feature optimization learning stage, a fully connected layer for feature learning of video sequence is constructed in order to express the feature of video sequence reasonably. A logistic regression constraint is established to adjust the network parameters in order that the discriminative feature of video sequence can be obtained. The experiments for relative methods are carried out on typical video datasets. The results show that the proposed method has better discriminability for the expression of video features, and can improve the accuracy of video semantic concept detection effectively.
    • loading

    Catalog

      Turn off MathJax
      Article Contents

      /

      DownLoad:  Full-Size Img  PowerPoint
      Return
      Return