Advanced Search
    Zhang Ruimao, Peng Jiefeng, Wu Yang, Lin Liang. The Semantic Knowledge Embedded Deep Representation Learning and Its Applications on Visual Understanding[J]. Journal of Computer Research and Development, 2017, 54(6): 1251-1266. DOI: 10.7544/issn1000-1239.2017.20171064
    Citation: Zhang Ruimao, Peng Jiefeng, Wu Yang, Lin Liang. The Semantic Knowledge Embedded Deep Representation Learning and Its Applications on Visual Understanding[J]. Journal of Computer Research and Development, 2017, 54(6): 1251-1266. DOI: 10.7544/issn1000-1239.2017.20171064

    The Semantic Knowledge Embedded Deep Representation Learning and Its Applications on Visual Understanding

    • With the rapid development of deep learning technique and large scale visual datasets, the traditional computer vision tasks have achieved unprecedented improvement. In order to handle more and more complex vision tasks, how to integrate the domain knowledge into the deep neural network and enhance the ability of deep model to represent the visual pattern, has become a widely discussed topic in both academia and industry. This thesis engages in exploring effective deep models to combine the semantic knowledge and feature learning. The main contributions can be summarized as follows: 1)We integrate the semantic similarity of visual data into the deep feature learning process, and propose a deep similarity comparison model named bit-scalable deep hashing to address the issue of visual similarity comparison. The model in this thesis has achieved great performance on image searching and people’s identification. 2)We also propose a high-order graph LSTM (HG-LSTM) networks to solve the problem of geometric attribute analysis, which realizes the process of integrating the multi semantic context into the feature learning process. Our extensive experiments show that our model is capable of predicting rich scene geometric attributes and outperforming several state-of-the-art methods by large margins. 3)We integrate the structured semantic information of visual data into the feature learning process, and propose a novel deep architecture to investigate a fundamental problem of scene understanding: how to parse a scene image into a structured configuration. Extensive experiments show that our model is capable of producing meaningful and structured scene configurations, and achieving more favorable scene labeling result on two challenging datasets compared with other state-of-the-art weakly-supervised deep learning methods.
    • loading

    Catalog

      Turn off MathJax
      Article Contents

      /

      DownLoad:  Full-Size Img  PowerPoint
      Return
      Return