ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development ›› 2018, Vol. 55 ›› Issue (12): 2775-2784. doi: 10.7544/issn1000-1239.2018.20170581

• Graphics and Image •

Middle or Small Object Retrieval Based on Fully Convolutional Networks (Retracted September 2019)

Peng Tianqiang1, Sun Xiaofeng2, Li Fang3

  1(School of Computer Science, Henan Institute of Engineering, Zhengzhou 451191); 2(School of International Education, Henan Institute of Engineering, Zhengzhou 451191); 3(Zhengzhou Jinhui Computer System Engineering Co. Ltd, Zhengzhou 450001) (ptq_drumboy@163.com)
  • Online: 2018-12-01
  • Supported by: National Natural Science Foundation of China (61301232)

Abstract: Image representations derived from pre-trained convolutional neural networks (CNNs) have become the new state of the art in image retrieval. However, these methods produce global representations of the whole image and cannot be applied when the query object occupies only part of the retrieved image. To address this problem, we propose a retrieval method for middle or small objects, based on pre-trained fully convolutional networks (FCNs), for query objects that occupy only part of the retrieved image. First, exploiting the fact that an FCN places no restriction on the size of its input, each retrieved image is passed through the FCN to obtain a feature matrix representation. Second, the query object image is passed through the same FCN to obtain its feature representation. Finally, the query feature is matched against every feature in the feature matrix of the retrieved image, yielding a similarity score and the optimal matching location. We further introduce multi-scale and multi-ratio transformations to handle query objects of different sizes in the retrieved image. Experimental results on the benchmark Oxford5K dataset show that the proposed method outperforms other state-of-the-art methods. We further demonstrate its scalability and efficacy on a Logo dataset collected from the Internet.
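The matching pipeline sketched in the abstract can be illustrated with a short example. This is only an illustrative reconstruction under stated assumptions, not the authors' implementation: it assumes PyTorch, uses the convolutional layers of an ImageNet pre-trained VGG-16 as the fully convolutional network, average-pools the query feature map into a single descriptor, and scores each spatial cell of the retrieved image's feature matrix by cosine similarity; the paper does not specify these particular choices.

```python
import torch
import torch.nn.functional as F
from torchvision import models

# Illustrative FCN backbone: the convolutional part of an ImageNet pre-trained
# VGG-16 (an assumption; the paper does not name its exact network).
backbone = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).features.eval()

@torch.no_grad()
def feature_matrix(image):
    """Run a 1 x 3 x H x W image tensor (any H, W) through the FCN and
    return its C x h x w feature matrix."""
    return backbone(image).squeeze(0)

@torch.no_grad()
def query_descriptor(query_image):
    """Pool the query's feature matrix into one L2-normalized C-dim vector
    (global average pooling is an assumed choice)."""
    fmap = feature_matrix(query_image)                 # C x h x w
    return F.normalize(fmap.mean(dim=(1, 2)), dim=0)   # C

@torch.no_grad()
def match(query_vec, retrieved_image):
    """Compare the query descriptor with every spatial position of the
    retrieved image's feature matrix; return the best similarity and its
    (row, col) location in the feature matrix."""
    fmap = feature_matrix(retrieved_image)             # C x h x w
    c, h, w = fmap.shape
    cells = F.normalize(fmap.reshape(c, -1), dim=0)    # C x (h*w), unit columns
    sims = query_vec @ cells                           # cosine similarity per cell
    best = int(torch.argmax(sims))
    return sims[best].item(), (best // w, best % w)

# Usage with random tensors standing in for a query crop and a database image:
query = torch.rand(1, 3, 128, 128)
image = torch.rand(1, 3, 480, 640)
score, loc = match(query_descriptor(query), image)
print(f"best similarity {score:.3f} at feature-matrix cell {loc}")
```

The multi-scale, multi-ratio search mentioned in the abstract would then amount to repeating the same matching over resized and re-aspected copies of the query and keeping the best score and location.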

Key words: fully convolutional networks (FCN), object retrieval, feature matrix, object location, multi-ratio transformation
