Xiao Jinsheng, Zhao Tao, Zhou Jian, Le Qiuping, Yang Liheng. Small Target Detection Network Based on Context Augmentation and Feature Refinement[J]. Journal of Computer Research and Development, 2023, 60(2): 465-474. DOI: 10.7544/issn1000-1239.202110956

Small Target Detection Network Based on Context Augmentation and Feature Refinement

Funds: This work was supported by the National Natural Science Foundation of China for Young Scientists (42101448) and the Open Project Program Foundation of the CAS Key Laboratory of Opto-Electronics Information Processing (OEIP-O-202009).
  • Received Date: September 22, 2021
  • Revised Date: April 18, 2022
  • Available Online: February 10, 2023
  • Abstract: Small objects carry few and often blurred features, which makes them a hard problem in object detection. Poor small-object detection performance is mainly caused by limitations of the network and by imbalance in the training dataset. This paper proposes a novel feature pyramid composite structure built from a context augmentation module (CAM) and a feature refinement module (FRM). Multi-scale dilated convolutions generate features with different receptive fields; these features are fused and added to the detection network to supplement context information. A channel and spatial feature refinement mechanism suppresses the conflicting information produced by multi-scale feature fusion, so that small objects are not submerged in it. In addition, a data augmentation method called copy-reduce-paste increases the proportion of small targets, so that they contribute more to the loss during training and training becomes more balanced. Experimental results show that the proposed network achieves a mean average precision (mAP) of 83.6% on the VOC dataset (IoU = 0.5). The AP for small targets is 16.9% (IoU from 0.5 to 0.95), which is 3.9%, 7.7%, and 5.3% higher than that of YOLOV4, CenterNet, and RefineDet, respectively. On the TinyPerson dataset, the AP for small targets is 55.1%, which is 0.8% and 3.5% higher than that of YOLOV5 and DSFD, respectively.
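
The copy-reduce-paste idea named in the abstract can be illustrated with a minimal sketch. The abstract does not give the authors' scale factor, number of pasted copies, or placement rule, so everything below is an assumption for illustration: the function names (`reduce_patch`, `copy_reduce_paste`), the nearest-neighbour shrinking, the 0.5 scale, and the rule of retrying random positions until the copy does not overlap an existing box are all hypothetical. Images are plain nested lists so the example runs without third-party libraries.

```python
import random


def reduce_patch(image, box, scale):
    """Crop `box` from `image` and shrink it by `scale` (nearest-neighbour sampling)."""
    x0, y0, x1, y1 = box  # pixel coordinates, upper bounds exclusive
    src_w, src_h = x1 - x0, y1 - y0
    dst_w = max(1, int(src_w * scale))
    dst_h = max(1, int(src_h * scale))
    return [
        [image[y0 + r * src_h // dst_h][x0 + c * src_w // dst_w] for c in range(dst_w)]
        for r in range(dst_h)
    ]


def overlaps(a, b):
    """Return True if two (x0, y0, x1, y1) boxes intersect."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def copy_reduce_paste(image, boxes, scale=0.5, tries=20, seed=0):
    """Paste one shrunken copy of every annotated object at a random free location.

    Assumes 0 < scale <= 1 and that every box lies inside the image.
    Returns a new image and the extended box list; the inputs are not modified.
    """
    rng = random.Random(seed)
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    out_boxes = list(boxes)
    for box in boxes:
        patch = reduce_patch(image, box, scale)
        ph, pw = len(patch), len(patch[0])
        for _ in range(tries):
            px = rng.randint(0, width - pw)
            py = rng.randint(0, height - ph)
            target = (px, py, px + pw, py + ph)
            # Hypothetical placement rule: skip positions that cover an existing object.
            if any(overlaps(target, other) for other in out_boxes):
                continue
            for r in range(ph):
                out[py + r][px:px + pw] = patch[r]
            out_boxes.append(target)  # the pasted copy becomes a new small-object label
            break
    return out, out_boxes


# Usage: a 16x16 blank image with one 4x4 object.
image = [[0] * 16 for _ in range(16)]
for y in range(2, 6):
    for x in range(2, 6):
        image[y][x] = 9
augmented, new_boxes = copy_reduce_paste(image, [(2, 2, 6, 6)])
```

Each pasted copy adds a smaller instance and its label to the training sample, which is how such an augmentation would raise the share of small targets seen by the loss.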

