Abstract:
Small objects carry few, blurry features, which makes them a hard problem in object detection. The poor performance of small object detection is mainly caused by the limitations of the network and the imbalance of the training dataset. We propose a novel composite feature pyramid structure built from a context augmentation module (CAM) and a feature refinement module (FRM). The CAM fuses the outputs of multi-scale dilated convolutions to generate features with different receptive fields, and these features are added to the detection network to supplement context information. The FRM introduces a channel and spatial feature refinement mechanism that suppresses the conflicting information generated by multi-scale feature fusion and prevents small objects from being submerged in it. In addition, a data augmentation method called copy-reduce-paste is proposed to increase the proportion of small targets, so that small targets contribute more to the loss during training and the training is more balanced. Experimental results show that the proposed network reaches a mean average precision (mAP) of 83.6% on the VOC dataset (IoU = 0.5). The AP for small targets is 16.9% (averaged over IoU thresholds from 0.5 to 0.95), which is 3.9%, 7.7% and 5.3% higher than that of YOLOv4, CenterNet and RefineDet, respectively. On the TinyPerson dataset, the AP for small targets is 55.1%, which is 0.8% and 3.5% higher than that of YOLOv5 and DSFD, respectively.
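The copy-reduce-paste idea can be illustrated with a minimal sketch: an object is cropped from its bounding box, shrunk, and pasted at another location, which adds one more small target to the image. The abstract does not specify the details, so everything in the sketch is an assumption made for illustration: the function names `shrink_nearest` and `copy_reduce_paste`, the nearest-neighbour downscaling, the default scale of 0.5, and the uniformly random paste position. The image is represented as a plain 2D list so that the example is self-contained.

```python
import random


def shrink_nearest(patch, scale):
    """Downscale a 2D patch (list of rows) by nearest-neighbour sampling."""
    src_h, src_w = len(patch), len(patch[0])
    dst_h = max(1, int(src_h * scale))
    dst_w = max(1, int(src_w * scale))
    return [
        [patch[r * src_h // dst_h][c * src_w // dst_w] for c in range(dst_w)]
        for r in range(dst_h)
    ]


def copy_reduce_paste(image, box, scale=0.5, rng=None):
    """Copy the object in `box`, shrink it, and paste it at a random position.

    image: 2D list of pixel values; the input is not modified.
    box:   (x_min, y_min, x_max, y_max) with exclusive maxima.
    Returns (augmented_image, new_box), where new_box bounds the pasted copy.
    """
    rng = rng or random.Random()
    x0, y0, x1, y1 = box

    # Copy: crop the object region.
    patch = [row[x0:x1] for row in image[y0:y1]]

    # Reduce: shrink the crop (hypothetical choice: nearest neighbour).
    small = shrink_nearest(patch, scale)
    h, w = len(small), len(small[0])

    # Paste: pick a top-left corner that keeps the copy inside the image.
    img_h, img_w = len(image), len(image[0])
    px = rng.randint(0, img_w - w)
    py = rng.randint(0, img_h - h)

    out = [row[:] for row in image]
    for r in range(h):
        out[py + r][px:px + w] = small[r]
    return out, (px, py, px + w, py + h)
```

The returned box would be appended to the image's annotation list so that the pasted copy also contributes to the loss. This sketch does not check whether the pasted copy overlaps existing objects; a practical implementation would likely need such a check.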