Butterflies are insects that are highly sensitive to their habitat. The distribution of butterfly species in a natural environment reflects the balance of the regional ecosystem and the biodiversity of the region. Identifying butterfly species manually is heavy, time-consuming work for experts, and computer vision technology makes it possible to identify them automatically. This paper focuses on identifying butterfly species from images taken in the natural environment. This is a very challenging task because the butterfly wings in such images are typically folded, so the features that distinguish the species cannot be seen. This paper therefore proposes two new attention mechanisms, referred to as DSEA (direct squeeze-and-excitation with global average pooling) and DSEM (direct squeeze-and-excitation with global max pooling), to improve the classical object detection algorithm RetinaNet. Deformable convolution is introduced at the same time to strengthen the ability of RetinaNet to model butterfly deformation in images from the natural environment, so that butterfly species can be identified automatically from the features of such images. The widely used object detection criterion mAP (mean average precision) is adopted to evaluate the proposed model, and visualization is used to investigate the primary factors influencing the performance of the predictive model. Extensive experiments demonstrate that the improved RetinaNet is effective at identifying butterfly species from images taken in the natural environment, especially when it is embedded with the DSEM module. Balanced data improve the generalization of the predictive model, and the structural dissimilarity of samples is a key factor affecting its performance.
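To make the difference between the two pooling choices concrete, the following is a minimal sketch of a generic squeeze-and-excitation channel-attention step in which the squeeze can be either global average pooling or global max pooling. It is an illustration only: the "direct" variants DSEA and DSEM are not specified in this abstract, so the two-layer bottleneck, the function name `squeeze_excite`, and the weight shapes below are assumptions borrowed from the standard squeeze-and-excitation design, not the modules proposed in this paper.

```python
import numpy as np


def squeeze_excite(x, w1, w2, pool="avg"):
    """Generic squeeze-and-excitation on one feature map (illustrative only).

    x  : array of shape (C, H, W), a single feature map
    w1 : array of shape (C // r, C), bottleneck weights (assumed layout)
    w2 : array of shape (C, C // r), expansion weights (assumed layout)
    pool : "avg" for global average pooling, "max" for global max pooling
    Returns the re-weighted feature map and the per-channel gates.
    """
    # Squeeze: collapse each channel's H x W grid to one scalar.
    if pool == "avg":
        z = x.mean(axis=(1, 2))
    elif pool == "max":
        z = x.max(axis=(1, 2))
    else:
        raise ValueError("pool must be 'avg' or 'max'")

    # Excitation: bottleneck with ReLU, then sigmoid gates in (0, 1).
    h = np.maximum(w1 @ z, 0.0)
    s = 1.0 / (1.0 + np.exp(-(w2 @ h)))

    # Scale: multiply every channel by its gate.
    return x * s[:, None, None], s


# Example call with random data (shapes are hypothetical).
rng = np.random.default_rng(0)
features = rng.normal(size=(8, 5, 5))
w_down = rng.normal(size=(2, 8))
w_up = rng.normal(size=(8, 2))
out_avg, gates_avg = squeeze_excite(features, w_down, w_up, pool="avg")
out_max, gates_max = squeeze_excite(features, w_down, w_up, pool="max")
```

In this sketch, average pooling summarizes the overall activation of a channel, whereas max pooling responds to its single strongest location, which may matter when the discriminative wing pattern occupies only a small part of the image.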