Abstract:
Existing butterfly image data sets cover only a limited number of species, and their images are almost always standard specimen patterns rather than photographs of butterflies in their living environments. To overcome these limitations, we build a butterfly image data set covering all species in the Monograph of Chinese butterflies, with 4270 standard pattern images of 1176 butterfly species and 1425 living-environment images of 111 species. We use the deep learning technique Faster R-CNN to develop an automatic butterfly identification system that performs both butterfly position detection in living-environment images and species recognition. We remove the species that have only one living-environment image from the data set, then split the remaining living-environment images into two halves. One half is used as the testing subset; the other is combined either with all standard pattern images or with only the standard pattern images of the species that also appear in the living-environment images, giving two different training subsets. To construct the training subsets for Faster R-CNN, nine methods are used to augment the training images, including vertical and horizontal flipping, rotation by different angles, adding noise, blurring, and contrast adjustment. Prediction models based on three network structures are trained. The mAP (mean average precision) criterion is used to evaluate the predictive models. The experimental results demonstrate that our Faster R-CNN based automatic butterfly identification system performs well: even its lowest mAP reaches 60%, and it can simultaneously detect the positions of multiple butterflies in a single living-environment image and recognize their species.
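
The data preparation described in the abstract (dropping species with a single living-environment image, then halving the rest) can be illustrated with a minimal Python sketch. It is not the authors' code. The function name `split_living_images`, the `(image_id, species)` input format, the per-species split, the fixed seed, and the choice to send the extra image of an odd-sized species to the training half are all assumptions made for illustration; the abstract states only that the partition is half and half.

```python
import random
from collections import defaultdict


def split_living_images(samples, seed=0):
    """Split living-environment images into training and testing halves.

    samples: list of (image_id, species) pairs (hypothetical format).
    Returns (train, test), each a list of (image_id, species) pairs.
    """
    # Group image ids by species.
    by_species = defaultdict(list)
    for image_id, species in samples:
        by_species[species].append(image_id)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for species, ids in sorted(by_species.items()):
        if len(ids) < 2:
            # Species with only one living-environment image are removed.
            continue
        ids = sorted(ids)
        rng.shuffle(ids)
        half = len(ids) // 2
        # Assumption: for an odd count, the extra image goes to training.
        test.extend((i, species) for i in ids[:half])
        train.extend((i, species) for i in ids[half:])
    return train, test


if __name__ == "__main__":
    demo = [("a1", "A"), ("a2", "A"), ("a3", "A"),
            ("b1", "B"),
            ("c1", "C"), ("c2", "C")]
    train, test = split_living_images(demo)
    print(len(train), len(test))
```

In this sketch the training half would then be merged with the standard pattern images (all of them, or only those of species present in the living-environment images) before augmentation; that merging step is omitted here.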