ISSN 1000-1239 CN 11-1777/TP

• Artificial Intelligence •

### Research on Automatic Identification of Butterfly Species

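1. 1(School of Computer Science, Shaanxi Normal University, Xi’an 710119);2(Department of Computer Science & Technology, Nanjing University, Nanjing 210023);3(School of Computer Science & Technology, Shandong University of Finance and Economics, Jinan 250014);4(School of Computer & Information Technology, Beijing Jiaotong University, Beijing 100044);5(Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190);6(School of Computer Science, Fudan University, Shanghai 200433);7(College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016);8(College of Life Sciences, Shaanxi Normal University, Xi’an 710119) (xiejuany@snnu.edu.cn)
• Published: 2018-08-01
• Funding:
This work was supported by the National Natural Science Foundation of China (61673251) and the Fundamental Research Funds for the Central Universities (GK201701006).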

### The Automatic Identification of Butterfly Species

Xie Juanying1, Hou Qi1, Shi Yinghuan2, Lü Peng3, Jing Liping4, Zhuang Fuzhen5, Zhang Junping6, Tan Xiaoyang7, Xu Shengquan8

1. 1(School of Computer Science, Shaanxi Normal University, Xi’an 710119);2(Department of Computer Science & Technology, Nanjing University, Nanjing 210023);3(School of Computer Science & Technology, Shandong University of Finance and Economics, Jinan 250014);4(School of Computer & Information Technology, Beijing Jiaotong University, Beijing 100044);5(Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190);6(School of Computer Science, Fudan University, Shanghai 200433);7(College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016);8(College of Life Sciences, Shaanxi Normal University, Xi’an 710119)
• Online: 2018-08-01

Abstract: Existing butterfly image data sets cover only a limited number of species, and their images are almost always standard patterns, with no images of butterflies in their living environments. To overcome these limitations, we build a butterfly image data set covering all butterfly species in the Monograph of Chinese Butterflies, with 4270 standard pattern images of 1176 species and 1425 living-environment images of 111 species. We use the deep learning technique Faster R-CNN to develop an automatic butterfly identification system that both detects butterfly positions in living-environment images and recognizes their species. We remove from the data set those species with only one living-environment image, then split the remaining living-environment images into two halves. One half serves as the testing subset. The other half is combined either with all standard pattern images or with only the standard pattern images of the species that also appear in the living-environment images, yielding two different training subsets. To construct the training subsets for Faster R-CNN, nine methods are used to augment the training images, including vertical and horizontal flipping, rotation by different angles, adding noise, blurring, and contrast adjustment. Prediction models based on three network structures are trained, and the mAP (mean average precision) criterion is used to evaluate them. The experimental results demonstrate that our Faster R-CNN based system performs well: its worst mAP still reaches 60%, and it can simultaneously detect the positions of multiple butterflies in a single living-environment image and recognize their species.
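The data partition described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the `(image_id, species)` tuple layout, the fixed random seed, and the choice to halve each species separately (with the odd image going to training) are all assumptions made for illustration, since the abstract only states that single-image species are removed and the remainder is split half and half.

```python
import random
from collections import defaultdict


def split_field_images(field_images, seed=0):
    """Split living-environment images into a training half and a testing half.

    field_images: list of (image_id, species) pairs (hypothetical layout).
    Species with only one living-environment image are dropped.
    """
    by_species = defaultdict(list)
    for image_id, species in field_images:
        by_species[species].append(image_id)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train_half, test_half = [], []
    for species in sorted(by_species):
        ids = sorted(by_species[species])
        if len(ids) < 2:
            continue  # species with a single field image is removed
        rng.shuffle(ids)
        cut = len(ids) // 2  # assumption: any odd image goes to training
        test_half += [(i, species) for i in ids[:cut]]
        train_half += [(i, species) for i in ids[cut:]]
    return train_half, test_half


def build_training_sets(train_half, standard_images):
    """Build the two training subsets named in the abstract.

    train_all:     field training half + all standard pattern images.
    train_matched: field training half + standard pattern images of the
                   species that also have living-environment images.
    """
    field_species = {species for _, species in train_half}
    train_all = train_half + list(standard_images)
    train_matched = train_half + [
        (i, s) for i, s in standard_images if s in field_species
    ]
    return train_all, train_matched


# Toy example with made-up identifiers.
field = [("f1", "A"), ("f2", "A"), ("f3", "A"),
         ("f4", "B"), ("f5", "C"), ("f6", "C")]
standard = [("s1", "A"), ("s2", "B"), ("s3", "C"), ("s4", "D")]

train_half, test_half = split_field_images(field)
train_all, train_matched = build_training_sets(train_half, standard)
```

In the toy data, species B has a single living-environment image and therefore appears in neither half, while its standard pattern image is still part of `train_all` but not of `train_matched`. Augmentation of the training images would be applied after this step.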