ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development ›› 2018, Vol. 55 ›› Issue (3): 551-562. doi: 10.7544/issn1000-1239.2018.20170715

Topic collection: 2018 Special Topic on Edge Computing

• Network Technology •

Convolutional Neural Network Construction Method for Embedded FPGAs Oriented Edge Computing

Lu Ye1, Chen Yao2,4, Li Tao1,3, Cai Ruichu2, Gong Xiaoli1,3

  1(College of Computer and Control Engineering, Nankai University, Tianjin 300350); 2(School of Computers, Guangdong University of Technology, Guangzhou 510006); 3(State Key Laboratory of Computer Architecture (Institute of Computing Technology, Chinese Academy of Sciences), Beijing 100190); 4(Advanced Digital Sciences Center, Singapore 138632) (luye@nankai.edu.cn)
  • Published online: 2018-03-01
  • Supported by:
    National Natural Science Foundation of China (61702286); Natural Science Foundation of Tianjin (14JCQNJC00700, 16ICYIC15200); Open Project of the State Key Laboratory of Computer Architecture (CARCH201504, CARCH201604); Tianjin Major Project on Big Data and Cloud Computing (15ZXDSGX00020); Open Project of the Fujian Provincial Key Laboratory of Information Processing and Intelligent Control (MJUKF201733); Tianjin Outstanding Enterprise Science and Technology Commissioner Project (17JCTPJC49500)

Abstract: Applications and services with high computational demands are gradually migrating from centralized cloud computing centers to embedded environments at the network edge. Owing to their flexibility and high energy efficiency, FPGAs are widely used in embedded systems for edge computing. Conventional methods for constructing FPGA-based convolutional neural networks suffer from long design cycles and limited room for optimization, so they cannot explore the design space of the hardware accelerator effectively; the problem is especially pronounced in embedded environments at the network edge. To address this, a general method based on high-level synthesis (HLS) is proposed for constructing convolutional neural networks on embedded FPGA platforms for edge computing. An accelerator core that can be reused across the layers of the network is designed, so that a performance-optimized convolutional neural network is implemented in hardware with a small amount of hardware resources. The HLS design is optimized through scalable design, buffer (memory) optimization, and dataflow optimization. Convolutional neural networks are built on embedded FPGA platforms with this method, and the experimental results show that the optimized network offers certain advantages in power consumption and performance over a Xeon E5-1620 CPU and a GTX-series GPU (GTX Titan in the Chinese abstract, GTX K80 in the English abstract), making it suitable for edge computing environments.

Key words: edge computing, convolutional neural network, FPGA, high level synthesis, accelerator core
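
As a rough illustration of the "reusable accelerator core" idea described in the abstract, the following C++ sketch shows a single convolution routine whose layer shape is passed at run time, so the same function can serve several layers of a network. The abstract does not give the authors' interface, so every name here (`ConvShape`, `out_dim`, `conv_core`), the data layout, and the fused ReLU are assumptions made for illustration only. HLS directives are mentioned in comments rather than written as pragmas, since their placement depends on the tool and target device.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical layer descriptor (not taken from the paper).
struct ConvShape {
    int in_ch;   // number of input feature maps
    int out_ch;  // number of output feature maps
    int in_h;    // input height
    int in_w;    // input width
    int k;       // square kernel size
    int stride;  // stride in both dimensions
};

// Output size of a convolution without padding.
inline int out_dim(int in, int k, int stride) { return (in - k) / stride + 1; }

// One convolution routine shared by all layers; only the shape changes.
// Layouts (assumed): in[ic][y][x], w[oc][ic][ky][kx], out[oc][y][x].
void conv_core(const std::vector<float>& in, const std::vector<float>& w,
               const std::vector<float>& bias, std::vector<float>& out,
               const ConvShape& s) {
    const int oh = out_dim(s.in_h, s.k, s.stride);
    const int ow = out_dim(s.in_w, s.k, s.stride);
    assert(in.size() == static_cast<std::size_t>(s.in_ch) * s.in_h * s.in_w);
    assert(w.size() == static_cast<std::size_t>(s.out_ch) * s.in_ch * s.k * s.k);
    assert(bias.size() == static_cast<std::size_t>(s.out_ch));
    out.assign(static_cast<std::size_t>(s.out_ch) * oh * ow, 0.0f);

    for (int oc = 0; oc < s.out_ch; ++oc) {
        for (int y = 0; y < oh; ++y) {
            for (int x = 0; x < ow; ++x) {
                float acc = bias[oc];
                // In an HLS flow, these inner loops are the usual candidates
                // for pipelining or unrolling, with inputs and weights staged
                // in on-chip buffers.
                for (int ic = 0; ic < s.in_ch; ++ic) {
                    for (int ky = 0; ky < s.k; ++ky) {
                        for (int kx = 0; kx < s.k; ++kx) {
                            const int iy = y * s.stride + ky;
                            const int ix = x * s.stride + kx;
                            const std::size_t i_idx =
                                (static_cast<std::size_t>(ic) * s.in_h + iy) * s.in_w + ix;
                            const std::size_t w_idx =
                                ((static_cast<std::size_t>(oc) * s.in_ch + ic) * s.k + ky) * s.k + kx;
                            acc += in[i_idx] * w[w_idx];
                        }
                    }
                }
                // Fused ReLU activation (an assumption of this sketch).
                out[(static_cast<std::size_t>(oc) * oh + y) * ow + x] =
                    acc > 0.0f ? acc : 0.0f;
            }
        }
    }
}
```

Calling `conv_core` once per layer with a different `ConvShape`, feeding each output vector in as the next input, mimics how one hardware core might be time-shared across layers instead of instantiating a separate core for each layer.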

CLC Number: