ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development ›› 2017, Vol. 54 ›› Issue (6): 1367-1380. doi: 10.7544/issn1000-1239.2017.20170099

Special topic: 2017 Special Issue on Frontier Technologies in Computer Architecture (I)

• System Architecture •

Approximate Computation of Deep Convolutional Neural Networks Using a Memristor-Based PIM Architecture

Li Chuxi1, Fan Xiaoya1,2, Zhao Changhe1, Zhang Shengbing1,2, Wang Danghui1,2, An Jianfeng1,2, Zhang Meng1,2

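  1. 1(School of Computer Science, Northwestern Polytechnical University, Xi’an 710129); 2(Engineering and Research Center of Embedded Systems Integration (Northwestern Polytechnical University), Ministry of Education, Xi’an 710129) (lichuxi@mail.nwpu.edu.cn)
  • Published: 2017-06-01
  • Funding: 
    National Natural Science Foundation of China (61472322); Fundamental Research Funds for the Central Universities (3102015BJ(Ⅱ)ZS018)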

A Memristor-Based Processing-in-Memory Architecture for Deep Convolutional Neural Networks Approximate Computation

Li Chuxi1, Fan Xiaoya1,2, Zhao Changhe1, Zhang Shengbing1,2, Wang Danghui1,2, An Jianfeng1,2, Zhang Meng1,2   

  1. 1(School of Computer Science, Northwestern Polytechnical University, Xi’an 710129); 2(Engineering and Research Center of Embedded Systems Integration (Northwestern Polytechnical University), Ministry of Education, Xi’an 710129)
  • Online: 2017-06-01

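Abstract (Chinese version): Memristors combine storage and computation in a single device and can be used to build processing-in-memory (PIM) architectures that unify the two. However, because of limitations in the computing arrays and in the methods used to map networks onto them, deep neural network computation on memristor arrays requires frequent AD/DA conversion and a large amount of intermediate storage, which incurs significant energy and area overhead. This paper proposes a new memristor-based PIM architecture for approximate computation of deep convolutional neural networks. It uses analog memristors to greatly increase data density and decomposes the convolution into separate computations on different types of memristor arrays. This increases data parallelism, reduces the number of data conversions, and eliminates intermediate storage, yielding both speedup and energy savings. Optimization strategies are given for the accuracy loss that may arise in this architecture. Simulation experiments on neural networks of different sizes and depths show that, at the same computational accuracy, the architecture can reduce energy consumption by more than 90% in the best case and improve computational performance by about 90%.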

Key words (Chinese version): memristor, PIM, convolutional neural network, approximate computation, analog storage

Abstract: The memristor is one of the most promising candidates for building processing-in-memory (PIM) structures. Memristor-based PIM designs using digital or multi-level memristors have been proposed for neuromorphic computing. The frequent AD/DA conversion and intermediate memory that these structures require lead to significant energy and area overhead. To address this issue, this work proposes a memristor-based PIM architecture for deep convolutional neural networks (CNNs). It exploits an analog architecture to eliminate data conversion in the neuron layer banks, each of which consists of two special modules: weight sub-arrays (WSAs) and accumulate sub-arrays (ASAs). The partial sums of the neuron inputs are generated concurrently in the WSAs and written continuously into the ASAs, where the final results are computed. The noise in the proposed analog architecture is analyzed quantitatively at both the model and circuit levels, and an integrated solution is presented to suppress it: calibrating the non-linear distortion of the weights with a corrective function, pre-charging the write module to reduce parasitic effects, and eliminating the remaining noise with a modified noise-aware training procedure. The proposed design has been evaluated on various neural network benchmarks. The results show that, compared with digital solutions, energy efficiency and performance can both be improved by about 90% on specific neural networks without accuracy loss.

Key words: memristor, processing-in-memory (PIM), convolutional neural network (CNN), approximate computation, analog memory
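
The WSA/ASA dataflow described in the abstract can be illustrated with a small software sketch. This is not the authors' mapping or circuit model. It assumes, purely for illustration, that each kernel row is held in one weight sub-array, that each sub-array produces one partial sum per output neuron, and that analog write noise on a partial sum is Gaussian. The function names (`wsa_partial_sum`, `asa_accumulate`, `conv2d_valid`) and the `noise_std` parameter are hypothetical.

```python
import random


def wsa_partial_sum(weights, inputs):
    """Hypothetical weight sub-array: dot product of stored weights and inputs."""
    return sum(w * x for w, x in zip(weights, inputs))


def asa_accumulate(kernel_rows, patch_rows, noise_std, rng):
    """Hypothetical accumulate sub-array: sums the partial sums of one neuron.

    Each partial sum is perturbed by Gaussian noise (an assumption) to mimic
    an imperfect analog write; noise_std = 0 gives the exact result.
    """
    acc = 0.0
    for w_row, x_row in zip(kernel_rows, patch_rows):
        partial = wsa_partial_sum(w_row, x_row)
        noise = rng.gauss(0.0, noise_std) if noise_std > 0 else 0.0
        acc += partial + noise
    return acc


def conv2d_valid(image, kernel, noise_std=0.0, rng=None):
    """Valid-mode 2D convolution in the CNN sense (no kernel flip)."""
    rng = rng or random.Random(0)
    k_h, k_w = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - k_h + 1):
        row = []
        for j in range(len(image[0]) - k_w + 1):
            # One input slice per kernel row, i.e. per assumed sub-array.
            patch = [image[i + r][j:j + k_w] for r in range(k_h)]
            row.append(asa_accumulate(kernel, patch, noise_std, rng))
        out.append(row)
    return out


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
kernel = [[1, 0], [0, -1]]
exact = conv2d_valid(image, kernel)
noisy = conv2d_valid(image, kernel, noise_std=0.05, rng=random.Random(1))
```

Comparing `exact` with `noisy` gives a rough feel for why the abstract pairs the analog dataflow with noise suppression: the accumulated value stays in the analog domain, so any per-write error adds directly to the neuron input.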

CLC Number: