    Citation: Hu Fei, You Zhiqiang, Liu Peng, Kuang Jishun. Circuit Design of Convolutional Neural Network Based on Memristor Crossbar Arrays[J]. Journal of Computer Research and Development, 2018, 55(5): 1097-1107. DOI: 10.7544/issn1000-1239.2018.20170107

    Circuit Design of Convolutional Neural Network Based on Memristor Crossbar Arrays

    • Abstract: Memristor crossbar arrays have attracted wide attention for their excellent performance in neuromorphic computing. In this paper, we design a circuit that realizes a convolutional neural network (CNN) using memristors and CMOS devices. First, we propose an improved memristor crossbar array that stores weights and biases accurately; combined with an appropriate encoding scheme, it computes the dot product of two vectors. The improved array is used for the convolution kernels, pooling operations, and classifier of a CNN. Second, building on the improved array and the inherently high fault tolerance of CNNs, we design a memristive CNN architecture that performs a basic CNN algorithm. In this architecture, the analog results of each convolution are sampled and held directly for the subsequent pooling operation, which avoids the analog-to-digital and digital-to-analog conversion that a previous architecture required between convolution and pooling. Experimental results show that the designed circuit, with an area of 0.8525 cm², achieves a speedup of 1770× over a GPU platform and is 7.7× faster than a previous memristor-based architecture of similar area. The cost in recognition accuracy is small: compared with the software implementation, the average recognition error increases by only 0.039% and 0.012% when each memristor stores 6 bits and 8 bits, respectively.
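
    The dot product that a crossbar column computes, and the effect of storing each weight at 6-bit or 8-bit precision, can be illustrated with a minimal sketch. It is not the circuit or encoding scheme from the paper, which the abstract does not specify. It assumes a differential-pair encoding (one conductance for positive weights, one for negative), conductances normalized to [0, 1], and ideal devices; the function names `quantize`, `encode_weights`, and `crossbar_dot` are hypothetical.

```python
def quantize(x, bits):
    """Snap a normalized conductance in [0, 1] to one of 2**bits evenly spaced levels."""
    levels = (1 << bits) - 1
    x = min(max(x, 0.0), 1.0)  # clip to the representable conductance range
    return round(x * levels) / levels


def encode_weights(weights, bits, w_max):
    """Map each signed weight to a (positive, negative) conductance pair.

    Assumption: differential-pair encoding, one common choice for crossbars,
    not necessarily the scheme used in the paper.
    """
    pairs = []
    for w in weights:
        g = quantize(abs(w) / w_max, bits)
        pairs.append((g, 0.0) if w >= 0 else (0.0, g))
    return pairs


def crossbar_dot(voltages, pairs, w_max):
    """Sum the column currents I = sum(V * G), then subtract the negative column."""
    i_pos = sum(v * g_pos for v, (g_pos, _) in zip(voltages, pairs))
    i_neg = sum(v * g_neg for v, (_, g_neg) in zip(voltages, pairs))
    return (i_pos - i_neg) * w_max  # rescale back to weight units


if __name__ == "__main__":
    weights = [0.5, -0.25, 1.0]
    inputs = [1.0, 2.0, 0.5]
    exact = sum(w * x for w, x in zip(weights, inputs))
    for bits in (6, 8):
        approx = crossbar_dot(inputs, encode_weights(weights, bits, 1.0), 1.0)
        print(bits, "bits: error =", abs(approx - exact))
```

    In this sketch a bias could be handled by appending it to the weight list and feeding a constant input of 1.0 on the extra row. The quantization error per weight is at most half a level, so moving from 6 to 8 bits shrinks the worst-case error by roughly a factor of four.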

       
