    刘鹤, 季宇, 韩建辉, 张悠慧, 郑纬民. 面向阻变存储器的长短期记忆网络加速器的训练和软件仿真[J]. 计算机研究与发展, 2019, 56(6): 1182-1191. DOI: 10.7544/issn1000-1239.2019.20190113
    Liu He, Ji Yu, Han Jianhui, Zhang Youhui, Zheng Weimin. Training and Software Simulation for ReRAM-Based LSTM Neural Network Acceleration[J]. Journal of Computer Research and Development, 2019, 56(6): 1182-1191. DOI: 10.7544/issn1000-1239.2019.20190113

    面向阻变存储器的长短期记忆网络加速器的训练和软件仿真

    Training and Software Simulation for ReRAM-Based LSTM Neural Network Acceleration

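    • Abstract (Chinese version, translated): Long short-term memory (LSTM) networks are a type of recurrent neural network well suited to processing and predicting events separated by long intervals and delays in a time series, and they are widely used in speech recognition, machine translation, and similar fields. However, because memory bandwidth is limited, the computation model of most current neural network accelerators cannot process LSTM workloads efficiently. ReRAM crossbar structures, by contrast, can perform efficient, high-density vector-matrix multiplication as in-memory computation, which makes them a highly promising design approach for LSTM accelerators. This work studies a simulation tool for ReRAM-based LSTM neural network accelerators together with a corresponding neural network training algorithm. The tool simulates, in a clock-driven manner, LSTM accelerator microarchitectures proposed by a designer whose core acceleration component is the ReRAM crossbar, and so supports design space exploration. The training algorithm is also adapted to the characteristics of ReRAM. The tool is implemented in SystemC, and its core computation is accelerated on graphics processing units, which speeds up the simulation of ReRAM devices and makes design space exploration more convenient.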

       

      Abstract: Long short-term memory (LSTM) networks are widely used in speech recognition, machine translation, and related fields, owing to their strength in processing and predicting events separated by long intervals and delays in a time series. However, most existing neural network acceleration chips cannot perform LSTM computation efficiently because they are limited by memory bandwidth. ReRAM-based crossbars, on the other hand, can perform matrix-vector multiplication efficiently because they process data in memory (PIM). Nevertheless, a software tool for broad architectural exploration and end-to-end evaluation of ReRAM-based LSTM acceleration is still missing. This paper proposes a simulator for ReRAM-based LSTM neural network acceleration and a corresponding training algorithm. The highly configurable tool reflects the main features of ReRAM devices and circuits, including their imperfections, and the core computation of the simulation can be accelerated by a general-purpose graphics processing unit (GPGPU). Moreover, the core component of the simulator has been verified against the circuit simulation of a real chip design. Within this framework, architectural exploration and comprehensive end-to-end evaluation can be achieved.
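
To make the crossbar idea concrete, the following is a minimal Python sketch of a matrix-vector multiplication in which each weight is first snapped to a small number of evenly spaced levels, with optional Gaussian perturbation standing in for device variation. It is not the simulator described in the abstract: the function names `quantize` and `crossbar_mvm`, the parameters `levels`, `w_max`, and `sigma`, and the noise model itself are all assumptions made for illustration.

```python
import random


def quantize(w, levels, w_max):
    """Clamp w to [-w_max, w_max] and snap it to the nearest of `levels`
    evenly spaced values (hypothetical model of discrete cell states)."""
    w = max(-w_max, min(w_max, w))
    step = 2.0 * w_max / (levels - 1)
    return round((w + w_max) / step) * step - w_max


def crossbar_mvm(weights, x, levels=16, w_max=1.0, sigma=0.0, rng=None):
    """Multiply a weight matrix (list of rows) by vector x as an idealized
    crossbar would: every stored weight is quantized, optionally perturbed
    by Gaussian noise of standard deviation sigma (assumed variation model),
    and each row accumulates its products into one output."""
    rng = rng or random.Random(0)
    outputs = []
    for row in weights:
        acc = 0.0
        for w, xi in zip(row, x):
            stored = quantize(w, levels, w_max)
            if sigma > 0.0:
                stored += rng.gauss(0.0, sigma)
            acc += stored * xi
        outputs.append(acc)
    return outputs


if __name__ == "__main__":
    W = [[0.8, -0.9], [0.1, 0.4]]
    x = [1.0, 2.0]
    print("ideal (16 levels):", crossbar_mvm(W, x))
    print("3 levels         :", crossbar_mvm(W, x, levels=3))
    print("3 levels + noise :", crossbar_mvm(W, x, levels=3, sigma=0.05))
```

In an LSTM, the four gate pre-activations at each time step are matrix-vector products of this kind, so comparing the outputs for different `levels` and `sigma` values gives a rough feel for why a training algorithm may need to account for limited precision and device imperfections.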

       
