ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development ›› 2022, Vol. 59 ›› Issue (3): 533-552. doi: 10.7544/issn1000-1239.20210580

Special topic: 2022 Special Topic on Storage Systems and Intelligent Processing

• Architecture •

Energy-Efficient Floating-Point Memristive In-Memory Processing System Based on Self-Selective Mantissa Compaction

Ding Wenlong1, Wang Chengning1,2, Tong Wei1,2

  1. School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074
  2. Wuhan National Laboratory for Optoelectronics (Huazhong University of Science and Technology), Wuhan 430074
  (wlding@hust.edu.cn)
  • Published: 2022-03-07
  • Online: 2022-03-07
  • Supported by: 
    This work was supported by the National Natural Science Foundation of China (61832007, 61821003), the Fundamental Research Funds for the Central Universities (2019kfyXMBZ037), and the Zhejiang Lab Open Fund (2020AA3AB07).

Abstract: Matrix-vector multiplication (MVM) is a key computing kernel for solving linear systems in high-performance scientific computing. Recent work by Feinberg et al. proposed a method for deploying high-precision floating-point operands on memristive crossbars, showing great potential for accelerating scientific MVM. Because different scientific computing applications have different solution-precision requirements, matching the computation method to the application is an effective way to further reduce system energy consumption. This paper presents a system with mantissa compaction and alignment-bit optimization strategies. While still supporting high-precision floating-point memristive MVM, the system can select an appropriate number of mantissa bits to compact according to the precision requirement of the application. By skipping the activation of the low-order crossbars that hold the less significant mantissa bits, and of the redundant alignment crossbars, it reduces the energy consumed by the computational crossbars and the peripheral circuits. The evaluation shows that when the crossbar-based in-memory solutions of sparse linear systems have average residuals ranging from 0 to the order of $10^{-3}$ relative to the software baseline, the average energy consumption of the computational crossbars and of the analog-to-digital converters is reduced by 5%~65% and 30%~55%, respectively, compared with the existing unoptimized system.

Key words: memristive crossbars, analog matrix-vector multiplication, energy-efficient scientific computing, in-memory parallel processing system, sparse linear algebra system
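
The precision trade-off behind mantissa compaction can be illustrated numerically. The sketch below is not the paper's system: it does not model crossbars, alignment bits, or energy. It only assumes IEEE 754 binary64 operands and zeroes a chosen number of low-order mantissa bits in the matrix before a software MVM, then reports the relative error against the full-precision product. The names `truncate_mantissa`, `mvm`, and `relative_error` are hypothetical helpers introduced for this illustration.

```python
import struct

MANTISSA_BITS = 52  # fraction width of an IEEE 754 binary64 value


def truncate_mantissa(x: float, dropped_bits: int) -> float:
    """Zero the `dropped_bits` least-significant fraction bits of a float64."""
    if not 0 <= dropped_bits <= MANTISSA_BITS:
        raise ValueError("dropped_bits must be in [0, 52]")
    # Reinterpret the float as a 64-bit unsigned integer.
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    # Keep sign, exponent, and the high-order fraction bits only.
    mask = ~((1 << dropped_bits) - 1) & 0xFFFFFFFFFFFFFFFF
    (y,) = struct.unpack("<d", struct.pack("<Q", bits & mask))
    return y


def mvm(matrix, vector):
    """Plain software matrix-vector product (the full-precision baseline)."""
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]


def relative_error(matrix, vector, dropped_bits):
    """2-norm relative error of the MVM computed with compacted mantissas."""
    exact = mvm(matrix, vector)
    compact_matrix = [
        [truncate_mantissa(a, dropped_bits) for a in row] for row in matrix
    ]
    compact = mvm(compact_matrix, vector)
    num = sum((e - c) ** 2 for e, c in zip(exact, compact)) ** 0.5
    den = sum(e * e for e in exact) ** 0.5
    return num / den


if __name__ == "__main__":
    # Illustrative data only; not taken from the paper's benchmarks.
    A = [[4.1, 1.3], [1.3, 3.7]]
    x = [1.0, 2.0]
    for k in (0, 20, 40, 48):
        print(k, relative_error(A, x, k))
```

Dropping more mantissa bits raises the relative error, which is the knob an application with a looser precision requirement could turn. How that choice maps to deactivated crossbars and saved energy is specific to the paper's design and is not captured here.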

CLC number: