Energy-Efficient Floating-Point Memristive In-Memory Processing System Based on Self-Selective Mantissa Compaction

丁文隆; 汪承宁; 童薇

doi:10.7544/issn1000-1239.20210580

Energy-Efficient Floating-Point Memristive In-Memory Processing System Based on Self-Selective Mantissa Compaction

Abstract

Matrix-vector multiplication (MVM) is a key computing kernel in high-performance solvers for scientific linear systems. Recent work by Feinberg et al. proposed a method for deploying high-precision floating-point operands on memristive crossbars, showing great potential for accelerating scientific MVM. Because different types of scientific computing applications require different solution precision, matching the computation method to the application is an effective way to further reduce system energy consumption. This paper presents a system with mantissa compaction and alignment-bit optimization strategies. While still supporting high-precision floating-point memristive MVM as its basic function, the system can select the number of compacted mantissa bits according to the precision requirement of the application. By skipping the activation of the crossbars that hold low-order mantissa bits of small weight and of the redundant alignment-bit crossbars, it reduces the energy consumed by the crossbars and peripheral circuits during computation. The evaluation shows that when the crossbar-based in-memory solutions of sparse linear systems have average residuals on the order of 0 to 10⁻³ relative to the software baseline, the average energy consumption of the computational crossbars and of the peripheral analog-to-digital converters is reduced by 5% to 65% and 30% to 55%, respectively, compared with the existing system without these optimizations.
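The trade-off between dropped mantissa bits and MVM accuracy can be illustrated in software. The following is a minimal sketch, not the paper's hardware scheme: it assumes IEEE 754 binary64 operands and simply zeroes the low-order mantissa bits of each matrix entry before multiplying, which loosely mimics leaving the corresponding low-bit crossbars inactive. The names `truncate_mantissa`, `mvm`, and `relative_error` are hypothetical, and bit slicing, alignment bits, and analog-to-digital conversion are not modeled.

```python
import struct


def truncate_mantissa(x: float, dropped_bits: int) -> float:
    """Zero the `dropped_bits` least significant bits of the 52-bit
    binary64 mantissa of x (intended for normal, finite values)."""
    if not 0 <= dropped_bits <= 52:
        raise ValueError("dropped_bits must be in [0, 52]")
    # Reinterpret the double as a 64-bit unsigned integer.
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    mask = ~((1 << dropped_bits) - 1) & 0xFFFFFFFFFFFFFFFF
    (y,) = struct.unpack("<d", struct.pack("<Q", bits & mask))
    return y


def mvm(matrix, vector, dropped_bits=0):
    """Matrix-vector product with truncated matrix entries (assumed model)."""
    return [
        sum(truncate_mantissa(a, dropped_bits) * v for a, v in zip(row, vector))
        for row in matrix
    ]


def relative_error(approx, exact):
    """Euclidean norm of the difference divided by the norm of `exact`."""
    num = sum((a - e) ** 2 for a, e in zip(approx, exact)) ** 0.5
    den = sum(e ** 2 for e in exact) ** 0.5
    return num / den


if __name__ == "__main__":
    A = [[4.0, 1.0 / 3.0], [0.1, 3.0]]
    x = [1.0, 2.0]
    exact = mvm(A, x, dropped_bits=0)
    for d in (0, 20, 40):
        print(d, relative_error(mvm(A, x, dropped_bits=d), exact))
```

Dropping `d` bits perturbs each normal entry by a relative amount below 2^-(52-d), so the printed error grows as `d` increases. An application that tolerates a larger residual can therefore afford a larger `d`, which is the selection knob the abstract describes.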
