    Liu Bicheng, Gu Haifeng, Chen Mingsong, Gu Shouzhen, Chen Wenjie. An Efficient Processing In Memory Framework Based on Skyrmion Material[J]. Journal of Computer Research and Development, 2019, 56(4): 798-809. DOI: 10.7544/issn1000-1239.2019.20180157

    An Efficient Processing In Memory Framework Based on Skyrmion Material

    • Abstract: Processing in memory (PIM) is an emerging computing paradigm that processes data in place inside the storage units. By reducing data movement between computation units and storage units and increasing parallelism, it partly offsets the shortcomings of the von Neumann architecture. Compared with traditional volatile random access memories, racetrack memory (RM) offers high density, non-volatility, and low static power, which makes it well suited to efficient PIM. To address the performance and energy shortcomings of domain-wall based PIM, this paper proposes a novel non-volatile PIM framework based on the Skyrmion material. The framework uses Skyrmion-based racetrack memory for its storage units and builds its computation units from adders and multipliers composed of Skyrmion-based logic gates. Because the computation units do not need extensive CMOS (complementary metal oxide semiconductor) circuitry to assist them, the design complexity is greatly reduced. In addition, optimizing the number of read and write ports of the storage units at the circuit level and improving the memory address mapping at the system level substantially improves the running efficiency of the framework. Experimental results show that, compared with a domain-wall based non-volatile PIM framework, the proposed framework reduces running time by 48.1% and energy consumption by 42.9% on average.
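The abstract states that the computation units are adders and multipliers composed of logic gates, but it does not give the gate set or the circuit layout. As a purely illustrative sketch, assuming only AND, OR, and NOT primitives (an assumption, not the paper's design), the following Python model shows how a ripple-carry adder and a shift-and-add multiplier can be built from such gates. All function names are hypothetical, and the model captures only Boolean behavior, not Skyrmion device physics, timing, or energy.

```python
# Illustrative gate-level model. AND/OR/NOT are ASSUMED primitives;
# the abstract does not say which gates the paper's framework provides.

def AND(a, b):
    return a & b

def OR(a, b):
    return a | b

def NOT(a):
    return 1 - a

def XOR(a, b):
    # XOR expressed only through the assumed primitives.
    return OR(AND(a, NOT(b)), AND(NOT(a), b))

def full_adder(a, b, carry_in):
    # One-bit full adder: returns (sum_bit, carry_out).
    partial = XOR(a, b)
    sum_bit = XOR(partial, carry_in)
    carry_out = OR(AND(a, b), AND(partial, carry_in))
    return sum_bit, carry_out

def to_bits(value, width):
    # Little-endian bit list: index 0 is the least significant bit.
    return [(value >> i) & 1 for i in range(width)]

def from_bits(bits):
    return sum(bit << i for i, bit in enumerate(bits))

def ripple_add(xs, ys):
    # Ripple-carry adder over two equal-length bit lists.
    # The result is one bit longer to hold the final carry.
    carry = 0
    out = []
    for a, b in zip(xs, ys):
        sum_bit, carry = full_adder(a, b, carry)
        out.append(sum_bit)
    return out + [carry]

def multiply(xs, ys):
    # Shift-and-add multiplier: one AND-masked partial product per
    # bit of ys, accumulated with the ripple-carry adder.
    width = len(xs) + len(ys)
    acc = [0] * width
    for shift, y in enumerate(ys):
        partial = [0] * shift + [AND(x, y) for x in xs]
        partial += [0] * (width - len(partial))
        # The product always fits in `width` bits, so the extra
        # carry bit returned by ripple_add is always 0 here.
        acc = ripple_add(acc, partial)[:width]
    return acc
```

For example, `from_bits(ripple_add(to_bits(5, 4), to_bits(11, 4)))` evaluates to 16 and `from_bits(multiply(to_bits(13, 4), to_bits(11, 4)))` evaluates to 143.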

       
