• China's Excellent Science and Technology Journal
  • CCF-Recommended Class A Chinese Journal
  • T1-Class High-Quality Science and Technology Journal in Computing
Wei Zheng, Zhang Xingjun, Zhuo Zhimin, Ji Zeyu, Li Yonghao. PPO-Based Automated Quantization for ReRAM-Based Hardware Accelerator[J]. Journal of Computer Research and Development, 2022, 59(3): 518-532. DOI: 10.7544/issn1000-1239.20210551

PPO-Based Automated Quantization for ReRAM-Based Hardware Accelerator

Funds: This work was supported by the National Key Research and Development Program of China (2016YFB0200902).
  • Published Date: February 28, 2022
  • Abstract: Convolutional neural networks (CNNs) have surpassed human capabilities in many fields. However, as the memory consumption and computational complexity of CNNs continue to grow, the "memory wall" problem, which constrains data exchange between processing and memory units, impedes their deployment in resource-constrained environments such as edge computing and the Internet of Things. ReRAM (resistive RAM)-based hardware accelerators have been widely applied to accelerate matrix-vector multiplication thanks to their high density and low power, but they are ill-suited to 32-bit floating-point computation, raising the demand for quantization to reduce data precision. Manually determining the bitwidth for each layer is time-consuming; recent studies therefore leverage DDPG (deep deterministic policy gradient) to perform automated quantization on FPGA (field-programmable gate array) platforms, but DDPG must convert continuous actions into discrete ones, and resource constraints are met only by manually decreasing the bitwidth of each layer. This paper proposes a PPO (proximal policy optimization)-based automated quantization method for ReRAM-based hardware accelerators, which uses a discrete action space and thus avoids the action-space conversion step. We define a new reward function that enables the PPO agent to automatically learn the optimal quantization policy satisfying the resource constraints, and we present software-hardware modifications to support mixed-precision computing. Experimental results show that, compared with coarse-grained quantization, the proposed method reduces hardware cost by 20%~30% with negligible loss of accuracy. Compared with other automated quantization methods, it has a shorter search time and further reduces hardware cost by about 4.2% under the same resource constraints. These results provide insights for the co-design of quantization algorithms and hardware accelerators.
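The idea summarized in the abstract can be illustrated with a minimal sketch: a discrete action space of candidate per-layer bitwidths, a reward that folds the resource constraint into a penalty term (so the agent, not a manual pass, enforces the budget), and PPO's clipped surrogate objective. All concrete numbers below (candidate bitwidths, layer costs, budget, penalty weight, and the toy accuracy/cost models) are illustrative assumptions, not the paper's actual configuration.

```python
import numpy as np

# Assumed discrete action space: one candidate bitwidth per layer.
BITWIDTHS = [2, 4, 6, 8]
# Hypothetical relative ReRAM crossbar cost per layer.
LAYER_WEIGHTS = [1.0, 2.0, 4.0]
BUDGET = 24.0    # resource constraint, arbitrary units
PENALTY = 0.1    # weight of the constraint-violation term

def hardware_cost(bits_per_layer):
    """Toy cost model: cost grows linearly with bitwidth in each layer."""
    return sum(b * w for b, w in zip(bits_per_layer, LAYER_WEIGHTS))

def accuracy_proxy(bits_per_layer):
    """Stand-in for post-quantization accuracy: more bits, higher score."""
    return sum(1.0 - 2.0 ** (-b) for b in bits_per_layer) / len(bits_per_layer)

def reward(bits_per_layer):
    """Accuracy proxy minus a penalty when the budget is exceeded, so the
    agent learns constraint-satisfying policies without manual bitwidth
    reduction."""
    over = max(0.0, hardware_cost(bits_per_layer) - BUDGET)
    return accuracy_proxy(bits_per_layer) - PENALTY * over

def ppo_clip_objective(logp_new, logp_old, advantage, eps=0.2):
    """PPO clipped surrogate: min(r*A, clip(r, 1-eps, 1+eps)*A),
    where r is the new/old policy probability ratio."""
    ratio = np.exp(logp_new - logp_old)
    return np.minimum(ratio * advantage,
                      np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage)
```

Because the actions are already discrete bitwidth choices, no continuous-to-discrete conversion is needed, which is the advantage the abstract claims over DDPG-based search.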
  • Related Articles

    [1]Liu He, Ji Yu, Han Jianhui, Zhang Youhui, Zheng Weimin. Training and Software Simulation for ReRAM-Based LSTM Neural Network Acceleration[J]. Journal of Computer Research and Development, 2019, 56(6): 1182-1191. DOI: 10.7544/issn1000-1239.2019.20190113
    [2]Fang Rongqiang, Wang Jing, Yao Zhicheng, Liu Chang, Zhang Weigong. Modeling Computational Feature of Multi-Layer Neural Network[J]. Journal of Computer Research and Development, 2019, 56(6): 1170-1181. DOI: 10.7544/issn1000-1239.2019.20190111
    [3]Mao Haiyu, Shu Jiwu. 3D Memristor Array Based Neural Network Processing in Memory Architecture[J]. Journal of Computer Research and Development, 2019, 56(6): 1149-1160. DOI: 10.7544/issn1000-1239.2019.20190099
    [4]Chen Guilin, Ma Sheng, Guo Yang. Survey on Accelerating Neural Network with Hardware[J]. Journal of Computer Research and Development, 2019, 56(2): 240-253. DOI: 10.7544/issn1000-1239.2019.20170852
    [5]Wang Chenxi, Lü Fang, Cui Huimin, Cao Ting, John Zigman, Zhuang Liangji, Feng Xiaobing. Heterogeneous Memory Programming Framework Based on Spark for Big Data Processing[J]. Journal of Computer Research and Development, 2018, 55(2): 246-264. DOI: 10.7544/issn1000-1239.2018.20170687
    [6]Li Chuxi, Fan Xiaoya, Zhao Changhe, Zhang Shengbing, Wang Danghui, An Jianfeng, Zhang Meng. A Memristor-Based Processing-in-Memory Architecture for Deep Convolutional Neural Networks Approximate Computation[J]. Journal of Computer Research and Development, 2017, 54(6): 1367-1380. DOI: 10.7544/issn1000-1239.2017.20170099
    [7]Bian Chen, Yu Jiong, Xiu Weirong, Qian Yurong, Ying Changtian, Liao Bin. Partial Data Shuffled First Strategy for In-Memory Computing Framework[J]. Journal of Computer Research and Development, 2017, 54(4): 787-803. DOI: 10.7544/issn1000-1239.2017.20160049
    [8]Liu Zhibin, Zeng Xiaoqin, Liu Huiyi, Chu Rong. A Heuristic Two-layer Reinforcement Learning Algorithm Based on BP Neural Networks[J]. Journal of Computer Research and Development, 2015, 52(3): 579-587. DOI: 10.7544/issn1000-1239.2015.20131270
    [9]Li Ning, Xie Zhenhua, Xie Junyuan, Chen Shifu. SEFNN—A Feed-Forward Neural Network Design Algorithm Based on Structure Evolution[J]. Journal of Computer Research and Development, 2006, 43(10): 1713-1718.
    [10]Li Kai, Huang Houkuan. A Selective Approach to Neural Network Ensemble Based on Clustering Technology[J]. Journal of Computer Research and Development, 2005, 42(4): 594-598.
  • Cited by

    Periodical cited type (2)

    1. Zhao Anning, Xu Nuo, Liu Kang, Luo Li, Pan Bingzheng, Bo Ziyi, Tan Chenghao. Multi-State Logic Gate Synthesis for Low-Wear In-Memory Computing. Journal of Computer Research and Development. 2025(03): 620-632.
    2. Wei Huaming, Liao Jianping. Simulation of Cloud Server Performance Acceleration Methods for Massive Data Storage. Computer Simulation. 2023(05): 515-519.

    Other cited types (1)

