Citation: Song Yuhong, Edwin Hsing-Mean Sha, Zhuge Qingfeng, Xu Rui, Wang Han. RR-SC: Run-Time Reconfigurable Framework for Stochastic Computing-Based Neural Networks on Edge Devices[J]. Journal of Computer Research and Development, 2024, 61(4): 840-855. DOI: 10.7544/issn1000-1239.202220738
With the democratization of AI, deep neural networks (DNNs) have been widely deployed on edge devices such as smartphones and autonomous-driving systems. Stochastic computing (SC) is a promising technique that performs fundamental machine learning (ML) operations with simple logic gates instead of complicated binary arithmetic circuits, which makes low-power, low-cost DNN execution possible on edge devices with constrained resources (e.g., energy, computation, and memory). However, previous SC work designs only one group of settings for a fixed hardware implementation, ignoring dynamic hardware resources (e.g., remaining battery), which leads to low hardware efficiency and short battery life. To save energy on battery-powered edge devices, the dynamic voltage and frequency scaling (DVFS) technique is widely used to reconfigure hardware and prolong battery life. In this paper, we propose RR-SC, a run-time reconfigurable framework for SC-based DNNs, which makes the first attempt to combine hardware and software reconfiguration to satisfy the inference time constraint while maximally saving energy. RR-SC uses reinforcement learning (RL) to generate multiple groups of model settings at one time, each satisfying the accuracy constraint under a different hardware setting (i.e., a different voltage/frequency level), so that the solution achieves the best trade-off between accuracy and hardware efficiency. Moreover, the model settings are switched on a single backbone model at run time, which enables lightweight software reconfiguration. Experimental results show that RR-SC can switch lightweight settings within 110 ms to guarantee the required real-time constraint at different hardware levels. Meanwhile, it achieves up to 7.6 times more model inferences with only 1% accuracy loss.
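The abstract's core SC premise, performing arithmetic with simple logic gates on bitstreams, can be illustrated with a minimal unipolar-SC sketch. This is not code from the paper; all names here are illustrative, and a single AND gate stands in for the multipliers an SC-based DNN would use:

```python
import random

def to_bitstream(p, length, rng):
    # Unipolar SC encoding: a value p in [0, 1] becomes a bitstream
    # whose bits are 1 with probability p.
    return [1 if rng.random() < p else 0 for _ in range(length)]

def sc_multiply(xs, ys):
    # For independent unipolar streams, a single AND gate multiplies:
    # P(x AND y) = P(x) * P(y). No binary multiplier circuit is needed.
    return [a & b for a, b in zip(xs, ys)]

def decode(bits):
    # Decode a stream back to a value: the fraction of 1s.
    return sum(bits) / len(bits)

rng = random.Random(42)
n = 4096          # stream length trades accuracy for latency/energy
x, y = 0.5, 0.75
xs = to_bitstream(x, n, rng)
ys = to_bitstream(y, n, rng)
est = decode(sc_multiply(xs, ys))
# est approximates x * y = 0.375; the error shrinks as n grows
```

The stream length `n` is exactly the knob that dynamic-precision SC work (e.g., references [17][18] below) scales at run time to trade accuracy for energy.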
[1] Garvey C. A framework for evaluating barriers to the democratization of artificial intelligence [C] //Proc of the 32nd AAAI Conf on Artificial Intelligence. Palo Alto, CA: AAAI, 2018: 8079−8080
[2] Wang Hanrui, Wu Zhanghao, Liu Zhijian, et al. HAT: Hardware-aware transformers for efficient natural language processing [C] //Proc of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA: ACL, 2020: 7675−7688
[3] Song Yuhong, Jiang Weiwen, Li Bingbing, et al. Dancing along battery: Enabling transformer with run-time reconfigurability on mobile devices [C] //Proc of the 58th Design Automation Conf. Piscataway, NJ: IEEE, 2021: 1003−1008
[4] Jiang Weiwen, Yang Lei, Dasgupta S, et al. Standing on the shoulders of giants: Hardware and neural architecture co-search with hot start[J]. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2020, 39(11): 4154−4165 doi: 10.1109/TCAD.2020.3012863
[5] Peng Hongwu, Huang Shaoyi, Geng Tong, et al. Accelerating transformer-based deep learning models on FPGAs using column balanced block pruning [C] //Proc of the 22nd Int Symp on Quality Electronic Design (ISQED). Piscataway, NJ: IEEE, 2021: 142−148
[6] Ji Rongrong, Lin Shaohui, Chao Fei, et al. Deep neural network compression and acceleration: A review[J]. Journal of Computer Research and Development, 2018, 55(9): 1871−1888 (in Chinese)
[7] Gong Cheng, Lu Ye, Dai Surong, et al. Ultra-low loss quantization method for deep neural network compression[J]. Journal of Software, 2021, 32(8): 2391−2407 (in Chinese)
[8] Meng Ziyao, Gu Xue, Liang Yanchun, et al. Deep neural architecture search: A survey[J]. Journal of Computer Research and Development, 2021, 58(1): 22−33 (in Chinese)
[9] Li Hangyu, Wang Nannan, Zhu Mingrui, et al. Recent advances in neural architecture search: A survey[J]. Journal of Software, 2022, 33(1): 129−149 (in Chinese)
[10] Gaines B R. Stochastic computing systems [G/OL] //Advances in Information Systems Science. Berlin: Springer, 1969 [2022-11-24]. https://link.springer.com/chapter/10.1007/978-1-4899-5841-9_2
[11] Jeavons P, Cohen D A, Shawe-Taylor J. Generating binary sequences for stochastic computing[J]. IEEE Transactions on Information Theory, 1994, 40(3): 716−720 doi: 10.1109/18.335883
[12] Qian Weikang, Li Xin, Riedel M D, et al. An architecture for fault-tolerant computation with stochastic logic[J]. IEEE Transactions on Computers, 2010, 60(1): 93−105
[13] Li Peng, Lilja D J, Qian Weikang, et al. Computation on stochastic bit streams digital image processing case studies[J]. IEEE Transactions on Very Large Scale Integration Systems, 2013, 22(3): 449−462
[14] Li Bingzhe, Qin Yaobin, Yuan Bo, et al. Neural network classifiers using stochastic computing with a hardware-oriented approximate activation function [C] //Proc of the 35th IEEE Int Conf on Computer Design (ICCD). Los Alamitos, CA: IEEE Computer Society, 2017: 97−104
[15] Wu Di, Li Jingjie, Yin Ruokai, et al. uGEMM: Unary computing architecture for GEMM applications [C] //Proc of the 47th ACM/IEEE Annual Int Symp on Computer Architecture (ISCA). Piscataway, NJ: IEEE, 2020: 377−390
[16] Song Yuhong, Sha E H, Zhuge Qingfeng, et al. BSC: Block-based stochastic computing to enable accurate and efficient TinyML [C] //Proc of the 27th Asia and South Pacific Design Automation Conf (ASP-DAC). Piscataway, NJ: IEEE, 2022: 314−319
[17] Kim K, Kim J, Yu J, et al. Dynamic energy-accuracy trade-off using stochastic computing in deep neural networks [C] //Proc of the 53rd Annual Design Automation Conf (DAC). New York: ACM, 2016: 124:1−124:6
[18] Sim H, Kenzhegulov S, Lee J. DPS: Dynamic precision scaling for stochastic computing-based deep neural networks [C] //Proc of the 55th Annual Design Automation Conf (DAC). New York: ACM, 2018: 13:1−13:6
[19] Liu Quan, Zhai Jianwei, Zhang Zongzhang, et al. A survey on deep reinforcement learning[J]. Chinese Journal of Computers, 2018, 41(1): 1−27 (in Chinese)
[20] Yu Xian, Li Zhenyu, Sun Sheng, et al. Adaptive virtual machine consolidation method based on deep reinforcement learning[J]. Journal of Computer Research and Development, 2021, 58(12): 2783−2797 (in Chinese)
[21] Sim H, Lee J. A new stochastic computing multiplier with application to deep convolutional neural networks [C] //Proc of the 54th Annual Design Automation Conf (DAC). New York: ACM, 2017: 29:1−29:6
[22] Tomic T, Schmid K, Lutz P, et al. Toward a fully autonomous UAV: Research platform for indoor and outdoor urban search and rescue[J]. IEEE Robotics & Automation Magazine, 2012, 19(3): 46−56
[23] Horowitz M, Indermaur T, Gonzalez R. Low-power digital design [C] //Proc of 1994 IEEE Symp on Low Power Electronics. Piscataway, NJ: IEEE, 1994: 8−11
[24] Jiang Weiwen, Zhang Xinyi, Sha E H, et al. Accuracy vs. efficiency: Achieving both through FPGA-implementation aware neural architecture search [C/OL] //Proc of the 56th Annual Design Automation Conf (DAC). New York: ACM, 2019 [2022-11-24]. https://dl.acm.org/doi/abs/10.1145/3316781.3317757
[25] Williams R J. Simple statistical gradient-following algorithms for connectionist reinforcement learning[J]. Machine Learning, 1992, 8(3): 229−256
[26] Dong Xuanyi, Yang Yi. NAS-Bench-201: Extending the scope of reproducible neural architecture search [C/OL] //Proc of the 8th Int Conf on Learning Representations (ICLR). 2020 [2022-11-24]. https://openreview.net/forum?id=HJxyZkBKDr
[27] Skadron K, Stan M, Huang Wei, et al. Temperature-aware microarchitecture [C] //Proc of the 30th Int Symp on Computer Architecture (ISCA). Los Alamitos, CA: IEEE Computer Society, 2003: 2−13
[28] Hardkernel. Odroid-XU3 [EB/OL]. 2020 [2022-11-24]. https://www.hardkernel.com/shop/odroid-xu3/
[29] Liu Siting, Han Jie. Energy efficient stochastic computing with Sobol sequences [C] //Proc of the 20th Design, Automation & Test in Europe Conf & Exhibition (DATE). Piscataway, NJ: IEEE, 2017: 650−653
[30] Najafi M H, Lilja D J, Riedel M. Deterministic methods for stochastic computing using low-discrepancy sequences [C/OL] //Proc of the 37th IEEE/ACM Int Conf on Computer-Aided Design (ICCAD). New York: ACM, 2018 [2022-11-24]. https://dl.acm.org/doi/abs/10.1145/3240765.3240797