RR-SC: Run-Time Reconfigurable Framework for Stochastic Computing-Based Neural Networks on Edge Devices

Song Yuhong; Sha Edwin Hsing-Mean; Zhuge Qingfeng; Xu Rui; Wang Han

doi:10.7544/issn1000-1239.202220738

Abstract

With the democratization of AI, deep neural networks (DNNs) have been widely deployed on mobile and embedded edge devices, in applications such as smartphones and autonomous driving. Stochastic computing (SC) is an emerging, promising technique that performs machine learning (ML) tasks using simple logic gates instead of complex binary arithmetic circuits. It offers low-energy, low-overhead DNN execution on edge devices with constrained resources (e.g., energy, computation units, and memory units). However, previous SC work designs only one set of model settings and implements it on a fixed hardware configuration, ignoring the dynamic changes in hardware resources (e.g., battery level) that occur in real deployments; this leads to low hardware efficiency and short battery life. To save energy on battery-powered edge devices, dynamic voltage and frequency scaling (DVFS) is widely used for hardware reconfiguration to prolong battery life. In this paper, we propose RR-SC, a run-time reconfigurable framework for SC-based DNNs, which is the first attempt to combine hardware and software reconfiguration in order to satisfy the inference time constraint while saving as much energy as possible. Using reinforcement learning (RL), RR-SC generates multiple sets of model settings in a single search, each satisfying the accuracy constraint under a different hardware setting (i.e., a different voltage/frequency level). The resulting solution achieves the best trade-off between accuracy and hardware efficiency. At run time, the model settings are switched on a single shared backbone model, which makes software reconfiguration lightweight. Experimental results show that RR-SC can switch between model settings within 110 ms, guaranteeing the required real-time constraint at each hardware level. It also achieves up to a 7.6× increase in the number of model inferences with only 1% accuracy loss.
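To make the claim about "simple logic gates" concrete, the following is a minimal sketch of unipolar stochastic-computing multiplication, in which a value in [0, 1] is encoded as the fraction of 1s in a random bitstream and a product is computed with a bitwise AND. It is a generic textbook illustration, not the RR-SC implementation: the function names, the `length` parameter, and the fixed seed are assumptions introduced here, and the abstract does not state what a model setting in RR-SC contains.

```python
import random


def to_bitstream(p, length, rng):
    """Encode a probability p in [0, 1] as a unipolar stochastic bitstream."""
    return [1 if rng.random() < p else 0 for _ in range(length)]


def from_bitstream(bits):
    """Decode a bitstream back to a value: the fraction of bits that are 1."""
    return sum(bits) / len(bits)


def sc_multiply(a_bits, b_bits):
    """Unipolar SC multiplication: one AND gate per bit position.

    For independent streams, P(a AND b = 1) = P(a = 1) * P(b = 1),
    so the decoded output approximates the product of the inputs.
    """
    return [a & b for a, b in zip(a_bits, b_bits)]


def sc_product(x, y, length, seed=0):
    """Hypothetical helper: estimate x * y with bitstreams of a given length."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    a_bits = to_bitstream(x, length, rng)
    b_bits = to_bitstream(y, length, rng)
    return from_bitstream(sc_multiply(a_bits, b_bits))


if __name__ == "__main__":
    # Longer bitstreams give a lower-variance estimate at the cost of more cycles.
    for length in (64, 1024, 16384):
        print(length, sc_product(0.5, 0.8, length))
```

The estimate is noisy, and its variance shrinks as the bitstream grows. This is the general accuracy-versus-latency tension in SC hardware that motivates adapting the model to the available voltage/frequency level.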
