ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development ›› 2020, Vol. 57 ›› Issue (4): 709-722. doi: 10.7544/issn1000-1239.2020.20190863

Special Issue: 2020 Special Issue on Data-Driven Networks


DNN Inference Acceleration via Heterogeneous IoT Devices Collaboration

Sun Sheng1,2, Li Xujing1,2, Liu Min1,2, Yang Bo1,2, Guo Xiaobing3   

  1. Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190; 2. University of Chinese Academy of Sciences, Beijing 100049; 3. Lenovo Research, Beijing 100085
  • Online:2020-04-01
  • Supported by: 
    This work was supported by the National Natural Science Foundation of China (61732017, 61872028).

Abstract: Deep neural networks (DNNs) have been intensively deployed in a variety of intelligent applications (e.g., image and video recognition). Nevertheless, due to DNNs’ heavy computation burden, resource-constrained IoT devices are unsuitable for locally executing DNN inference tasks. Existing cloud-assisted approaches are severely affected by unpredictable communication latency and the unstable performance of remote servers. As a countermeasure, leveraging collaborative IoT devices to achieve distributed and scalable DNN inference is a promising paradigm. However, existing works only consider homogeneous IoT devices with static partitioning. Thus, there is an urgent need for a novel framework that adaptively partitions DNN tasks and orchestrates distributed inference among heterogeneous resource-constrained IoT devices. Such a framework faces two main challenges. First, it is difficult to accurately profile the multi-layer inference latency of DNNs. Second, it is difficult to learn the collaborative inference strategy adaptively and in real time in heterogeneous environments. To this end, we first propose an interpretable multi-layer prediction model that abstracts complex layer parameters. Furthermore, we leverage evolutionary reinforcement learning (ERL) to adaptively determine a near-optimal partitioning strategy for DNN inference tasks. Real-world experiments on Raspberry Pi devices show that the proposed method significantly accelerates inference in dynamic and heterogeneous environments.
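To make the partitioning idea concrete, the sketch below is a toy illustration only, not the paper's ERL algorithm: it assumes hypothetical predicted per-layer latencies for two heterogeneous devices (the names `LAYER_LATENCY`, `TRANSFER_MS`, `evolve`, and all numeric values are invented for this example) and runs a tiny evolutionary search over the layer split point that minimizes end-to-end latency, including one intermediate-tensor transfer.

```python
import random

# Hypothetical predicted inference latency (ms) per DNN layer on two
# heterogeneous devices A and B; values are illustrative, not measured.
LAYER_LATENCY = [
    (12.0, 20.0), (8.0, 15.0), (10.0, 6.0),
    (14.0, 7.0), (9.0, 5.0), (11.0, 4.0),
]
TRANSFER_MS = 3.0  # assumed cost of shipping the intermediate tensor A -> B

def total_latency(split):
    """Layers [0, split) run on device A, layers [split, n) on device B."""
    a = sum(LAYER_LATENCY[i][0] for i in range(split))
    b = sum(LAYER_LATENCY[i][1] for i in range(split, len(LAYER_LATENCY)))
    comm = TRANSFER_MS if 0 < split < len(LAYER_LATENCY) else 0.0
    return a + b + comm

def evolve(pop_size=8, generations=30, seed=0):
    """Tiny (mu+lambda)-style evolutionary search over split points."""
    rng = random.Random(seed)
    n = len(LAYER_LATENCY)
    pop = [rng.randint(0, n) for _ in range(pop_size)]
    for _ in range(generations):
        # Mutation: nudge each candidate split point by +/-1 (clamped).
        children = [min(n, max(0, s + rng.choice((-1, 1)))) for s in pop]
        # Truncation selection: keep the lowest-latency half of parents+children.
        pop = sorted(pop + children, key=total_latency)[:pop_size]
    return pop[0]

best = evolve()
print("best split:", best, "latency:", total_latency(best))
```

Because parents survive selection, the best candidate never worsens across generations; on this unimodal toy landscape the search settles on the split that balances the faster early layers on device A against the faster late layers on device B.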

Key words: DNN inference acceleration, heterogeneous device collaboration, evolutionary reinforcement learning, multi-layer prediction model, partitioning strategy
