    Sun Sheng, Li Xujing, Liu Min, Yang Bo, Guo Xiaobing. DNN Inference Acceleration via Heterogeneous IoT Devices Collaboration[J]. Journal of Computer Research and Development, 2020, 57(4): 709-722. DOI: 10.7544/issn1000-1239.2020.20190863

    DNN Inference Acceleration via Heterogeneous IoT Devices Collaboration

    Deep neural networks (DNNs) have been intensively deployed in a variety of intelligent applications (e.g., image and video recognition). Nevertheless, owing to their heavy computational burden, DNN inference tasks are ill-suited to local execution on resource-constrained IoT devices. Existing cloud-assisted approaches suffer from unpredictable communication latency and the unstable performance of remote servers. As a countermeasure, leveraging collaborative IoT devices to achieve distributed and scalable DNN inference is a promising paradigm. However, existing works consider only homogeneous IoT devices with static partitioning. Thus, there is an urgent need for a novel framework that adaptively partitions DNN tasks and orchestrates distributed inference among heterogeneous, resource-constrained IoT devices. This framework faces two main challenges. First, it is difficult to accurately profile the multi-layer inference latency of DNNs. Second, it is difficult to learn a collaborative inference strategy adaptively and in real time in heterogeneous environments. To this end, we first propose an interpretable multi-layer prediction model that abstracts complex layer parameters. Furthermore, we leverage evolutionary reinforcement learning (ERL) to adaptively determine a near-optimal partitioning strategy for DNN inference tasks. Real-world experiments on Raspberry Pi devices show that the proposed method significantly accelerates inference in dynamic and heterogeneous environments.
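    To make the two ideas in the abstract concrete, below is a minimal sketch, not the authors' implementation: part (1) models each layer's latency as an interpretable linear function of abstracted layer parameters (FLOPs and activation size are assumed features), and part (2) uses a plain genetic algorithm as a stand-in for the paper's ERL to search for a layer-to-device assignment that minimizes predicted end-to-end latency. All names, the linear latency model, and the uniform transfer cost are assumptions for illustration.

    ```python
    import random

    # (1) Interpretable per-layer latency predictor: latency is modeled as a
    # linear function of abstracted layer parameters, with coefficients
    # fitted offline per device (the linear form is an assumption).
    class LayerLatencyModel:
        def __init__(self, flops_coef, mem_coef, bias):
            self.flops_coef = flops_coef  # seconds per FLOP on this device
            self.mem_coef = mem_coef      # seconds per byte of activations
            self.bias = bias              # fixed per-layer overhead

        def predict(self, layer):
            return (self.flops_coef * layer["flops"]
                    + self.mem_coef * layer["bytes"]
                    + self.bias)

    # (2) Evolutionary search over partitions: a candidate assigns each DNN
    # layer to one device; fitness is the predicted end-to-end latency,
    # including a transfer cost whenever consecutive layers change devices.
    def fitness(assignment, layers, models, link_latency):
        total = sum(models[dev].predict(layers[i])
                    for i, dev in enumerate(assignment))
        total += sum(link_latency
                     for a, b in zip(assignment, assignment[1:]) if a != b)
        return total

    def evolve(layers, models, link_latency, pop=30, gens=50, mut=0.2):
        n_dev = len(models)
        population = [[random.randrange(n_dev) for _ in layers]
                      for _ in range(pop)]
        for _ in range(gens):
            population.sort(key=lambda a: fitness(a, layers, models, link_latency))
            survivors = population[: pop // 2]   # keep the fittest half
            children = []
            while len(survivors) + len(children) < pop:
                p1, p2 = random.sample(survivors, 2)
                cut = random.randrange(1, len(layers))
                child = p1[:cut] + p2[cut:]      # one-point crossover
                for i in range(len(child)):      # random mutation
                    if random.random() < mut:
                        child[i] = random.randrange(n_dev)
                children.append(child)
            population = survivors + children
        return min(population, key=lambda a: fitness(a, layers, models, link_latency))
    ```

    A hypothetical usage, with made-up layer statistics and two devices of different speeds:

    ```python
    layers = [{"flops": 2e8, "bytes": 1e6},
              {"flops": 5e8, "bytes": 5e5},
              {"flops": 1e8, "bytes": 2e5}]
    models = [LayerLatencyModel(1e-9, 2e-8, 0.002),   # slower device
              LayerLatencyModel(4e-10, 1e-8, 0.001)]  # faster device
    best = evolve(layers, models, link_latency=0.01)
    ```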
