Grip System for Multi-Runtime Support in Things-Edge-Cloud Collaborative Applications

俞子舒; 王一帆; 曾琛; 张星洲; 彭晓晖; 徐志伟

doi:10.7544/issn1000-1239.202440676

Abstract: Researchers have proposed and implemented many runtime systems for different workload types, helping users build single-machine or distributed applications. Each of these runtimes usually handles only certain types of workloads well, and only in specific scenarios. In the things-edge-cloud scenario, the components of a collaborative application differ in their quality requirements, runtime environments, and communication protocols, which makes it difficult to build a high-performance, robust application on a single runtime. Deploying the components independently on different runtimes makes the application harder to manage and leaves it without unified support for performance and fault tolerance. To address these problems, the Grip system is implemented. It supports unified access to and use of multiple runtimes by introducing a virtual runtime adapter layer and a virtual runtime API layer; these two layers specify the interfaces that must be implemented to integrate a runtime. The Grip system manages multi-runtime applications uniformly through the Griplet and Grip abstractions, and it uses an ownership-based method to provide mechanisms for user-defined fault-tolerance and scaling policies. Experiments show that, in a things-edge-cloud environment, compared with using a single runtime such as Ray, Docker, or Kubernetes, the Grip system reduces average end-to-end latency by 31% to 77%, 90th-percentile tail latency by 25% to 78% (26% to 78% in the Chinese-language abstract), and 95th-percentile tail latency by 22% to 78%.
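The idea of an adapter layer that fixes the interfaces a runtime must implement can be illustrated with a minimal sketch. Everything here is hypothetical: the names `RuntimeAdapter`, `InMemoryAdapter`, `Application`, `launch`, `stop`, `register`, `deploy`, and `shutdown` are assumptions made for illustration and are not taken from the Grip system, whose actual interfaces are not given in the abstract. The in-memory adapter stands in for a real runtime such as Ray, Docker, or Kubernetes.

```python
from abc import ABC, abstractmethod


class RuntimeAdapter(ABC):
    """Hypothetical adapter interface; NOT the Grip system's actual API."""

    @abstractmethod
    def launch(self, component: str) -> str:
        """Start a component and return a runtime-specific handle."""

    @abstractmethod
    def stop(self, handle: str) -> None:
        """Stop a previously launched component."""


class InMemoryAdapter(RuntimeAdapter):
    """Stand-in for a real runtime; it only records which handles are running."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.running: set[str] = set()

    def launch(self, component: str) -> str:
        handle = f"{self.name}:{component}"
        self.running.add(handle)
        return handle

    def stop(self, handle: str) -> None:
        self.running.discard(handle)


class Application:
    """Manages components deployed on different runtimes through one object."""

    def __init__(self) -> None:
        self.adapters: dict[str, RuntimeAdapter] = {}
        # component name -> (runtime name, handle returned by that runtime)
        self.handles: dict[str, tuple[str, str]] = {}

    def register(self, runtime: str, adapter: RuntimeAdapter) -> None:
        self.adapters[runtime] = adapter

    def deploy(self, component: str, runtime: str) -> str:
        handle = self.adapters[runtime].launch(component)
        self.handles[component] = (runtime, handle)
        return handle

    def shutdown(self) -> None:
        # Stop every component through the adapter of the runtime that hosts it.
        for runtime, handle in self.handles.values():
            self.adapters[runtime].stop(handle)
        self.handles.clear()
```

In this sketch, a caller registers one adapter per runtime, deploys each component by naming the target runtime, and stops the whole application with a single `shutdown()` call, so the application is managed as one unit regardless of which runtime hosts each component.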
