Pinning Control-Based Routing Policy Generation Using Deep Reinforcement Learning

Sun Penghao; Lan Julong; Shen Juan; Hu Yuxiang

doi:10.7544/issn1000-1239.2021.20200018

Abstract

Computer networks play an important role in modern society. The rapid growth of network scale makes network traffic increasingly complex and hard to model accurately. As a result, finding the optimal routing policy in communication networks becomes an NP-hard problem. Traditional methods for routing and traffic engineering rely mainly on hand-crafted algorithms, which cannot guarantee both accuracy and efficiency. In recent years, routing strategies based on deep reinforcement learning (DRL) have been proposed, and they overcome, to some extent, the shortcomings of manual traffic analysis and modelling by human experts. However, existing DRL-based routing strategies generally scale poorly, which prevents their use in large-scale networks. To address this, this paper proposes Hierar-DRL, a DRL-based routing policy generation technique that employs pinning control theory. Pinning control lets Hierar-DRL select a subset of network nodes as the target control nodes of DRL. By combining pinning control with the automatic policy search ability of DRL, Hierar-DRL achieves better scalability in large networks. Simulation results show that the proposed scheme reduces the average end-to-end transmission delay in the test network topologies by up to 28.5% compared with the state-of-the-art scheme, which demonstrates its effectiveness.
