Pinning Control-Based Routing Policy Generation Using Deep Reinforcement Learning

Sun Penghao; Lan Julong; Shen Juan; Hu Yuxiang

doi:10.7544/issn1000-1239.2021.20200018

(PLA Strategic Support Force Information Engineering University, Zhengzhou 450002) (sphshine@126.com)

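Funds: This work was supported by the National Key Research and Development Program of China (2020YFB1804803), the National Natural Science Foundation of China (62002382, 61702547, 61872382), and the Key Research and Development Project of Guangdong Province (2018B010113001).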

CLC number: TP393
Publication history
- Published: 2021-06-30

Abstract

Computer networks play an important role in modern society. The rapid growth of network scale makes network traffic increasingly complex and hard to model accurately. Under these conditions, finding the optimal routing policy in a communication network is an NP-hard problem. Traditional routing and traffic engineering methods rely mainly on hand-crafted algorithms, which cannot guarantee both accuracy and efficiency. In recent years, routing strategies based on deep reinforcement learning (DRL) have been proposed; to some extent they overcome the shortcomings of manual traffic analysis and modelling by human experts. However, existing DRL-based routing strategies generally scale poorly, which prevents their use in large-scale networks. To address this, this paper proposes Hierar-DRL, a DRL-based routing policy generation technique that employs pinning control theory. Pinning control lets Hierar-DRL select a subset of network nodes as the target control nodes of DRL. By combining pinning control with the automatic policy search ability of DRL, Hierar-DRL scales better in large networks. Simulation results show that the proposed scheme reduces the average end-to-end transmission delay in the test network topologies by up to 28.5% compared with the state-of-the-art scheme, which demonstrates its effectiveness.

Keywords:
- routing optimization
- software-defined networking
- artificial intelligence
- deep reinforcement learning
- pinning control
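
The abstract says that pinning control is used to pick a subset of network nodes as the control targets of DRL, but it does not say how that subset is chosen. The sketch below is only an illustration of the general idea under an assumed rule: it ranks nodes by degree, a common heuristic in the pinning control literature, and keeps the top `k`. The function name `select_pinning_nodes`, the parameter `k`, and the six-node topology are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def select_pinning_nodes(edges, k):
    """Return the k highest-degree nodes of an undirected graph.

    Degree-based selection is an assumption made for illustration;
    the paper's actual selection rule is not given in the abstract.
    """
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # Sort by descending degree; break ties by node id so the result is deterministic.
    ranked = sorted(degree, key=lambda n: (-degree[n], n))
    return ranked[:k]


# Hypothetical 6-node topology (not from the paper).
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (3, 4), (3, 5)]
pinned = select_pinning_nodes(edges, k=2)
print(pinned)  # prints [0, 3]
```

In a design of this kind, a learning agent would adjust routing only at the nodes in `pinned`, so the size of its action space depends on `k` rather than on the total number of nodes, which is one plausible reading of the scalability benefit the abstract claims.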
