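Wireless Resource Allocation Algorithm Based on Multi-Objective Deep Reinforcement Learning for Vehicle-to-Vehicle Communications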

李可; 马赛; 戴朋林; 任婧; 范平志

doi:10.7544/issn1000-1239.202330895

Abstract: Vehicular networks are dynamic and uncertain, carry diverse service types, and have scarce wireless communication resources. We study how to meet multiple quality-of-service requirements while using wireless resources efficiently in cellular vehicle-to-everything (C-V2X) networks where vehicle-to-network (V2N) and vehicle-to-vehicle (V2V) links coexist and share spectrum. First, a multi-objective optimization problem is formulated to model the decision-making process of channel selection and power control in C-V2X. The formulation accounts for dynamic changes in the network environment and aims to balance V2V link performance (age of information, delay, and capacity) against V2N link capacity. On this basis, a V2V wireless resource allocation algorithm based on multi-objective deep reinforcement learning is proposed to train neural networks and solve the problem. The trained neural network model yields the Pareto frontier of the multi-objective optimization problem. Simulation results show that the proposed algorithm effectively balances the achievable performance of the different communication links. Compared with four representative algorithms, it reduces the V2V age of information by 12.0% to 17.2%, increases V2N link capacity by 11.4% to 21.6%, raises the V2V transmission success rate by 4.6 to 13.9 percentage points, and reduces decision delay by 10.6% to 20.3%.
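As a reading aid for the term "Pareto frontier", the following minimal sketch filters a set of candidate operating points down to the non-dominated ones. It is not the authors' algorithm, which obtains the frontier from trained neural networks. The two objectives shown (V2V age of information to be minimized, V2N capacity to be maximized), the function names, and all numeric values are assumptions made up for illustration.

```python
from typing import List, Tuple

# Each point is (v2v_age_of_information, v2n_capacity).
# Hypothetical values; they are not taken from the paper.
Point = Tuple[float, float]


def dominates(a: Point, b: Point) -> bool:
    """Return True if a is no worse than b on both objectives and
    strictly better on at least one.

    Lower age of information is better; higher V2N capacity is better.
    """
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Illustrative candidate trade-offs between the two link types.
candidates = [(2.0, 10.0), (3.0, 14.0), (2.5, 9.0), (4.0, 14.5), (3.5, 12.0)]
front = pareto_front(candidates)
print(front)
```

Each point left in `front` represents a trade-off in which improving one link's metric would require giving up some of the other's, which is the sense of "balance" used in the abstract.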
