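• China's Outstanding Science and Technology Journal
  • CCF-recommended Class A Chinese-language journal
  • T1-class high-quality science and technology journal in the computing field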
Wang Jinyong, Huang Zhiqiu, Yang Deyan, Xiaowei Huang, Zhu Yi, Hua Gaoyang. Spatio-Clock Synchronous Constraint Guided Safe Reinforcement Learning for Autonomous Driving[J]. Journal of Computer Research and Development, 2021, 58(12): 2585-2603. DOI: 10.7544/issn1000-1239.2021.20211023

Spatio-Clock Synchronous Constraint Guided Safe Reinforcement Learning for Autonomous Driving

Funds: This work was supported by the National Key Research and Development Program of China (2018YFB1003900) and the National Natural Science Foundation of China (61772270, 62077029).
  • Published Date: November 30, 2021
  • Autonomous driving systems integrate complex interactions between hardware and software. To ensure safe and reliable operation, formal methods are used at the design stage to provide rigorous guarantees that logical specifications and safety-critical requirements are satisfied. Deep reinforcement learning (DRL), a widely used machine learning architecture, seeks an optimal policy that maximizes the cumulative discounted reward through interaction with the environment, and has been applied to the decision-making modules of autonomous driving systems. However, black-box DRL-based autonomous driving systems can neither guarantee safe operation nor offer interpretable reward definitions for complex tasks, especially when they face unfamiliar situations and must reason about a larger number of options. To address these problems, spatio-clock synchronous constraints are adopted to improve the safety and interpretability of DRL. First, we propose a formal property specification language dedicated to the autonomous driving domain, the spatio-clock synchronous constraint specification language, and present a domain-specific knowledge requirements specification that is close to natural language, which makes the generation of reward functions more interpretable. Second, we present domain-specific spatio-clock synchronous automata to describe spatio-clock autonomous behaviors, i.e., controllers related to certain spatio- and clock-critical actions, and present safe state-action space transition systems to guarantee the safety of the process by which DRL generates an optimal policy. Third, building on the formal specification and policy learning, we propose a formal spatio-clock synchronous constraint guided safe reinforcement learning method whose safe reward function is easy to understand. Finally, we demonstrate the effectiveness of the proposed approach through a case study of autonomous lane changing and overtaking in a highway scenario.
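As a rough illustration of the idea summarized in the abstract, the sketch below restricts a learned policy to actions that satisfy a spatial bound and a clock bound before the greedy choice is made. It is not the paper's specification language, automata, or algorithm. The `State` fields, the `MIN_GAP` and `MIN_DWELL` thresholds, the three-action set, and all function names are hypothetical and chosen only to make the example self-contained.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    """Hypothetical driving state; not taken from the paper."""
    lane: int            # 0 = cruising lane, 1 = overtaking lane
    gap_ahead: float     # metres to the vehicle ahead in the current lane
    gap_adjacent: float  # metres to the nearest vehicle in the other lane
    since_change: int    # clock: steps elapsed since the last lane change


MIN_GAP = 10.0   # assumed spatial bound (metres)
MIN_DWELL = 3    # assumed clock bound (steps between lane changes)


def safe_actions(s: State) -> list[str]:
    """Return the actions that satisfy both the spatial and the clock bound."""
    allowed = ["brake"]  # assumption: braking is always permitted
    if s.gap_ahead >= MIN_GAP:
        allowed.append("keep")
    if s.gap_adjacent >= MIN_GAP and s.since_change >= MIN_DWELL:
        allowed.append("change_lane")
    return allowed


def shielded_greedy(q_values: dict[str, float], s: State) -> str:
    """Pick the highest-valued action among those the constraints allow."""
    return max(safe_actions(s), key=lambda a: q_values[a])


def discounted_return(rewards: list[float], gamma: float = 0.9) -> float:
    """Cumulative discounted reward: sum over t of gamma**t * rewards[t]."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


if __name__ == "__main__":
    q = {"keep": 1.0, "change_lane": 0.5, "brake": 0.1}
    close_behind = State(lane=0, gap_ahead=5.0, gap_adjacent=20.0, since_change=5)
    print(shielded_greedy(q, close_behind))
```

In this sketch the unconstrained greedy choice would be `keep`, but the spatial bound rules it out because the gap ahead is too small, so the filtered policy falls back to the best remaining action. A constraint-guided method would typically also shape the reward, which this example leaves out.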
