ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development, 2019, Vol. 56, Issue (8): 1708-1720. doi: 10.7544/issn1000-1239.2019.20190155

Special Issue: 2019 Special Topic on Frontier Advances in Artificial Intelligence

An Experience-Guided Deep Deterministic Actor-Critic Algorithm with Multi-Actor

Chen Hongming [1], Liu Quan [1,2,3,4], Yan Yan [1], He Bin [1], Jiang Yubin [1], Zhang Linlin [1]

  1. School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006
  2. Provincial Key Laboratory for Computer Information Processing Technology (Soochow University), Suzhou, Jiangsu 215006
  3. Key Laboratory of Symbolic Computation and Knowledge Engineering (Jilin University), Ministry of Education, Changchun 130012
  4. Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing 210000
  • Online: 2019-08-01

Abstract: Continuous control has long been an important research direction in reinforcement learning. In recent years, advances in deep learning (DL) and the introduction of the deterministic policy gradient (DPG) algorithm have provided many useful ideas for solving continuous control problems. The main difficulty these methods face is exploration in a continuous action space. Some of them explore by injecting external noise into the action space, but this approach performs poorly on some continuous control tasks. This paper proposes an experience-guided deep deterministic actor-critic algorithm with multi-actor (EGDDAC-MA) that uses no external noise. Instead, it learns a guiding network from excellent experiences to guide the updates of the actor network and the critic network. In addition, it uses a multi-actor actor-critic (AC) model that assigns a different actor to each phase of an episode. These actors are independent and do not interfere with one another. Experimental results show that, compared with the DDPG, TRPO, and PPO algorithms, the proposed algorithm performs better on most continuous control tasks in the Gym simulation platform.
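The multi-actor idea described in the abstract, one independent actor per phase of an episode, can be illustrated with a small sketch. This is only one possible reading and not the authors' implementation: it assumes that phases are equal-length segments of time steps, and the names `phase_index`, `make_linear_actor`, and `act` are hypothetical. The linear "actors" stand in for actor networks; they are deterministic and add no external noise.

```python
from typing import Callable, List, Sequence

Action = List[float]
Actor = Callable[[Sequence[float]], Action]


def phase_index(step: int, episode_len: int, n_phases: int) -> int:
    """Map a time step to a phase index.

    Assumption (not from the paper): an episode is split into
    n_phases segments of equal length.
    """
    if not 0 <= step < episode_len:
        raise ValueError("step must lie in [0, episode_len)")
    return step * n_phases // episode_len


def make_linear_actor(gain: float) -> Actor:
    """Hypothetical stand-in for an actor network: deterministic, no noise."""
    def actor(state: Sequence[float]) -> Action:
        return [gain * s for s in state]
    return actor


def act(actors: Sequence[Actor], state: Sequence[float],
        step: int, episode_len: int) -> Action:
    """Pick the actor that owns the current phase and query only that actor."""
    k = phase_index(step, episode_len, len(actors))
    return actors[k](state)


# Three independent actors, one per phase of a 9-step episode.
actors = [make_linear_actor(g) for g in (1.0, 2.0, 3.0)]
action = act(actors, [1.0, -0.5], step=4, episode_len=9)
```

Because each time step is routed to exactly one actor, an update to one phase's actor would not change the behavior of the others, which matches the abstract's statement that the actors do not interfere with each other.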

Key words: reinforcement learning, deep reinforcement learning, deterministic actor-critic, experience guiding, expert guiding, multi-actor
