A Heuristic Two-layer Reinforcement Learning Algorithm Based on BP Neural Networks

Liu Zhibin; Zeng Xiaoqin; Liu Huiyi; Chu Rong

doi:10.7544/issn1000-1239.2015.20131270

Liu Zhibin, Zeng Xiaoqin, Liu Huiyi, Chu Rong. A Heuristic Two-layer Reinforcement Learning Algorithm Based on BP Neural Networks[J]. Journal of Computer Research and Development, 2015, 52(3): 579-587. DOI: 10.7544/issn1000-1239.2015.20131270

Abstract

Reinforcement learning is a promising approach by which an agent learns to interact with its environment through repeated training. However, it suffers from the curse of dimensionality, and its low efficiency makes it hard to apply to large-scale problems. Embedding static prior knowledge can improve the performance of reinforcement learning, but inappropriate knowledge often misguides the learning process or slows it down. This paper proposes NNH-QL, an online heuristic two-layer reinforcement learning algorithm based on BP neural networks, to avoid the blindness and limitations of previous learning methods. The top layer, which serves as the reward shaping function, consists of BP neural networks. Through shaping, this qualitative top layer supplies knowledge acquired dynamically online to guide the table-based Q-learning. To improve the learning efficiency of the qualitative layer, eligibility traces are incorporated into the training of the BP neural networks. NNH-QL combines the flexibility of standard Q-learning with the generalization ability of BP neural networks, which offers a feasible way to solve reinforcement learning problems with larger state spaces. For evaluation, NNH-QL is applied to an optimal path search problem. The results show that the algorithm improves learning performance and markedly accelerates the learning process.
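The two-layer idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the shaping formula, the network architecture, or the test environment. The sketch assumes a potential-based shaping term of the form gamma * V(s') - V(s), a single hidden layer trained by TD(lambda), a 5x5 grid world with the goal in the far corner, and illustrative hyperparameters. The names `ValueNet` and `train` are hypothetical.

```python
import math
import random


class ValueNet:
    """One-hidden-layer BP network V(s), trained online by TD(lambda)."""

    def __init__(self, n_in, n_hidden, rng, lr=0.02, gamma=0.95, lam=0.5):
        self.w1 = [[rng.uniform(-0.1, 0.1) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden + 1)]
        self.lr, self.gamma, self.lam = lr, gamma, lam
        self.reset_traces()

    def reset_traces(self):
        self.e1 = [[0.0] * len(row) for row in self.w1]
        self.e2 = [0.0] * len(self.w2)

    def _forward(self, x):
        xb = list(x) + [1.0]  # inputs plus bias
        hb = [math.tanh(sum(w * v for w, v in zip(row, xb))) for row in self.w1]
        hb.append(1.0)  # hidden activations plus bias
        return sum(w * v for w, v in zip(self.w2, hb)), xb, hb

    def value(self, x):
        return self._forward(x)[0]

    def td_update(self, x, reward, x_next, done):
        v, xb, hb = self._forward(x)
        v_next = 0.0 if done else self.value(x_next)
        delta = reward + self.gamma * v_next - v  # TD error
        decay = self.gamma * self.lam
        # Decay the traces and add the backpropagated gradient of V(x).
        for i, row in enumerate(self.e1):
            g = self.w2[i] * (1.0 - hb[i] ** 2)
            for k in range(len(row)):
                row[k] = decay * row[k] + g * xb[k]
        for j in range(len(self.e2)):
            self.e2[j] = decay * self.e2[j] + hb[j]
        # Move every weight along its trace, scaled by the TD error.
        for i, row in enumerate(self.w1):
            for k in range(len(row)):
                row[k] += self.lr * delta * self.e1[i][k]
        for j in range(len(self.w2)):
            self.w2[j] += self.lr * delta * self.e2[j]


def train(size=5, episodes=200, seed=0, max_steps=200):
    """Tabular Q-learning whose reward is shaped by the network (assumed form)."""
    rng = random.Random(seed)
    goal = (size - 1, size - 1)
    moves = [(0, 1), (0, -1), (1, 0), (-1, 0)]
    net = ValueNet(2, 6, rng)
    q = {}  # state -> list of four action values
    alpha, gamma, eps = 0.5, 0.95, 0.1

    def feat(s):
        return (s[0] / (size - 1), s[1] / (size - 1))

    steps_per_episode = []
    for _ in range(episodes):
        s, steps = (0, 0), 0
        net.reset_traces()
        while s != goal and steps < max_steps:
            qs = q.setdefault(s, [0.0] * 4)
            a = rng.randrange(4) if rng.random() < eps else qs.index(max(qs))
            s2 = (min(max(s[0] + moves[a][0], 0), size - 1),
                  min(max(s[1] + moves[a][1], 0), size - 1))
            done = s2 == goal
            r = 1.0 if done else -0.01
            # Top layer: shaping term from the current network estimate.
            phi = net.value(feat(s))
            phi2 = 0.0 if done else net.value(feat(s2))
            shaped = r + gamma * phi2 - phi
            # Table-based Q-learning update on the shaped reward.
            q2 = q.setdefault(s2, [0.0] * 4)
            target = shaped + (0.0 if done else gamma * max(q2))
            qs[a] += alpha * (target - qs[a])
            # The network itself learns from the unshaped reward.
            net.td_update(feat(s), r, feat(s2), done)
            s, steps = s2, steps + 1
        steps_per_episode.append(steps)
    return q, net, steps_per_episode
```

Calling `q, net, steps = train()` returns the Q-table, the trained network, and the number of steps taken in each episode, which can be plotted to inspect learning speed. Because the network generalizes across nearby states, its estimate can guide table entries that have not yet been visited often; how closely this matches the behaviour reported for NNH-QL depends on the assumptions above.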
