Abstract:
Traditional reinforcement learning methods such as Q-learning maintain a table that maps states to actions. This simple dual-layer mapping structure has been widely used in applied settings. However, the state-action dual-layer structure lacks flexibility, and prior knowledge cannot be used effectively to guide the learning process. To address this problem, a new reinforcement learning framework, called multi-motive reinforcement learning (MMRL), is proposed. Between the state layer and the action layer, the MMRL framework introduces a motive layer, in which multiple motives can be set based on prior experience. In this way, the original state-action dual-layer structure is extended to a state-motive-action triple-layer structure. Under this framework, two new algorithms are presented: the MMQ-unique algorithm and the MMQ-voting algorithm. Moreover, it is shown that traditional reinforcement learning methods can be seen as a degenerate form of multi-motive reinforcement learning; that is, the MMRL framework is a superset of the traditional methods. The new framework and its algorithms improve the flexibility of reinforcement learning through the added motive layer and exploit prior knowledge to speed up learning. Experiments demonstrate that, with reasonably chosen motives, multi-motive reinforcement learning significantly outperforms traditional reinforcement learning methods.
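
To make the triple-layer idea concrete, the following is a minimal Python sketch of one way a state-motive-action table could be organized. It is not the paper's actual MMQ-unique or MMQ-voting definition: the class name, the per-motive reward signal, and the simple majority vote over greedy actions are illustrative assumptions based only on the structure described in the abstract.

```python
import random
from collections import defaultdict

# Illustrative sketch only: the update rule and voting scheme below are
# assumptions, not the paper's MMQ-unique / MMQ-voting algorithms.

class MultiMotiveQ:
    """Tabular Q-values indexed by (motive, state, action) instead of (state, action)."""

    def __init__(self, motives, actions, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.motives = motives            # e.g. ["reach_goal", "avoid_hazard"] (hypothetical)
        self.actions = actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        # One Q-table per motive: q[m][(state, action)] -> value
        self.q = {m: defaultdict(float) for m in motives}

    def act_voting(self, state):
        """Voting-style action selection (assumed): each motive votes for its greedy action."""
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        votes = defaultdict(int)
        for m in self.motives:
            best = max(self.actions, key=lambda a: self.q[m][(state, a)])
            votes[best] += 1
        return max(votes, key=votes.get)

    def update(self, motive_rewards, state, action, next_state):
        """Per-motive Q-learning update; motive_rewards maps motive -> reward (assumed)."""
        for m, r in motive_rewards.items():
            best_next = max(self.q[m][(next_state, a)] for a in self.actions)
            td = r + self.gamma * best_next - self.q[m][(state, action)]
            self.q[m][(state, action)] += self.alpha * td

# With a single motive, this structure reduces to an ordinary state-action Q-table,
# consistent with the abstract's claim that the dual-layer structure is a special case.
```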