Zhao Fengfei and Qin Zheng. A Multi-Motive Reinforcement Learning Framework[J]. Journal of Computer Research and Development, 2013, 50(2): 240-247.

    A Multi-Motive Reinforcement Learning Framework

Abstract: Traditional reinforcement learning methods such as Q-learning maintain a table that maps states to actions. This simple dual-layer mapping structure has been widely used in many applications. However, the dual-layer state-action structure lacks flexibility, and prior knowledge cannot be used effectively to guide the learning process. To solve this problem, a new reinforcement learning framework called multi-motive reinforcement learning (MMRL) is proposed. The MMRL framework introduces a motive layer between the state layer and the action layer, in which multiple motives can be set based on experience. In this way, the original state-action dual-layer structure is extended to a state-motive-action triple-layer structure. Under this framework, two corresponding algorithms are presented: MMQ-unique and MMQ-voting. Moreover, it is shown that traditional reinforcement learning methods can be seen as a degenerate form of multi-motive reinforcement learning; that is, the multi-motive reinforcement learning framework is a superset of the traditional methods. By adding the motive layer, the framework and its algorithms improve the flexibility of reinforcement learning and exploit prior knowledge to speed up the learning process. Experiments demonstrate that, with reasonably chosen motives, multi-motive reinforcement learning performs significantly better than traditional reinforcement learning methods.
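
To make the triple-layer idea concrete, the following Python sketch shows one plausible reading of the framework: each motive owns its own Q-table and encodes prior knowledge as a state-dependent action filter, and a voting agent combines the motives by majority vote over their greedy choices, in the spirit of MMQ-voting. The abstract does not specify the internals of MMQ-unique or MMQ-voting, so the MotiveQLearner and VotingAgent classes, the allowed-action filter, the hyperparameters, and the voting rule below are illustrative assumptions, not the paper's definitions.

```python
import random
from collections import defaultdict

class MotiveQLearner:
    """One Q-learner per motive. Prior knowledge enters through `allowed`,
    a function mapping a state to the subset of actions this motive permits
    (an illustrative assumption, not the paper's exact mechanism)."""

    def __init__(self, actions, allowed, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.actions = actions              # full action set
        self.allowed = allowed              # state -> subset of actions
        self.q = defaultdict(float)         # Q-values keyed by (state, action)
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def choose(self, state):
        acts = self.allowed(state) or self.actions   # fall back to all actions
        if random.random() < self.epsilon:           # epsilon-greedy exploration
            return random.choice(acts)
        return max(acts, key=lambda a: self.q[(state, a)])

    def update(self, s, a, reward, s_next):
        # Standard one-step Q-learning update on this motive's own table.
        best_next = max(self.q[(s_next, a2)] for a2 in self.actions)
        self.q[(s, a)] += self.alpha * (reward + self.gamma * best_next - self.q[(s, a)])

class VotingAgent:
    """Combines several motive learners by majority vote over their greedy
    actions (the vote rule and tie-breaking here are assumptions)."""

    def __init__(self, learners):
        self.learners = learners

    def choose(self, state):
        votes = defaultdict(int)
        for learner in self.learners:
            votes[learner.choose(state)] += 1
        return max(votes, key=votes.get)     # most-voted action wins

    def update(self, s, a, reward, s_next):
        for learner in self.learners:        # every motive learns from the shared experience
            learner.update(s, a, reward, s_next)
```

Under this reading, the degenerate-form claim in the abstract corresponds to the single-motive case: VotingAgent([MotiveQLearner(actions, lambda s: actions)]) behaves exactly like standard epsilon-greedy Q-learning, since the lone motive never filters the action set and wins every vote.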
