
    A Multi-Agent Reinforcement Learning Method Based on Role Tracking

    • Abstract: In a multi-agent system, an agent can continuously acquire and reinforce knowledge and capabilities through learning, and then choose reasonable actions to maximize its own benefit. However, existing work on agent learning is mostly restricted to the single-agent setting, or considers only confrontations between individual agents rather than between groups; it ignores the roles agents play within a team and relies solely on observed payoffs to infer opponents' policies, which slows the algorithm's convergence. To address these limitations, single-agent learning is extended to group-agent learning in a non-communicating group-confrontation environment. Taking the particularities of different learning problems into account, a role attribute is added to the learning model, and a multi-agent reinforcement learning algorithm based on role tracking is proposed and analyzed experimentally. During learning, opponents' roles are tracked dynamically, the learning rate is determined by how well each opponent's role matches its observed actions, and the utility value of every state is updated with the minmax-Q algorithm, which accelerates convergence and improves on the work of Bowling and Littman.
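
      Below is a minimal sketch in Python of the kind of update step the abstract describes: a minimax-Q style value update whose effective learning rate is scaled by a role-matching score. The function names, the `match` argument, and the data layout (one payoff matrix per state over own and opponent actions) are illustrative assumptions for this sketch, not the paper's actual implementation.

```python
# Hypothetical sketch: minimax-Q value update with a role-matching-scaled
# learning rate. All names and the data layout are assumptions for illustration.
import numpy as np
from scipy.optimize import linprog


def matrix_game_value(Q_s: np.ndarray) -> float:
    """Value of the zero-sum stage game  max_pi min_o  sum_a pi(a) * Q_s[a, o],
    solved as a small linear program (the standard minimax-Q inner step)."""
    n_a, n_o = Q_s.shape
    # Variables: [pi_1 ... pi_{n_a}, v]; objective: maximize v -> minimize -v.
    c = np.zeros(n_a + 1)
    c[-1] = -1.0
    # For every opponent action o:  v - sum_a pi(a) * Q_s[a, o] <= 0.
    A_ub = np.hstack([-Q_s.T, np.ones((n_o, 1))])
    b_ub = np.zeros(n_o)
    # The mixed policy pi must sum to one.
    A_eq = np.ones((1, n_a + 1))
    A_eq[0, -1] = 0.0
    b_eq = np.array([1.0])
    bounds = [(0.0, 1.0)] * n_a + [(None, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return float(res.x[-1])


def role_tracking_update(Q, V, s, a, o, r, s_next, gamma, base_alpha, match):
    """One learning step: the effective learning rate grows with `match`,
    an assumed score in [0, 1] for how well the opponent's observed action
    fits the role currently tracked for it."""
    alpha = base_alpha * match                      # role-matching scaled rate
    target = r + gamma * V[s_next]                  # one-step bootstrapped target
    Q[s][a, o] += alpha * (target - Q[s][a, o])     # temporal-difference update
    V[s] = matrix_game_value(Q[s])                  # minimax value of the stage game


# Tiny usage example with two states, two own actions and two opponent actions.
Q = {0: np.zeros((2, 2)), 1: np.zeros((2, 2))}
V = {0: 0.0, 1: 0.0}
role_tracking_update(Q, V, s=0, a=1, o=0, r=1.0, s_next=1,
                     gamma=0.9, base_alpha=0.5, match=0.8)
```

      The design point the sketch illustrates is that a high role-action match lets the learner trust the new observation and move its estimates faster, while a poor match damps the update, which is how role tracking can speed up convergence relative to a fixed learning rate.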

       

