Abstract:
In a multi-agent system, an agent can continuously extend and improve its knowledge and capabilities through learning, and then choose a reasonable action to maximize its benefit. However, existing work is mostly restricted to the single-agent setting, or considers only confrontations between two individual agents rather than confrontations between groups. Existing work also ignores the roles that agents play within a group: the opponents' policy is estimated only from observed benefits, which makes the algorithm converge slowly to a static policy. To address these limitations, we extend the single-agent learning process to a group learning process for group confrontation in an environment where agents cannot communicate. Taking into account the particular features of different learning problems, we present a reinforcement learning algorithm based on role tracking, together with an experimental analysis. The concept of a role is added to the learning model by dynamically tracking the opponents' roles during learning. The learning rate is determined by computing how well the opponents' tracked roles match their observed actions, and the benefit value of every state is then updated with the minimax-Q algorithm to accelerate convergence, which improves on the work of Bowling and Littman.
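The update described above can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the function names, the linear rule mapping the role-match value to a learning rate (including its direction, here "learn faster when the observed action is surprising under the tracked role"), and the constants are all assumptions made for illustration. For brevity, the state value uses a pure-strategy maximin in place of the mixed-strategy linear program of Littman's minimax-Q.

```python
from collections import defaultdict


def role_match(role_policy, opp_action):
    """Match value: probability the tracked role assigns to the observed action.

    `role_policy` is a hypothetical dict {opponent_action: probability}.
    """
    return role_policy.get(opp_action, 0.0)


def learning_rate(match, alpha_low=0.1, alpha_high=0.5):
    """Assumed rule: a poor match (surprising action) gives a larger step."""
    match = min(1.0, max(0.0, match))  # clamp to [0, 1]
    return alpha_low + (alpha_high - alpha_low) * (1.0 - match)


def maximin_value(q_state, actions, opp_actions):
    """Pure-strategy maximin value of a state (simplification of minimax-Q)."""
    return max(min(q_state[(a, o)] for o in opp_actions) for a in actions)


def update(Q, s, a, o, r, s_next, alpha, gamma, actions, opp_actions):
    """Minimax-Q style update of Q[s][(a, o)] toward r + gamma * V(s_next)."""
    target = r + gamma * maximin_value(Q[s_next], actions, opp_actions)
    Q[s][(a, o)] = (1.0 - alpha) * Q[s][(a, o)] + alpha * target
    return Q[s][(a, o)]


# Usage with hypothetical actions, role, and reward.
actions = ["advance", "hold"]
opp_actions = ["attack", "defend"]
Q = defaultdict(lambda: defaultdict(float))  # Q[state][(action, opp_action)]

tracked_role = {"attack": 0.8, "defend": 0.2}  # assumed "attacker" role
match = role_match(tracked_role, "defend")      # opponent acted out of role
alpha = learning_rate(match)
new_q = update(Q, "s0", "advance", "defend", 1.0, "s1",
               alpha, 0.9, actions, opp_actions)
```

In this sketch, an opponent action that the tracked role considers unlikely yields a low match value and therefore a larger learning rate, so the value estimates react quickly when the role estimate appears to be wrong.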