Abstract:
Numerous multi-agent tasks exhibit a nearly decomposable structure, wherein interactions among agents within the same interaction set are strong while interactions between different sets are weak. Efficiently modeling this structure and leveraging it to coordinate agents can improve the learning efficiency of multi-agent reinforcement learning algorithms on cooperative tasks, yet existing work typically neglects this structure and thus fails to exploit it. To address this limitation, we model the nearly decomposable structure with a dynamic graph and propose a novel algorithm named coordinated subtask pattern (CSP) that enhances both local and global coordination among agents. Specifically, CSP identifies agents' interaction sets as subtasks and utilizes a bi-level structure to periodically distribute agents into multiple subtasks, ensuring an accurate characterization of their interactions on the dynamic graph. Based on this subtask assignment, CSP introduces intra-subtask and inter-subtask pattern constraints to facilitate both local and global coordination. These two constraints ensure that agents within the same subtask are aware of one another's action selections and that all agents select superior joint actions that maximize overall task performance. Experimentally, we evaluate CSP on multiple maps of the SMAC benchmark, where its superior performance over multiple baseline algorithms demonstrates its effectiveness in coordinating agents efficiently.