Multi-Modal Imitation Learning Method with Cosine Similarity

Hao Shaopu; Liu Quan; Xu Ping’an; Zhang Lihua; Huang Zhigang

doi:10.7544/issn1000-1239.202220119

Hao Shaopu, Liu Quan, Xu Ping’an, Zhang Lihua, Huang Zhigang. Multi-Modal Imitation Learning Method with Cosine Similarity[J]. Journal of Computer Research and Development, 2023, 60(6): 1358-1372. DOI: 10.7544/issn1000-1239.202220119


Abstract

Generative adversarial imitation learning (GAIL) is an inverse reinforcement learning (IRL) method that uses a generative adversarial framework to imitate expert policies from expert demonstrations. In practical tasks, expert demonstrations are often generated by multi-modal policies. However, most existing GAIL methods assume that the demonstrations come from a single-modal policy, which leads to mode collapse: the learner captures only some of the modal policies. This greatly limits the applicability of GAIL to multi-modal tasks. To address mode collapse, we propose a multi-modal imitation learning method with cosine similarity (MCS-GAIL). The method introduces an encoder and a policy group. The encoder extracts the modal features of the expert demonstrations; the cosine similarity between the features of policy-generated samples and those of the expert demonstrations is then computed and added to the loss function of the policy group, to help the policy group learn the expert policies of the corresponding modalities. In addition, MCS-GAIL uses a new min-max game formulation so that the policy group learns the different modal policies in a complementary way. Under certain assumptions, we prove the convergence of MCS-GAIL by theoretical analysis. To verify its effectiveness, we implement MCS-GAIL on the Grid World and MuJoCo platforms and compare it with existing methods for mitigating mode collapse. The experimental results show that MCS-GAIL effectively learns multiple modal policies in all tested environments with high accuracy and stability.
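
The cosine-similarity term described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names `cosine_similarity` and `similarity_penalty`, the `weight` parameter, and the choice of `1 - similarity` as the penalty are assumptions made for illustration, and the feature arrays stand in for the output of an encoder that is not shown here.

```python
import numpy as np


def cosine_similarity(a, b, eps=1e-8):
    """Row-wise cosine similarity between two (n, d) feature arrays.

    eps guards against division by zero for all-zero feature rows.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    dot = np.sum(a * b, axis=-1)
    norms = np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1)
    return dot / (norms + eps)


def similarity_penalty(policy_features, expert_features, weight=1.0):
    """Hypothetical auxiliary loss term (an assumption, not the paper's formula).

    The value is near 0 when policy-sample features point in the same
    direction as the expert features, and grows as they diverge.
    """
    sim = cosine_similarity(policy_features, expert_features)
    return weight * float(np.mean(1.0 - sim))
```

In such a sketch, the penalty would be added to each policy's usual adversarial loss, so that a policy is rewarded for producing samples whose encoded features align with the demonstrations of its assigned modality.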
