Abstract:
Motivated by applications in autonomous robotic systems and complex decision-making problems, reinforcement learning is becoming increasingly popular in the machine learning community. Traditional reinforcement learning is a learning paradigm in which an agent learns from its interactions with the environment. However, in the vast majority of cases, the environment of a sequential decision-making problem cannot provide an explicit reward signal immediately, or the reward signal is heavily delayed. This is a bottleneck for applying reinforcement learning methods to more complex tasks. Inverse reinforcement learning was therefore proposed to recover the reward function of a Markov decision process (MDP) from expert demonstrations, under the assumption that the demonstrations are optimal. To date, imitation learning algorithms that combine direct reinforcement learning approaches with inverse reinforcement learning approaches have made great progress. This paper briefly introduces the basic concepts of reinforcement learning, inverse reinforcement learning, and imitation learning. It also introduces the existing problems of inverse reinforcement learning and some other methods in imitation learning. Finally, we describe the bottlenecks that arise when applying these methods to real-world applications.