Abstract:
State variables in real-world problems are usually continuous and real-valued, whereas standard reinforcement learning methods are suitable only for problems with a finite set of discrete states. To apply reinforcement learning to real-world problems, the representation of continuous states must be handled properly. There are two main kinds of approaches: parameterized function approximation and discretization. Based on an analysis of the advantages and disadvantages of current adaptive partition methods, a partition method based on node-growing k-means clustering is proposed. Reinforcement learning methods based on the proposed clustering algorithm are presented for both discrete-action and continuous-action problems. Simulations are conducted on the mountain-car problem with discrete actions and on the double-integrator problem with continuous actions. The results show that the proposed method can adaptively adjust the partition resolution and achieve an adaptive partition of the continuous state space, while the optimal policy is learned at the same time.