    Liu Quan, Yan Qicui, Fu Yuchen, Hu Daojing, and Gong Shengrong. A Hierarchical Reinforcement Learning Method Based on Heuristic Reward Function[J]. Journal of Computer Research and Development, 2011, 48(12): 2352-2358.

    A Hierarchical Reinforcement Learning Method Based on Heuristic Reward Function

    • Reinforcement learning concerns controlling an autonomous agent in an unknown environment, often called the state space. The agent has no prior knowledge of the environment and can acquire knowledge only by acting in it. Reinforcement learning, and Q-learning in particular, faces a major problem: learning the Q-function in tabular form may be infeasible, because the memory needed to store the table is excessive and the Q-function converges only after every state has been visited many times. Large state spaces thus inevitably produce the "curse of dimensionality", in which the state space grows exponentially with the number of features and convergence slows down. A hierarchical reinforcement learning method based on a heuristic reward function is proposed to address this problem. The method greatly reduces the state space and speeds up learning: actions are chosen purposefully and efficiently so as to optimize the reward function and accelerate convergence. The method is applied to the Tetris game. Analysis of the algorithm and experimental results show that the method can partly overcome the "curse of dimensionality" and markedly accelerate convergence.
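The paper's hierarchical method is not reproduced here, but the core idea the abstract describes — biasing tabular Q-learning with a heuristic reward so that convergence speeds up — can be sketched on a toy corridor task. Everything below (the task, the heuristic, and all constants) is illustrative, not taken from the paper:

```python
import random

N = 6                 # corridor states 0..5; the goal is state 5
GOAL = N - 1
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.1
ACTIONS = (-1, 1)     # move left / move right

def heuristic(s):
    # Illustrative heuristic: states closer to the goal score higher.
    return s

def step(s, a):
    s2 = min(max(s + a, 0), GOAL)
    base = 10.0 if s2 == GOAL else 0.0
    # Potential-based shaping F = gamma*h(s') - h(s): densifies the reward
    # signal without changing which policy is optimal.
    return s2, base + GAMMA * heuristic(s2) - heuristic(s), s2 == GOAL

def train(episodes=500, seed=0):
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in range(N) for a in ACTIONS}
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # Epsilon-greedy action selection over the tabular Q-function.
            if rng.random() < EPS:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda act: Q[(s, act)])
            s2, r, done = step(s, a)
            target = r if done else r + GAMMA * max(Q[(s2, act)] for act in ACTIONS)
            Q[(s, a)] += ALPHA * (target - Q[(s, a)])
            s = s2
    return Q

Q = train()
# Greedy policy for every non-goal state; with shaping it becomes "move right".
policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(GOAL)}
```

Without the shaping term the agent sees a nonzero reward only at the goal, so Q-values propagate back one state per visit; the heuristic gives informative feedback at every step, which is the convergence-speed effect the abstract claims for its heuristic reward function.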
