Zhu Fei, Liu Quan, Fu Qiming, Fu Yuchen. A Least Square Actor-Critic Approach for Continuous Action Space[J]. Journal of Computer Research and Development, 2014, 51(3): 548-558.

A Least Square Actor-Critic Approach for Continuous Action Space

  • Published Date: March 14, 2014
  • Reinforcement learning in continuous action spaces remains one of the most challenging open problems in the field. Conventional reinforcement learning algorithms usually target small-scale problems with discrete action spaces. For problems with continuous action spaces, most approaches use prior information to discretize the space and then search for the optimal solution. In many practical applications, however, the action space is continuous and little prior information is available for discretizing it appropriately. To address this problem, we propose a least square actor-critic algorithm (LSAC) for continuous action spaces. LSAC uses function approximators to represent the value function and the policy, and uses an online least square method to obtain the parameters of both; the approximate value function acts as the critic and guides the solution of the parameters of the approximate policy. We applied LSAC to the cart pole balancing problem and the mountain car problem, both of which have continuous action spaces, and compared the results with those of two classic algorithms, Cacla (continuous actor-critic learning automaton) and eNAC (episodic natural actor-critic). The experimental results show that LSAC solves continuous action space problems well and achieves better execution performance.
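The abstract does not state LSAC's update equations, so the following is only a minimal sketch of the general idea it describes: a critic and an actor that are both linear in some features and both fitted by an online (recursive) least square update. Everything specific here is an assumption for illustration and not the authors' method: the `RecursiveLeastSquares` class, the polynomial `features` map for a 1-D state, the recursive LSTD(0) form of the critic, and the rule of moving the actor toward an explored action only when the TD error is positive (a criterion borrowed from Cacla).

```python
class RecursiveLeastSquares:
    """Online solver for sum_t z_t d_t^T theta = sum_t z_t y_t.

    With z == d this is ordinary recursive least squares regression.
    With z = phi(s) and d = phi(s) - gamma * phi(s') it is recursive LSTD(0).
    """

    def __init__(self, n, delta=100.0):
        self.theta = [0.0] * n
        # P tracks the inverse of the accumulated matrix; delta * I is a weak prior.
        self.P = [[delta if i == j else 0.0 for j in range(n)] for i in range(n)]

    def predict(self, x):
        return sum(t * xi for t, xi in zip(self.theta, x))

    def update(self, z, d, y):
        n = len(self.theta)
        Pz = [sum(self.P[i][j] * z[j] for j in range(n)) for i in range(n)]
        dP = [sum(d[i] * self.P[i][j] for i in range(n)) for j in range(n)]
        denom = 1.0 + sum(d[i] * Pz[i] for i in range(n))
        err = y - sum(d[i] * self.theta[i] for i in range(n))
        for i in range(n):
            self.theta[i] += Pz[i] * err / denom
            for j in range(n):
                # Sherman-Morrison rank-one update of the inverse.
                self.P[i][j] -= Pz[i] * dP[j] / denom


def features(s):
    # Hypothetical polynomial features for a 1-D state (an assumption).
    return [1.0, s, s * s]


def lsac_step(critic, actor, s, a, r, s_next, gamma=0.95, done=False):
    """One illustrative actor-critic step; returns the TD error."""
    phi = features(s)
    phi_next = [0.0] * len(phi) if done else features(s_next)
    td_error = r + gamma * critic.predict(phi_next) - critic.predict(phi)

    # Critic: recursive LSTD(0) on the observed transition.
    d = [p - gamma * pn for p, pn in zip(phi, phi_next)]
    critic.update(phi, d, r)

    # Actor (assumed Cacla-style rule): regress the policy mean toward the
    # explored action only if the critic judged it better than expected.
    if td_error > 0:
        actor.update(phi, phi, a)
    return td_error
```

In use, one would create `critic = RecursiveLeastSquares(3)` and `actor = RecursiveLeastSquares(3)`, pick each action as `actor.predict(features(s))` plus exploration noise, and call `lsac_step` after every transition. The critic's TD error is what "guides" the actor here; a different guidance rule would slot into the last `if` block.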
