ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development, 2022, Vol. 59, Issue (2): 329-341. doi: 10.7544/issn1000-1239.20210905

Special Issue: 2022 Special Topic on Spatial Data Intelligence


Dynamic Ride-Hailing Route Planning Based on Deep Reinforcement Learning

Zheng Bolong(1), Ming Lingfeng(1), Hu Qi(1), Fang Yixiang(2), Zheng Kai(3), Li Guohui(1)

  1. School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074
  2. School of Data Science, The Chinese University of Hong Kong (Shenzhen), Shenzhen, Guangdong 518172
  3. School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 610054
  • Online:2022-02-01
  • Supported by: 
    This work was supported by the National Natural Science Foundation of China (61902134, 62011530437), Hubei Natural Science Foundation (2020CFB871), and the Fundamental Research Funds for the Central Universities (2019kfyXKJC021, 2019kfyXJJS091).

Abstract: With the rapid development of the mobile Internet, many online ride-hailing platforms have emerged that let passengers request taxis through mobile apps. These platforms have significantly reduced both the time that taxis sit idle and the time that passengers spend waiting, and have improved traffic efficiency. A key component of such platforms is taxi route planning, which dispatches idle taxis to serve potential requests and thereby improves operating efficiency; this problem has received extensive attention in recent years. Existing studies mainly adopt value-based deep reinforcement learning methods such as DQN to solve it. However, owing to the limitations of value-based methods, these approaches cannot be applied to high-dimensional or continuous action spaces. Therefore, an actor-critic method with an action sampling policy, called AS-AC, is proposed to learn an optimal fleet management strategy. AS-AC perceives the distribution of supply and demand in the road network and determines the final dispatch location according to the degree of mismatch between supply and demand. Extensive experiments on New York and Haikou taxi datasets offer insight into the performance of the model and show that it outperforms the comparison approaches.
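The idea of choosing a dispatch location according to the supply-demand mismatch can be illustrated with a small sketch. This is not the AS-AC model described in the abstract (the abstract gives no network architecture, sampling rule, or training procedure); it is a minimal illustration that assumes a hypothetical list of candidate cells, defines mismatch as demand minus supply, and samples a destination from a softmax over those scores. The function names, the `temperature` parameter, and the toy numbers are all assumptions made for illustration.

```python
import math
import random


def mismatch_scores(demand, supply):
    """Demand minus supply per candidate cell; positive means under-served.

    This definition of mismatch is an assumption for illustration only.
    """
    return [d - s for d, s in zip(demand, supply)]


def softmax(scores, temperature=1.0):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp((x - peak) / temperature) for x in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sample_dispatch_cell(demand, supply, rng, temperature=1.0):
    """Sample one candidate cell index, favouring cells with higher mismatch."""
    probs = softmax(mismatch_scores(demand, supply), temperature)
    index = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return index, probs


# Hypothetical toy data: three candidate cells near an idle taxi.
demand = [5.0, 1.0, 3.0]   # assumed predicted requests per cell
supply = [1.0, 2.0, 3.0]   # assumed idle taxis per cell
rng = random.Random(0)     # seeded for reproducibility
cell, probs = sample_dispatch_cell(demand, supply, rng)
```

In this toy setting the first cell has the largest mismatch and therefore the highest sampling probability. Sampling, rather than always picking the maximum, is one simple way to avoid sending every idle taxi to the same cell.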

Key words: mobile information processing systems, spatial-temporal data mining, deep reinforcement learning, ride-hailing route planning, fleet management
