With the growing popularity of the Robot Operating System (ROS), robotic software systems are becoming increasingly complex, and the computing platforms they run on are increasingly multi-core. In ROS, the order of task execution is determined by the underlying scheduling strategy and the priorities assigned to tasks. Minimizing the overall completion time of all tasks is a central goal of task scheduling in parallel systems. To address this challenge, we propose a reinforcement learning-based task priority assignment method, inspired by recent successes in applying reinforcement learning to combinatorial optimization problems and tailored to the scheduling mechanisms and execution constraints of ROS2 multi-threaded executors. The method extracts the temporal and structural features of a task set represented as a directed acyclic graph (DAG) and efficiently learns the ROS2 scheduling policy through a combination of policy gradient and Monte Carlo tree search (MCTS), yielding a well-founded priority assignment that minimizes the completion time of the DAG parallel tasks. We evaluate the proposed method on randomly generated task graphs in a simulation environment; the results show that it outperforms the benchmark methods. As an offline analysis method, it extends readily to more complex ROS systems and finds a near-optimal solution in an acceptable amount of time.
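To make the optimization objective concrete, the following is a minimal sketch (not the paper's implementation) of how a candidate priority assignment can be scored: given a DAG of task runtimes, it simulates a simplified multi-threaded executor that always dispatches the highest-priority ready task to a free thread, and returns the resulting makespan that the learned policy would seek to minimize. The function name, data layout, and example values are illustrative assumptions, not part of the proposed method.

```python
import heapq

def makespan(deps, runtime, priority, num_threads=2):
    """Simulate priority-ordered dispatch of a DAG on num_threads threads.

    deps:     task -> set of predecessor tasks (must form a DAG)
    runtime:  task -> execution time
    priority: task -> priority (higher value dispatched first)
    Returns the completion time of the last task (the makespan).
    """
    indeg = {t: len(deps[t]) for t in deps}
    succs = {t: [] for t in deps}
    for t, preds in deps.items():
        for p in preds:
            succs[p].append(t)

    # Ready queue ordered by descending priority (heapq is a min-heap).
    ready = [(-priority[t], t) for t in deps if indeg[t] == 0]
    heapq.heapify(ready)
    running = []            # (finish_time, task) of tasks currently executing
    free = num_threads
    now = 0.0
    done = 0

    while done < len(deps):
        # Dispatch ready tasks to free threads, highest priority first.
        while free and ready:
            _, t = heapq.heappop(ready)
            heapq.heappush(running, (now + runtime[t], t))
            free -= 1
        # Advance simulated time to the next task completion.
        now, t = heapq.heappop(running)
        free += 1
        done += 1
        # Release successors whose predecessors have all finished.
        for s in succs[t]:
            indeg[s] -= 1
            if indeg[s] == 0:
                heapq.heappush(ready, (-priority[s], s))
    return now
```

For example, a diamond-shaped DAG (a before b and c, both before d) with two threads overlaps b and c, while a single thread serializes them; a search procedure such as the policy-gradient/MCTS combination described above can use this makespan as its reward signal when comparing priority assignments.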