利用深度学习辅助算法设计——以排课问题与旅行商问题为例

隋京言; 丁诗哲; 夏博洋; 刘睿之; 丁振欣; 徐黎明; 王睿; 黄旭林; 李捷鹏; 张海仓; 于春功; 王超; 卜东波

doi:10.7544/issn1000-1239.202660017

利用深度学习辅助算法设计——以排课问题与旅行商问题为例

Using Deep Learning to Assist Algorithm Design—Taking Course Scheduling Problem and Traveling Salesman Problem as Examples

摘要

摘要: 深度学习在近年来表现出强大的模式挖掘与结构推断能力，为算法设计这一高度依赖经验与直觉的领域提供了新的可能性。聚焦“无监督条件下自主学习可解释性算法规则”这一核心目标，以排课问题和旅行商问题这2个经典组合优化问题为典型案例，提出2种基于深度强化学习的方法Neural-Greedy与Neural-3-OPT。针对排课问题，Neural-Greedy可从大量问题实例中自主学出3条最优贪心规则（优先选择最早下课的课程，优先选择最晚上课的课程及二者的混合策略）；其核心机理是通过参数调控与神经元协同实现时间节点关键特征的精准聚焦，进而完成多规则同步习得。针对旅行商问题，Neural-3-OPT可从大量问题实例中自主学出高效的3-opt启发式策略，在TSP20、TSP50和TSP100随机数据集上的最优性差距分别低至0.00%、0.08%和0.74%，显著优于工业级求解器OR-Tools，且在真实数据集TSPLIB上保持更优性能；其优势源于“初期精准选优”的智能决策机理，通过优先识别对全局解质量影响显著的低效边、识别适配迭代全程的重连策略，避免单步贪心选择与无效迭代，实现快速收敛与高质量求解。研究表明，深度学习不仅能逼近最优求解性能，更能自主学出具备可解释性的算法规则，为算法设计提供了“数据驱动”新范式，也为“AI辅助算法设计”的发展提供了重要思路。

Abstract: Deep learning has recently demonstrated strong capabilities in pattern discovery and structural inference, offering new possibilities for algorithm design—a field traditionally reliant on human experience and intuition. We focus on the core objective of automatically learning interpretable algorithmic rules under unsupervised settings, using two representative combinatorial optimization problems as cases: course scheduling problem (CSP) and traveling salesman problem (TSP). We propose two deep reinforcement–based methods, Neural-Greedy and Neural-3-OPT. For CSP, Neural-Greedy autonomously learns three optimal greedy rules from large collections of problem instances: prioritizing courses with the earliest finishing times, the latest starting times, and a hybrid of the two. Its underlying mechanism lies in parameter modulation and neuron cooperation, which precisely focuses on critical temporal features and enables the simultaneous acquisition of multiple rules. For TSP, Neural-3-OPT learns efficient 3-opt heuristic autonomously from large collections of problem instances, achieving optimality gaps as low as 0.00%, 0.08%, and 0.74% on random TSP20, TSP50, and TSP100 datasets, respectively. It significantly outperforms the industrial solver OR-Tools and maintains superior performance on the real-world TSPLIB benchmark. This advantage stems from an intelligent decision mechanism of early-stage precise selection, which prioritizes inefficient edges with substantial impact on global solution quality and identifies reconnection strategies adaptive to the entire iterative process, thereby avoiding myopic greedy choices and ineffective iterations to achieve fast convergence and high-quality solutions. Overall, our results demonstrate that deep learning can not only approach optimal solution quality but also autonomously learn interpretable algorithmic rules, offering a data-driven paradigm for algorithm design and providing important insights into the development of AI-assisted algorithm design.

HTML全文

参考文献(38)

施引文献

资源附件(0)