Empowering System Software with Machine Learning Methods: Challenges, Practice, and Prospects
Abstract: Machine learning methods bring new opportunities for building system software. To fully utilize hardware resources and support emerging applications, the design and implementation of system software must continuously improve and evolve to meet the demands of diverse scenarios. Machine learning methods have the potential to extract patterns from data and automatically optimize system performance. However, applying machine learning to empower system software faces several challenges: designing customized models for system software, obtaining training data of sufficient quantity and quality, reducing the impact of model execution costs on system performance, and preventing model errors from compromising system correctness. We present the practical experience of the Institute of Parallel and Distributed Systems (IPADS) at Shanghai Jiao Tong University in applying machine learning methods to optimize system software, covering index structures, key-value storage systems, and concurrency control protocols, and summarize lessons learned about model design, system integration, and the knowledge practitioners need. We also briefly review related research in China and abroad, and offer prospects and suggestions for this line of research, including collaboration between systems and machine learning experts, building modular and reusable system prototypes, and exploring model optimization techniques dedicated to the systems context, in the hope of providing references for future work.
Keywords:
- machine learning
- system software
- index structure
- key-value store
- concurrency control
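For readers unfamiliar with the learned index structures mentioned in the abstract, the core idea can be sketched as follows. This is a hypothetical, minimal illustration (not the systems built by IPADS): a model, here a simple least-squares linear fit, predicts a key's position in a sorted array, and a local search bounded by the model's maximum observed training error corrects the prediction.

```python
import bisect


class LearnedIndex:
    """Minimal learned-index sketch: a linear model predicts a key's
    position in a sorted array; a search bounded by the model's maximum
    observed error corrects the prediction."""

    def __init__(self, keys):
        self.keys = sorted(keys)
        n = len(self.keys)
        # Fit a least-squares line mapping key -> position.
        xs, ys = self.keys, range(n)
        mean_x = sum(xs) / n
        mean_y = sum(ys) / n
        cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        var = sum((x - mean_x) ** 2 for x in xs) or 1.0
        self.slope = cov / var
        self.intercept = mean_y - self.slope * mean_x
        # The maximum prediction error on the training keys bounds the
        # correction search at lookup time.
        self.err = max(abs(self._predict(k) - i)
                       for i, k in enumerate(self.keys))

    def _predict(self, key):
        pos = int(self.slope * key + self.intercept)
        return min(max(pos, 0), len(self.keys) - 1)

    def lookup(self, key):
        """Return the position of key in the sorted array, or -1 if absent."""
        p = self._predict(key)
        lo = max(0, p - self.err)
        hi = min(len(self.keys), p + self.err + 1)
        i = bisect.bisect_left(self.keys, key, lo, hi)
        if i < len(self.keys) and self.keys[i] == key:
            return i
        return -1
```

The sketch highlights the trade-off the abstract alludes to: the model replaces the traversal of a tree index with an O(1) prediction, but its error must be bounded and corrected so that lookups remain exact, and concurrent or skewed workloads (the focus of the systems surveyed here) require substantially more machinery than this single-model example.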