ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development (计算机研究与发展) ›› 2019, Vol. 56 ›› Issue (10): 2071-2096. doi: 10.7544/issn1000-1239.2019.20190540

Special topic: 2019 Special Topic on Cryptography and Intelligent Security Research

• Survey •

Survey on Techniques, Applications and Security of Machine Learning Interpretability

Ji Shouling1, Li Jinfeng1, Du Tianyu1, Li Bo2

  1 Institute of Cyberspace Research and College of Computer Science and Technology, Zhejiang University, Hangzhou 310027
  2 Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana-Champaign, IL, USA 61822
  (lijinfeng0713@zju.edu.cn)
  • Published: 2019-10-16
  • Online: 2019-10-16
  • Supported by:
    National Natural Science Foundation of China (61772466, U1836202); Zhejiang Provincial Natural Science Foundation for Distinguished Young Scholars (LR19F020003); Science and Technology Program of Zhejiang Province (2017C01055)

Abstract: Although machine learning has achieved great success in many domains, its lack of interpretability severely limits its widespread application in real-world tasks, especially security-critical ones. To overcome this weakness, researchers have studied intensively how to improve the interpretability of machine learning models and have proposed a large number of interpretation methods to help users understand the inner working mechanisms of these models. However, research on interpretability is still in its infancy, and many scientific problems remain to be solved. Moreover, researchers approach the problem from different perspectives, attach different meanings to interpretability, and propose interpretation methods with different emphases. To date, the research community lacks a unified understanding of model interpretability, and the overall structure of interpretability research remains unclear. In this survey, we review the interpretability problem in machine learning and systematically summarize and categorize the existing work. We also discuss the potential applications of interpretation techniques, analyze the relationship between interpretability and the security of interpretable machine learning, and examine the current challenges and potential future research directions, with the aim of advancing the research and application of model interpretability.

Key words: machine learning, interpretability, interpretation method, interpretable machine learning, security

CLC Number: