ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development ›› 2020, Vol. 57 ›› Issue (9): 1971-1986. doi: 10.7544/issn1000-1239.2020.20190456

• Artificial Intelligence •



  • Published: 2020-09-01

Interpretation and Understanding in Machine Learning

Chen Kerui1, Meng Xiaofeng2   

  1. 1(School of Computer & Information Engineering, Henan University of Economics and Law, Zhengzhou 450002);2(School of Information, Renmin University of China, Beijing 100872)
  • Online: 2020-09-01
  • Supported by: 
    This work was supported by the National Natural Science Foundation of China (91646203, 61941121, 61532010, 91846204, 61532016, 91746115) and the Young Talents Fund of Henan University of Economics and Law.

Abstract: In recent years, machine learning has developed rapidly. Deep learning in particular has achieved remarkable results in image, speech, and natural language processing, among other fields. The expressive power of machine learning algorithms has improved greatly; however, as model complexity has increased, their interpretability has deteriorated, and interpretability remains an open challenge. Trained models are regarded as black boxes, which seriously hampers the use of machine learning in certain fields, such as medicine and finance. At present, very few surveys address the interpretability of machine learning. This paper therefore classifies, analyzes, and compares the existing interpretability methods. On the one hand, it expounds the definition and measurement of interpretability. On the other hand, according to the object being interpreted, it summarizes and analyzes the various interpretability techniques from three aspects: interpretation of the model, interpretation of prediction results, and interpretation through mimic models. The paper also discusses the challenges and opportunities facing interpretability methods for machine learning, as well as possible directions for future development. The interpretation methods reviewed should also help put many open research questions in perspective.

Key words: machine learning, interpretability, neural network, black box, mimic model