Abstract:
While machine learning has achieved great success across various domains, its lack of interpretability has limited its adoption in real-world tasks, especially security-critical ones. To overcome this crucial weakness, intensive research on improving the interpretability of machine learning models has emerged, and a plethora of interpretation methods have been proposed to help end users understand their inner workings. However, research on model interpretation is still in its infancy, and many open scientific problems remain. Moreover, different researchers approach the interpretation problem from different perspectives, give different definitions of interpretability, and propose interpretation methods with different emphases. To date, the research community still lacks a comprehensive understanding of interpretability as well as a scientific guide for research on model interpretation. In this survey, we review the interpretability problem in machine learning and provide a systematic summary and taxonomy of existing work. We further discuss potential applications of interpretation techniques, analyze the relationship between interpretability and the security of interpretable machine learning, and examine current research challenges and promising future directions, aiming to provide a foundation that facilitates future research on and application of model interpretability.