Abstract:
Pre-trained models have mitigated the need for extensive training data and computational resources, and have given rise to a new paradigm of model development and application, which we refer to as the model supply chain. In this framework, a pre-trained model is uploaded by its publisher and subsequently transferred, compressed, and deployed by secondary developers to meet various application needs. This emerging model supply chain introduces additional stages and multiple elements, inevitably leading to security concerns and privacy risks. Despite the widespread adoption of model supply chains, a systematic review of their security threats is still lacking. To address this research gap, we provide a comprehensive overview of the deep learning model supply chain, introducing its concept and fundamental structure. We conduct an in-depth analysis of vulnerabilities at various stages of the model's lifecycle, including design, development, deployment, and usage. Furthermore, we compare and summarize prevalent attack methods and introduce corresponding security protection strategies. To assist readers in effectively utilizing pre-trained models, we review and compare publicly available model repositories. Finally, we discuss potential future research avenues in areas such as security checks, real-time detection, and problem tracing. This survey aims to offer insights for the safer and more reliable development and use of pre-trained models. For the benefit of ongoing research, related papers and open-source code for the methods discussed are accessible at https://github.com/Dipsy0830/DNN-supply-chain-survey.