Abstract:
With the widespread adoption of deep neural networks, trained models have become valuable assets and are offered to users as machine learning as a service (MLaaS). However, attackers, acting as a special kind of user, can extract these models while using the services. Given the high value of the models and the risk of theft, service providers are paying increasing attention to the copyright protection of their models. The main technique, adapted from digital watermarking and applied to neural networks, is called neural network watermarking. In this paper, we first analyze this kind of watermarking and present the basic design requirements. We then introduce the related technologies involved in neural network watermarking. Typically, service providers embed watermarks into their neural networks; once they suspect that a model has been stolen, they can verify whether the watermark is present in that model. In some cases, providers can obtain the suspected model and check for the watermark in the model parameters (white-box verification). In other cases, providers cannot acquire the model and can only inspect the input/output pairs of the suspected model (black-box verification). We discuss these watermarking methods and potential attacks against the watermarks from the perspectives of robustness, stealthiness, and security. Finally, we discuss future directions and open challenges.