Robust Source Anonymous Federated Learning Shuffle Protocol in IoT

陈景雪; 高克寒; 周尔强; 秦臻

doi:10.7544/issn1000-1239.202330393

Abstract: With the rapid development of Internet of things (IoT) and artificial intelligence (AI) technologies, IoT devices collect large amounts of data. AI techniques such as machine learning and deep learning can be used to train models on these data, and a well-trained model is an important component for analyzing the network environment and improving quality of service (QoS) in the IoT. However, most data providers (IoT end users) are reluctant to share personal data directly with any third party for academic research or business analysis, because personal data contain private and sensitive information. Security and privacy protection in the IoT is therefore an important research direction. Federated learning (FL) allows multiple IoT end users, acting as training participants, to keep their data locally and upload only their locally trained models to a parameter server for aggregation, which protects each participant's data privacy. However, FL still faces security challenges. Specifically, it faces two main types of attack: inference attacks and poisoning attacks. To resist inference attacks and detect poisoning attacks at the same time, we propose Re-Shuffle, a new source-anonymous data shuffle scheme. Re-Shuffle uses an oblivious transfer protocol to let participants upload their models anonymously, so that during poisoning attack detection the parameter server obtains only the participants' original local models and does not learn which participant each model came from. In addition, to better suit the IoT environment, Re-Shuffle adopts a secret sharing mechanism that preserves the original form of the gradient data while solving the participant dropout problem of traditional shuffle protocols. Re-Shuffle thus preserves both the originality and the privacy of local models, so poisoning attacks can be checked while privacy is protected. Finally, we give a security proof, evaluate the detection performance of the scheme, and measure the computation overhead of two poisoning attack detection schemes running under Re-Shuffle. The results show that Re-Shuffle can provide privacy protection for poisoning attack detection schemes at an acceptable cost.
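
As a rough illustration of why a secret sharing mechanism can tolerate participant dropout, the following minimal Python sketch uses Shamir's (t, n) threshold scheme. This is an assumption made for illustration only: the abstract does not state which secret sharing scheme Re-Shuffle uses, and the names `make_shares`, `reconstruct`, and `PRIME` are hypothetical.

```python
import secrets

# Field modulus: the Mersenne prime 2^127 - 1 (an illustrative choice,
# not a parameter taken from the paper).
PRIME = 2**127 - 1


def make_shares(secret, threshold, num_shares):
    """Split `secret` into `num_shares` shares; any `threshold` of them recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret must lie in [0, PRIME)")
    if not 1 <= threshold <= num_shares:
        raise ValueError("need 1 <= threshold <= num_shares")
    # Random polynomial of degree threshold - 1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, num_shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation modulo PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def reconstruct(shares):
    """Recover the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


if __name__ == "__main__":
    shares = make_shares(123456789, threshold=3, num_shares=5)
    # Two of the five share holders drop out; three shares still suffice.
    surviving = [shares[0], shares[2], shares[4]]
    assert reconstruct(surviving) == 123456789
```

With a threshold of 3 out of 5, the shared value remains recoverable after up to two share holders go offline, which is the kind of dropout tolerance the abstract attributes to the secret sharing mechanism.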
