
Secure and Efficient Federated Learning for Continuous IoV Data Sharing

Le Junqing, Tan Zhouyong, Zhang Di, Liu Gao, Xiang Tao, Liao Xiaofeng

Citation: Le Junqing, Tan Zhouyong, Zhang Di, Liu Gao, Xiang Tao, Liao Xiaofeng. Secure and Efficient Federated Learning for Continuous IoV Data Sharing[J]. Journal of Computer Research and Development, 2024, 61(9): 2199-2212. DOI: 10.7544/issn1000-1239.202330894. CSTR: 32373.14.issn1000-1239.202330894


Corresponding author: Zhang Di (dizhang@cqu.edu.cn)

  • CLC number: TP309; TP181


Funds: The work was supported by the National Key Research and Development Program of China (2022YFB3103500), the National Natural Science Foundation of China (61932006, 62202071, 62302072), the China Postdoctoral Science Foundation (2022M710518, 2022M710520), and the Natural Science Foundation of Chongqing (CSTB2022NSCQ-MSX0358, CSTB2022NSCQ-MSX1217).
    Author Bio:

    Le Junqing: born in 1991. PhD, assistant research fellow. Member of CCF. His main research interests include privacy protection, federated learning, and information security.

    Tan Zhouyong: born in 1997. Master's degree. His main research interests include privacy protection and federated learning.

    Zhang Di: born in 1993. PhD, assistant research fellow. Her main research interests include privacy protection, blockchain, and cryptography.

    Liu Gao: born in 1991. PhD, assistant research fellow. His main research interests include the Internet of vehicles, blockchain, and privacy protection.

    Xiang Tao: born in 1980. PhD, professor, PhD supervisor. Member of CCF. His main research interests include privacy protection, information security, and machine learning.

    Liao Xiaofeng: born in 1964. PhD, professor, PhD supervisor. Member of CCF. His main research interests include neural networks, privacy protection, and cryptography.

  • Abstract:

    The combination of the Internet of vehicles (IoV) and artificial intelligence (AI) has driven the rapid development of autonomous vehicles. Sharing the IoV data distributed across different vehicles to train AI models enables more efficient and reliable intelligent driving services. Autonomous vehicles can continuously gather IoV data such as real-time vehicle information, road images, and videos through onboard cameras and sensors. These data are then used to optimize and update intelligent traffic models, compensating for the loss of model accuracy caused by changes in IoV data. We propose a secure and efficient federated learning scheme, named SEFL, for continuous data sharing in the IoV environment. It addresses inefficient data collection, catastrophic forgetting caused by dynamic data updates, and privacy leakage through model training parameters. In SEFL, each vehicle uses the global model to collect only the IoV data that the model recognizes poorly, and takes the output with the highest probability as the label of each such sample, so that training samples are collected automatically. Because vehicle storage is limited, newly collected samples overwrite old ones, so the data on each vehicle change dynamically and traditional fine-tuning is prone to catastrophic forgetting. SEFL therefore includes a training algorithm based on dual knowledge distillation that ensures the model learns the knowledge of every sample and keeps its accuracy high. In addition, to prevent the model parameters exchanged between vehicles and the server from leaking user privacy, an adaptive differential privacy strategy is proposed to achieve strong client-level privacy protection while minimizing the negative impact of differential privacy noise on the accuracy of the global model. Finally, a security analysis is given and the performance of SEFL is evaluated on the traffic sign dataset GTSRB and a vehicle identification dataset. The analysis and experimental results indicate that SEFL provides reliable, strong privacy protection and an efficient collection strategy, and that it outperforms existing federated learning-based algorithms in model accuracy.
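    The collection step described in the abstract can be illustrated with a minimal sketch. It assumes a single confidence cutoff (`threshold`, a hypothetical parameter) and a fixed-size buffer in which new samples overwrite the oldest ones; the names `pseudo_label_if_uncertain` and `SampleBuffer` are illustrative, and the abstract does not specify the exact collection rule used in SEFL.

```python
import math
from collections import deque


def softmax(logits):
    """Turn raw model scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pseudo_label_if_uncertain(logits, threshold=0.9):
    """Return the top class as a pseudo-label when confidence is low.

    `threshold` is a hypothetical cutoff, not a value from the paper.
    Returns None when the model is already confident, so the sample is skipped.
    """
    probs = softmax(logits)
    top = max(range(len(probs)), key=probs.__getitem__)
    return top if probs[top] < threshold else None


class SampleBuffer:
    """Fixed-capacity store in which new samples overwrite the oldest ones."""

    def __init__(self, capacity):
        self.items = deque(maxlen=capacity)

    def offer(self, sample, logits, threshold=0.9):
        """Store (sample, pseudo-label) if the model is uncertain about it."""
        label = pseudo_label_if_uncertain(logits, threshold)
        if label is not None:
            self.items.append((sample, label))
        return label
```

    Using `deque(maxlen=...)` models the limited on-vehicle storage: once the buffer is full, each newly collected sample silently evicts the oldest one, which is the situation that makes plain fine-tuning prone to forgetting.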

  • Figure 1. Architecture of efficient and secure federated learning

    Figure 2. Automatic sample collection by vehicles based on the global model

    Figure 3. Local independent training based on knowledge distillation

    Figure 4. Global model updates on vehicle clients based on knowledge distillation

    Figure 5. Model inversion attack

    Figure 6. Model hyperparameter configuration

    Figure 7. Comparison of collection methods and changes in collection range on GTSRB

    Figure 8. Local independent training and training based on dual knowledge distillation

    Figure 9. Comparison of different local training methods

    Figure 10. Comparison of accuracy and information loss under the same level of privacy protection

    Figure 11. Model parameter replacement during training
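    Figures 3, 4, and 8 refer to training based on knowledge distillation. As background, the sketch below shows the standard single-teacher distillation loss (a hard-label cross-entropy term plus a temperature-softened KL term). It is a generic illustration, not the SEFL algorithm: how SEFL combines its two distillation stages is not stated on this page, and `temperature` and `alpha` are hypothetical hyperparameters.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax over logits divided by a temperature (numerically stable)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions with q > 0 wherever p > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Weighted sum of hard-label cross-entropy and soft-target KL.

    The KL term is scaled by temperature**2, the usual correction that keeps
    its gradient magnitude comparable to the hard-label term.
    """
    hard = -math.log(softmax(student_logits)[label])
    soft = kl_divergence(softmax(teacher_logits, temperature),
                         softmax(student_logits, temperature))
    return alpha * hard + (1.0 - alpha) * temperature ** 2 * soft
```

    When the student already matches the teacher, the soft term vanishes and only the weighted cross-entropy remains; the more the student drifts from the teacher's outputs, the larger the penalty, which is what lets a previous model act as a memory of samples that are no longer stored.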

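    Table 1. Comparison of Related Work

    Property | Refs. [11−15] | Refs. [16−18] | Refs. [23−25] | Refs. [26−28] | SEFL (this paper)
    Strong privacy protection: × ×
    Efficient learning: ×
    High accuracy: × ×
    Note: × indicates not supported; √ indicates supported.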
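    Table 1 lists strong privacy protection as one of the compared properties, and the abstract describes client-level differential privacy. The sketch below shows the common clip-then-add-Gaussian-noise aggregation used for client-level differential privacy in federated averaging. It assumes a fixed clipping bound; the adaptive strategy proposed in SEFL is not specified on this page, and `clip_norm` and `noise_multiplier` are hypothetical parameters.

```python
import math
import random


def clip_update(update, clip_norm):
    """Scale a client's update so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, clip_norm / norm) if norm > 0.0 else 1.0
    return [x * scale for x in update]


def private_average(updates, clip_norm, noise_multiplier, rng):
    """Average clipped client updates and add Gaussian noise per coordinate.

    Each client changes the clipped sum by at most clip_norm, so noise with
    standard deviation noise_multiplier * clip_norm on the sum corresponds to
    noise_multiplier * clip_norm / n on the average of n clients.
    """
    n = len(updates)
    dim = len(updates[0])
    clipped = [clip_update(u, clip_norm) for u in updates]
    std = noise_multiplier * clip_norm / n
    return [sum(u[i] for u in clipped) / n + rng.gauss(0.0, std)
            for i in range(dim)]
```

    A smaller clipping bound lowers the noise needed for a given privacy level but discards more of each update, which is the accuracy trade-off that an adaptive strategy aims to balance.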
  • [1] Yang Fangchun, Wang Shangguang, Li Jinglin, et al. An overview of Internet of vehicles[J]. China Communications, 2014, 11(10): 1−15

    [2] Muhammad K, Ullah A, Lloret J, et al. Deep learning for safe autonomous driving: Current challenges and future directions[J]. IEEE Transactions on Intelligent Transportation Systems, 2020, 22(7): 4316−4336

    [3] Liu Zhanwen, Zhao Xiangmo, Li Qiang, et al. A traffic sign recognition method based on graph models and convolutional neural networks[J]. Journal of Transportation Engineering, 2016, 16(5): 122−131 (in Chinese) doi: 10.3969/j.issn.1671-1637.2016.05.014

    [4] Konečný J, McMahan H B, Ramage D, et al. Federated optimization: Distributed machine learning for on-device intelligence[J]. arXiv preprint, arXiv: 1610.02527, 2016

    [5] McMahan B, Moore E, Ramage D, et al. Communication-efficient learning of deep networks from decentralized data[C]//Proc of the 20th Int Conf on Artificial Intelligence and Statistics. Cambridge, MA: MIT, 2017: 1273−1282

    [6] Zhao Yue, Li Meng, Lai Liangzhen, et al. Federated learning with non-IID data[J]. arXiv preprint, arXiv: 1806.00582, 2018

    [7] Bonawitz K, Eichner H, Grieskamp W, et al. Towards federated learning at scale: System design[C/OL]//Proc of the 2nd Conf on Machine Learning and Systems. 2019[2024-03-12]. https://proceedings.mlsys.org/paper_files/paper/2019/file/7b770da633baf74895be22a8807f1a8f-Paper.pdf

    [8] Liu Biao, Zhang Fangjiao, Wang Wenxin, et al. Byzantine-robust federated learning algorithm based on matrix mapping[J]. Journal of Computer Research and Development, 2021, 58(11): 2416−2429 (in Chinese) doi: 10.7544/issn1000-1239.2021.20210633

    [9] Manias M D, Shami A. Making a case for federated learning in the Internet of vehicles and intelligent transportation systems[J]. IEEE Network, 2021, 35(3): 88−94 doi: 10.1109/MNET.011.2000552

    [10] Xing Ling, Zhao Pengcheng, Gao Jianping, et al. A survey of the social Internet of vehicles: Secure data issues, solutions, and federated learning[J]. IEEE Intelligent Transportation Systems Magazine, 2022, 15(2): 70−84

    [11] Xie Kan, Zhang Zhe, Li Bo, et al. Efficient federated learning with spike neural networks for traffic sign recognition[J]. IEEE Transactions on Vehicular Technology, 2022, 71(9): 9980−9992 doi: 10.1109/TVT.2022.3178808

    [12] Stergiou K D, Psannis K E, Vitsas V, et al. A federated learning approach for enhancing autonomous vehicles image recognition[C]//Proc of the 4th Int Conf on Computer Communication and the Internet. Berlin: Springer, 2022: 87−90

    [13] Liang Feiyuan, Yang Qinglin, Liu Ruiqi, et al. Semi-synchronous federated learning protocol with dynamic aggregation in Internet of vehicles[J]. IEEE Transactions on Vehicular Technology, 2022, 71(5): 4677−4691 doi: 10.1109/TVT.2022.3148872

    [14] Zhou Xiaokang, Liang Wei, She Jinhua, et al. Two-layer federated learning with heterogeneous model aggregation for 6G supported Internet of vehicles[J]. IEEE Transactions on Vehicular Technology, 2021, 70(6): 5308−5317 doi: 10.1109/TVT.2021.3077893

    [15] Zhou Hongliang, Zheng Yifeng, Huang Hejiao, et al. Toward robust hierarchical federated learning in Internet of vehicles[J]. IEEE Transactions on Intelligent Transportation Systems, 2023, 24(5): 5600−5614 doi: 10.1109/TITS.2023.3243003

    [16] Li Zhizhong, Hoiem D. Learning without forgetting[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 40(12): 2935−2947

    [17] Kirkpatrick J, Pascanu R, Rabinowitz N, et al. Overcoming catastrophic forgetting in neural networks[J]. Proceedings of the National Academy of Sciences, 2017, 114(13): 3521−3526 doi: 10.1073/pnas.1611835114

    [18] Rebuffi S A, Kolesnikov A, Sperl G, et al. iCaRL: Incremental classifier and representation learning[C]//Proc of the 30th IEEE Conf on Computer Vision and Pattern Recognition. Piscataway, NJ: IEEE, 2017: 2001−2010

    [19] Fredrikson M, Jha S, Ristenpart T. Model inversion attacks that exploit confidence information and basic countermeasures[C]//Proc of the 22nd ACM SIGSAC Conf on Computer and Communications Security. New York: ACM, 2015: 1322−1333

    [20] Shokri R, Stronati M, Song Congzheng, et al. Membership inference attacks against machine learning models[C]//Proc of the 38th IEEE Symp on Security and Privacy. Piscataway, NJ: IEEE, 2017: 3−18

    [21] He Zecheng, Zhang Tianwei, Lee R B. Model inversion attacks against collaborative inference[C]//Proc of the 35th Annual Computer Security Applications Conf. New York: ACM, 2019: 148−162

    [22] Zhou Chunyi, Chen Dawei, Wang Shang, et al. Research progress and challenges in privacy and security attacks on distributed deep learning[J]. Journal of Computer Research and Development, 2021, 58(5): 927−943 (in Chinese) doi: 10.7544/issn1000-1239.2021.20200966

    [23] Bonawitz K, Ivanov V, Kreuter B, et al. Practical secure aggregation for privacy-preserving machine learning[C]//Proc of the 24th ACM SIGSAC Conf on Computer and Communications Security. New York: ACM, 2017: 1175−1191

    [24] Fang Chen, Guo Yuanbo, Wang Na, et al. Highly efficient federated learning with strong privacy preservation in cloud computing[J]. Computers & Security, 2020, 96: 101889

    [25] Lu Yunlong, Huang Xiaohong, Zhang Ke, et al. Blockchain empowered asynchronous federated learning for secure data sharing in Internet of vehicles[J]. IEEE Transactions on Vehicular Technology, 2020, 69(4): 4298−4311 doi: 10.1109/TVT.2020.2973651

    [26] Shokri R, Shmatikov V. Privacy-preserving deep learning[C]//Proc of the 22nd ACM SIGSAC Conf on Computer and Communications Security. New York: ACM, 2015: 1310−1321

    [27] McMahan H B, Ramage D, Talwar K, et al. Learning differentially private recurrent language models[J]. arXiv preprint, arXiv: 1710.06963, 2017

    [28] Wei Kang, Li Jun, Ding Ming, et al. Federated learning with differential privacy: Algorithms and performance analysis[J]. IEEE Transactions on Information Forensics and Security, 2020, 15: 3454−3469 doi: 10.1109/TIFS.2020.2988575

    [29] Kairouz P, McMahan H B, Avent B, et al. Advances and open problems in federated learning[J]. Foundations and Trends® in Machine Learning, 2021, 14(1/2): 1−210

    [30] Dwork C, Roth A. The algorithmic foundations of differential privacy[J]. Foundations and Trends® in Theoretical Computer Science, 2014, 9(3/4): 211−407

    [31] Dwork C. Differential privacy: A survey of results[C]//Proc of the 5th Int Conf on Theory and Applications of Models of Computation. Berlin: Springer, 2008: 1−19

    [32] Abadi M, Chu A, Goodfellow I, et al. Deep learning with differential privacy[C]//Proc of the 23rd ACM SIGSAC Conf on Computer and Communications Security. New York: ACM, 2016: 308−318

    [33] Amin K, Kulesza A, Munoz A, et al. Bounding user contributions: A bias-variance trade-off in differential privacy[C]//Proc of the 36th Int Conf on Machine Learning. New York: ACM, 2019: 263−271

    [34] Andrew G, Thakkar O, McMahan B, et al. Differentially private learning with adaptive clipping[J]. Advances in Neural Information Processing Systems, 2021, 34: 17455−17466

    [35] Le Junqing, Zhang Di, Lei Xinyu, et al. Privacy-preserving federated learning with malicious clients and honest-but-curious servers[J]. IEEE Transactions on Information Forensics and Security, 2023, 18: 4329−4344 doi: 10.1109/TIFS.2023.3295949

    [36] Dwork C, Lei Jing. Differential privacy and robust statistics[C]//Proc of the 41st Annual ACM Symp on Theory of Computing. New York: ACM, 2009: 371−380

    [37] Gou Jianping, Yu Baosheng, Maybank S J, et al. Knowledge distillation: A survey[J]. International Journal of Computer Vision, 2021, 129(6): 1789−1819 doi: 10.1007/s11263-021-01453-z

    [38] Fan Liyue, Li Xiong. An adaptive approach to real-time aggregate monitoring with differential privacy[J]. IEEE Transactions on Knowledge and Data Engineering, 2014, 26(9): 2094−2106 doi: 10.1109/TKDE.2013.96

Publication history
  • Received:  2023-10-31
  • Revised:  2024-05-19
  • Published online:  2024-06-12
  • Issue date:  2024-08-31
