Privacy-Preserving Logistic Regression on Vertically Partitioned Data

宋蕾; 马春光; 段广晗; 袁琪

doi:10.7544/issn1000-1239.2019.20190414

Abstract

Logistic regression is one of the most important algorithms in machine learning. Conventional training requires collecting the training data centrally, which raises privacy concerns. To address this problem, this paper proposes a privacy-preserving logistic regression scheme for data that is partitioned by feature dimension and distributed vertically between two parties. The two parties train collaboratively and learn a shared model. Each party trains on its own local data set and exchanges only intermediate results, so private data is never exposed directly. An additively homomorphic encryption scheme allows the required operations to be performed on ciphertext, which keeps the computation secure and ensures that, during the interaction, neither party can obtain the other's sensitive information, including its model parameters and training data. The paper also provides a privacy-preserving prediction method that ensures the server hosting the model cannot obtain the private data of the party issuing a query. Analysis and experimental validation show that, with almost no loss of accuracy, the scheme provides privacy protection when both parties are semi-honest.
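The idea of combining intermediate results under additively homomorphic encryption can be illustrated with a minimal sketch. The abstract does not name a specific cryptosystem or protocol, so everything here is an assumption made for illustration: a toy Paillier scheme with tiny primes, a fixed-point `SCALE`, and the hypothetical helpers `keygen`, `encrypt`, `decrypt`, `add`, and `partial_score`. It shows only that two parties' partial linear scores can be summed in ciphertext; it is not the paper's training protocol.

```python
import math
import random

# Toy Paillier cryptosystem (assumed for illustration; the abstract only says
# "additively homomorphic"). The primes are far too small for real use.
def keygen(p=293, q=433):
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # with g = n + 1, L(g^lam mod n^2) = lam mod n
    return n, (lam, mu)

def encrypt(n, m):
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    # Negative messages are encoded modulo n.
    return (pow(n + 1, m % n, n2) * pow(r, n, n2)) % n2

def decrypt(n, priv, c):
    lam, mu = priv
    m = ((pow(c, lam, n * n) - 1) // n * mu) % n
    return m - n if m > n // 2 else m  # decode back to a signed integer

def add(n, c1, c2):
    # Multiplying ciphertexts adds the underlying plaintexts.
    return (c1 * c2) % (n * n)

SCALE = 100  # fixed-point factor, since Paillier works on integers

def partial_score(weights, features):
    # Each party computes w . x on its own feature columns only.
    return round(SCALE * sum(w * x for w, x in zip(weights, features)))

n, priv = keygen()
u_a = partial_score([0.5, -1.2], [1.0, 2.0])  # party A's features
u_b = partial_score([0.8], [3.0])             # party B's features
c_sum = add(n, encrypt(n, u_a), encrypt(n, u_b))
z = decrypt(n, priv, c_sum) / SCALE           # combined linear score

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

prediction = sigmoid(z)
```

In this sketch only the key holder can decrypt the combined score, while the party performing the addition sees ciphertexts only. Which party holds the key, and which intermediate values are exchanged, are design choices that the abstract leaves unspecified.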
