Abstract:
In recent years, local differential privacy (LDP) has received much attention because it requires no trusted third party, involves little interaction, and is highly efficient. However, existing LDP frequency estimation mechanisms for set-valued data ignore differences in the privacy sensitivity of inputs and treat all values equally, which over-protects non-sensitive data and lowers the accuracy of the estimates. To address this problem, the set-valued data utility-optimized local differential privacy (SULDP) model is defined. SULDP considers the case in which the original data domain contains both sensitive and non-sensitive values, and allows the protection of non-sensitive values to be relaxed without weakening the protection of sensitive values. Five frequency estimation mechanisms, suGRR, suGRR-Sample, suRAP, suRAP-Sample, and suWheel, are then proposed under the SULDP model. Theoretical analysis confirms that the proposed mechanisms provide exactly the same protection for sensitive data as local differential privacy mechanisms, while improving accuracy by relaxing the protection of non-sensitive data. Finally, the new mechanisms are evaluated on real and simulated datasets; the experimental results demonstrate that all five mechanisms effectively reduce the estimation error and improve data utility, with suWheel achieving the best performance.