Zhang Xiaojian, Xu Yaxin, Meng Xiaofeng. Approximate k-Nearest Neighbor Queries of Spatial Data Under Local Differential Privacy[J]. Journal of Computer Research and Development, 2022, 59(7): 1610-1624. DOI: 10.7544/issn1000-1239.20210397
1(School of Computer & Information Engineering, Henan University of Economics and Law, Zhengzhou 450002)
2(School of Information, Renmin University of China, Beijing 100872)
Funds: This work was supported by the National Natural Science Foundation of China (62072156, 61502146, 91646203, 91746115, 62002098), the Natural Science Foundation of Henan Province (162300410006), the Key Technologies Research and Development Program of Henan Province (202102310563), the Preferential Financing Program for Scientific and Technological Activities of Overseas Students of Henan Province, the Research Program of the Higher Education of Henan Province (19A520012), and the Young Talents Fund of Henan University of Economics and Law.
Existing local encoding and perturbation mechanisms cannot preserve the distance between neighboring locations when spatial data is collected. To address this problem, we propose two efficient algorithms, PELSH and PULSH, which are based on a locality-sensitive hashing (LSH) structure, to answer k-nearest neighbor (kNN) queries. Both algorithms use multiple hash tables with multiple hash functions to index the locations of all users, and kNN queries are answered from these tables. Using the hash tables copied from the collector, each user first transforms his/her location into a 0/1 string with a Hamming embedding algorithm and then uses LSH to compress the Hamming code. Finally, the user locally runs the GRR and bit perturbation mechanisms on the compressed 0/1 string and reports the perturbed value to the collector. The collector aggregates the reports from all users to reconstruct the hash tables, which are then traversed to answer approximate kNN queries. Furthermore, in PELSH and PULSH, we use privacy budget partition and user partition strategies to design four local algorithms, PELSHB, PELSHG, PULSHB, and PULSHG, to perturb user data. PELSH and PULSH are compared with existing algorithms on large-scale real datasets. The experimental results show that PELSH and PULSH outperform their competitors and achieve accurate results for spatial kNN queries.
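The abstract names GRR as one of the local perturbation steps without spelling it out. The sketch below assumes GRR refers to generalized randomized response as it is commonly defined in the local differential privacy literature: a user reports the true value with probability e^ε / (e^ε + d − 1) and any other value uniformly otherwise, and the collector debiases the observed counts. The function names, the integer encoding of the compressed code as a value in `range(domain_size)`, and the parameters are illustrative assumptions, not the paper's PELSH/PULSH implementation.

```python
import math
import random
from collections import Counter


def grr_probabilities(epsilon, domain_size):
    """Return (p, q): probability of reporting the true value / one specific other value."""
    e = math.exp(epsilon)
    p = e / (e + domain_size - 1)
    q = 1.0 / (e + domain_size - 1)
    return p, q


def grr_perturb(value, domain_size, epsilon, rng):
    """User side: perturb one value from range(domain_size). Requires domain_size >= 2."""
    p, _ = grr_probabilities(epsilon, domain_size)
    if rng.random() < p:
        return value
    # Otherwise pick uniformly among the remaining domain_size - 1 values.
    other = rng.randrange(domain_size - 1)
    return other if other < value else other + 1


def grr_estimate_counts(reports, domain_size, epsilon):
    """Collector side: unbiased estimate of how many users hold each value."""
    p, q = grr_probabilities(epsilon, domain_size)
    n = len(reports)
    observed = Counter(reports)
    return [(observed.get(v, 0) - n * q) / (p - q) for v in range(domain_size)]


if __name__ == "__main__":
    rng = random.Random(0)           # fixed seed only to make the demo repeatable
    true_values = [3] * 6000 + [5] * 4000   # hypothetical bucket ids, not real data
    reports = [grr_perturb(v, 8, 1.0, rng) for v in true_values]
    print(grr_estimate_counts(reports, 8, 1.0))
```

Individual estimates can be negative or exceed the number of users when ε is small; they are unbiased only in expectation, which is why the collector aggregates reports from many users.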