高级检索
    张啸剑, 徐雅鑫, 孟小峰. 基于本地化差分隐私的空间数据近似k-近邻查询[J]. 计算机研究与发展, 2022, 59(7): 1610-1624. DOI: 10.7544/issn1000-1239.20210397
    引用本文: 张啸剑, 徐雅鑫, 孟小峰. 基于本地化差分隐私的空间数据近似k-近邻查询[J]. 计算机研究与发展, 2022, 59(7): 1610-1624. DOI: 10.7544/issn1000-1239.20210397
    Zhang Xiaojian, Xu Yaxin, Meng Xiaofeng. Approximate k-Nearest Neighbor Queries of Spatial Data Under Local Differential Privacy[J]. Journal of Computer Research and Development, 2022, 59(7): 1610-1624. DOI: 10.7544/issn1000-1239.20210397
    Citation: Zhang Xiaojian, Xu Yaxin, Meng Xiaofeng. Approximate k-Nearest Neighbor Queries of Spatial Data Under Local Differential Privacy[J]. Journal of Computer Research and Development, 2022, 59(7): 1610-1624. DOI: 10.7544/issn1000-1239.20210397

    基于本地化差分隐私的空间数据近似k-近邻查询

    Approximate k-Nearest Neighbor Queries of Spatial Data Under Local Differential Privacy

    • 摘要: 针对现有本地编码机制与本地扰动机制在收集空间数据时不具有保距性的问题,提出了基于局部敏感Hash结构(locality-sensitive hashing, LSH)的近似k-近邻(k nearest neighbor, kNN)查询算法PELSH与PULSH.这2种算法利用具有多Hash函数的多Hash表对所有用户位置数据进行索引,结合多Hash表结构响应近似kNN查询.每个用户结合收集者所共享的多Hash表副本,将自身位置数据以汉明空间嵌入方式编码成0/1串.借助LSH结构对0/1串进行Hash压缩,并利用GRR机制与按位扰动机制对压缩后的0/1串进行本地处理.收集者利用每个用户的报告值重构多Hash表索引结构,遍历多Hash表响应空间近似kNN查询.为了有效地利用LSH索引结构的特点,PELSH和PULSH算法结合隐私预算分割与用户分组策略来重构多Hash表结构,基于这2种策略设计了4种本地扰动算法PELSHB,PELSHG,PULSHB和PULSHG.PELSH和PULSH算法与现有的近似kNN查询算法在真实的大规模空间数据集上的实验结果表明,所设计的近似空间kNN查询效果优于同类算法.

       

      Abstract: Aiming at the problem that the existing local encoding mechanisms and perturbation mechanisms cannot preserve the distance between neighbor locations when collecting the spatial data, we propose two efficient algorithms, called PELSH and PULSH, which are based on locality-sensitive hashing(LSH) structure, to respond kNN queries. The two algorithms employ multiple hashing tables with multiple hashing functions to index the locations of all users, on which are relied to answer kNN queries. Based on the hashing tables copied from the collector, each user firstly transforms his/her location into 0/1 string with Hamming embedding algorithm and then uses LSH to compress the Hamming code. Finally, the user locally runs GRR and bit perturbation mechanism on the compressed 0/1 string and reports the perturbed value to the collector. The collector accumulates the reports from all users to reconstruct hashing tables that are traveled to get the approximate kNN queries. Furthermore, in PELSH and PULSH, we use privacy budget partition and user partition strategies to design four local algorithms, called PELSHB, PELSHG, PULSHB, and PULSHG to perturb user data. PELSH and PULSH are compared with existing algorithms in the large-scale real datasets. The experimental results show PELSH and PULSH outperform their competitors, achieve the accurate results of spatial kNN queries.

       

    /

    返回文章
    返回