高级检索

    基于本地差分隐私的空间范围查询方法

    Towards Spatial Range Queries Under Local Differential Privacy

    • 摘要: 基于本地差分隐私的用户数据收集与分析得到了研究者的广泛关注.用户数据的值域大小、编码机制以及扰动机制直接制约着空间范围查询的精度.针对现有编码机制与扰动机制难以有效响应空间范围查询的不足,提出了一种基于网格分割与四分树索引的空间范围查询响应方法GT-R(grid-based quadtree range query),该方法利用网格对用户数据的值域进行均匀分割,产生大小均等的单元格区域.同时利用四分树结构对所有单元格区域进行索引.每个用户结合服务器共享的四分树副本,对所拥有的数据进行编码.借助于编码后的四分树进行层次随机采样,并利用优化随机应答机制对所采层次中的结点进行本地扰动处理.服务器利用每个用户的报告值重构四分树索引结构,并响应空间范围查询.GT-R与现有的编码机制与扰动机制在真实的大规模空间数据集上实验结果表明,其分割精度以及响应范围查询效果优于同类算法.

       

      Abstract: User data collection and analysis with local differential privacy has attracted considerable attention in recent years. The trade-off among the domain size of user data, encoding method, and perturbation method directly constrains the accuracy of spatial range query. To remedy the deficiency caused by the current encoding and perturbating method, this paper employs grid and quadtree to propose an efficient solution, called GT-R, to answer spatial range query. GT-R uses a uniform grid to decompose the data domain, and generate unit sized regions. Based on these regions, an indexing quadtree is built. And then each user encodes his/her data with the quadtree shared from server, and runs the optimal randomized response on each node of the sampled level in the quadtree and reports the sampled level along with the perturbed value. The server accumulates reports from users to reconstruct a quadtree comprising sum of reports from all users. Besides, to boost the accuracy of range query, the server relies on post-processing skill for consistency on the frequency of each node. GT-R method is compared with existing methods on the large-scale real datasets. The experimental results show that GT-R outperforms its competitors, achieves the accurate results of spatial range query.

       

    /

    返回文章
    返回