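• China Excellent Science and Technology Journal
  • CCF-recommended Class A Chinese-language journal
  • T1-class high-quality science and technology journal in the computing field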
Zhang Xiaojian, Xu Yaxin, Fu Nan, Meng Xiaofeng. Towards Private Key-Value Data Collection with Histogram[J]. Journal of Computer Research and Development, 2021, 58(3): 624-637. DOI: 10.7544/issn1000-1239.2021.20200319

Towards Private Key-Value Data Collection with Histogram

Funds: This work was supported by the National Natural Science Foundation of China (61502146, 91646203, 91746115, 62072156), the Natural Science Foundation of Henan (162300410006), the Key Technologies Research and Development Program of Henan Province (202102310563), and the Young Talents Fund of Henan University of Economics and Law.
  • Published Date: February 28, 2021
  • Abstract: Recently, the collection and analysis of user data under local differential privacy has been extended to key-value data. The accuracy of collecting and analyzing such data is directly constrained by the size and sparsity of the key domain and by the choice of perturbation method. To address the limitations caused by domain size and perturbation method, this paper uses histogram techniques to propose an efficient solution, called HISKV, for collecting key-value data. HISKV first uses a user-grouping strategy and part of the privacy budget to find the optimal truncation length, and each user truncates his or her key-value set to that length. Then, from the truncated set, each user samples one key-value pair and processes it by discretization and perturbation. To perturb key-value data efficiently, HISKV introduces a novel mechanism, named LRR_KV, which assigns different perturbation probabilities to different keys. Each user applies LRR_KV to add noise to the sampled pair and sends the resulting report to a collector. Based on the reports from all users, the collector estimates the frequency of each key and the mean of its values. To evaluate the utility of HISKV, we first analyze the unbiasedness, variance, and error bound of LRR_KV theoretically, and then run experiments on real and synthetic datasets to compare different methods. The experimental results show that HISKV outperforms its competitors.
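
The user-side steps named in the abstract (truncate, sample one pair, discretize, perturb) and the collector-side mean estimate can be illustrated with a short Python sketch. This is not the paper's LRR_KV mechanism: the abstract does not specify its key-dependent probabilities, so standard binary randomized response on the value is swapped in, the key is reported unperturbed (which is not private), and values are assumed to lie in [-1, 1]. All function names and parameters here are hypothetical.

```python
import math
import random


def truncate(pairs, length):
    """Keep at most `length` key-value pairs (the truncation step)."""
    return pairs[:length]


def discretize(value, rng):
    """Round a value in [-1, 1] to +1 or -1 without changing its expectation."""
    return 1 if rng.random() < (1 + value) / 2 else -1


def keep_probability(epsilon):
    """Probability of reporting the true bit under binary randomized response."""
    return math.exp(epsilon) / (1 + math.exp(epsilon))


def perturb(bit, epsilon, rng):
    """Binary randomized response: keep the bit with probability p, else flip it."""
    return bit if rng.random() < keep_probability(epsilon) else -bit


def user_report(pairs, length, epsilon, rng):
    """Truncate, sample one pair, then discretize and perturb its value.

    Simplification: the key is sent as-is. A real mechanism such as
    LRR_KV would perturb the key as well. `pairs` must be non-empty.
    """
    key, value = rng.choice(truncate(pairs, length))
    return key, perturb(discretize(value, rng), epsilon, rng)


def estimate_mean(reported_bits, epsilon):
    """Debias the average of the reported +/-1 bits to estimate the true mean."""
    p = keep_probability(epsilon)
    return sum(reported_bits) / len(reported_bits) / (2 * p - 1)
```

For example, a collector could gather `user_report(pairs, 3, 2.0, random.Random())` from many users and pass the reported bits for one key to `estimate_mean`. Dividing by `2 * p - 1` is what makes the estimate unbiased, because each reported bit has expectation `(2 * p - 1)` times the true value.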
