Citation: | Fu Nan, Ni Weiwei, Jiang Zepeng, Hou Lihe, Zhang Dongyue, Zhang Ruyu. Directed Graph Clustering Algorithm with Edge Local Differential Privacy[J]. Journal of Computer Research and Development, 2025, 62(1): 256-268. DOI: 10.7544/issn1000-1239.202330193 |
Graph clustering based on local differential privacy has become a hot research topic in recent years. Most existing solutions are mainly realized through modular aggregation using bit vector technology. The linear relationship between the amount of noise and the vector dimension makes balancing clustering quality and privacy challenges. Aiming at the above problems, a directed graph clustering algorithm, DGC-LDP (directed graph clustering under LDP), is proposed based on edge local differential privacy (Edge-LDP). Concretely, the direct encoding method replaces the bit vector encoding method to reduce the amount of data in privacy processing. Meanwhile, a dynamic perturbation mechanism is designed based on the graph structure to balance privacy and statistical utility by adaptively adding noise. Then, according to the individual information uploaded by the terminal, the collector extracts the similarity information between nodes and designs a node aggregation algorithm based on the silhouette coefficient measurement model to generate clusters. Finally, an iterative mechanism is built between the terminal and the collector, and the collector iteratively optimizes the node aggregation form based on the statistical information fed back by the mechanism to achieve high-quality clustering. Theoretical analysis and experimentation on real-world datasets demonstrate that our proposed algorithm can obtain desirable clustering results while satisfying Edge-LDP.
[1] |
Duchi J C, Jordan M I, Wainwright M J. Local privacy and statistical minimax rates[C] // Proc of the 54th IEEE Annual Symp on Foundations of Computer Science. Piscataway, NJ: IEEE, 2013: 429−438
|
[2] |
叶青青,孟小峰,朱敏杰,等. 本地化差分隐私研究综述[J]. 软件学报,2017,29(7):1981−2005
Ye Qingqing, Meng Xiaofeng, Zhu Minjie, et al. Survey on local differential privacy[J]. Journal of Software, 2017, 29(7): 1981−2005 (in Chinese)
|
[3] |
Li Weiqing, Chen Hongyu, Zhao Ruifeng, et al. A federated recommendation system based on local differential privacy clustering[C] // Proc of the 7th IEEE Smart World, Ubiquitous Intelligence & Computing, Advanced & Trusted Computing, Scalable Computing & Communications, Int of People and Smart City Innovation. Piscataway, NJ: IEEE, 2021: 364−369
|
[4] |
Liu Gaoyuan, Tang Peng, Hu Chengyu, et al. Multi-dimensional data publishing with local differential privacy[C] //Proc of the 26th Int Conf on Extending Database Technology. Berlin: Springer, 2023: 183−194
|
[5] |
张少波,原刘杰,毛新军,等. 基于本地差分隐私的K-modes聚类数据隐私保护方法[J]. 电子学报,2022,50(9):2181−2188
Zhang Shaobo, Yuan Liujie, Mao Xinjun, et al. Privacy protection method for K-modes clustering data with local differential privacy[J]. Journal of Acta Electronica Sinica, 2022, 50(9): 2181−2188 (in Chinese)
|
[6] |
朱素霞 ,王蕾 ,孙广路 . 满足本地差分隐私的分类变换扰动机制[J]. 计算机研究与发展,2022,59(2):430-439
Zhu Suxia, Wang Lei, Sun Guanglu. A perturbation mechanism for classified transformation satisfying local differential privacy [J]. Journal of Computer Research and Development, 2022, 59(2): 430−439 (in Chinese)
|
[7] |
Dwork C. Differential privacy[C] // Proc of the 33rd Int Colloquium on Automata, Languages, and Programming. Berlin: Springer, 2006: 1−12
|
[8] |
Ye Qingqing, Hu Haibo, Au M H, et al. Towards locally differentially private generic graph metric estimation[C] // Proc of the 36th IEEE Int Conf on Data Engineering (ICDE). Piscataway, NJ: IEEE, 2020: 1922−1925
|
[9] |
Qin Zhan, Yu Ting, Yang Yin, et al. Generating synthetic decentralized social graphs with local differential privacy[C] //Proc of the 24th ACM SIGSAC Conf on Computer and Communications Security. New York: ACM, 2017: 425−438
|
[10] |
Zhang Yuxuan, Wei Jianghong, Zhang Xiaojian, et al. A two-phase algorithm for generating synthetic graph under local differential privacy[C] //Proc of the 8th Int Conf on Communication and Network Security. New York: ACM, 2018: 84−89
|
[11] |
Wei Chengkun, Ji Shouling, Liu Changchang, et al. AsgLDP: Collecting and generating decentralized attributed graphs with local differential privacy[J]. IEEE Transactions on Information Forensics and Security, 2020, 15: 3239−3254 doi: 10.1109/TIFS.2020.2985524
|
[12] |
Yang Jingyu, Ma Xuebin, Bai Xiangyu, et al. Graph publishing with local differential privacy for hierarchical social networks[C] // Proc of the 10th IEEE Int Conf on Electronics Information and Emergency Communication (ICEIEC). Piscataway, NJ: IEEE, 2020: 123−126
|
[13] |
Sun Haipei, Xiao Xiaokui, Khalil I, et al. Analyzing subgraph statistics from extended local views with decentralized differential privacy[C] //Proc of the 26th ACM SIGSAC Conf on Computer and Communications Security. New York: ACM, 2019: 703−717
|
[14] |
Imola J, Murakami T, Chaudhuri K, Locally differentially private analysis of graph statistics[C] // Proc of the 30th USENIX Security Symp. Berkeley, CA: USENIX Association, 2021: 983−1000
|
[15] |
Liu Zichun, Huang Liusheng, Xu Hongli, et al. PrivAG: Analyzing attributed graph data with local differential privacy[C] // Proc of the 26th IEEE Int Conf on Parallel and Distributed Systems (ICPADS). Piscataway, NJ: IEEE, 2020: 422−429
|
[16] |
Zhang Zhejian. LDPCD: A novel method for locally differentially private community detection [J]. Computational Intelligence and Neuroscience 2022, 2022: 4080047
|
[17] |
Fu Nan, Ni Weiwei, Zhang Sen, et al. GC-NLDP: A graph clustering algorithm with local differential privacy[J]. Computers & Security, 2023, 124: 102967
|
[18] |
Hou Lihe, Ni Weiwei, Zhang Sen, et al. Wdt-SCAN: Clustering decentralized social graphs with local differential privacy[J]. Computers & Security, 2023, 125: 103036
|
[19] |
Shamir A. How to share a secret[J]. Communications of the ACM, 1979, 22(11): 612−613 doi: 10.1145/359168.359176
|
[20] |
Warner S L. Randomized response: A survey technique for eliminating evasive answer bias[J]. Journal of the American Statistical Association, 1965, 60(309): 63−69 doi: 10.1080/01621459.1965.10480775
|
[21] |
Wang Teng, Zhang Xuefeng, Feng Jingyu, et al. A comprehensive survey on local differential privacy toward data statistics and analysis[J]. Sensors, 2020, 20(24): 7030 doi: 10.3390/s20247030
|
[22] |
Yang Mengmeng, Lyu Lingjuan, Zhao Jun, et at. Local differential privacy and its applications: A comprehensive survey [J]. arXiv preprint, arXiv: 2008.03686, 2020
|
[23] |
Wang Tianhao, Blocki J, Li Ninghui, et al. Locally differentially private protocols for frequency estimation[C] //Proc of the 26th USENIX Security Symp. Berkeley, CA: USENIX Association, 2017: 729−745
|
[24] |
Erlingsson Ú, Pihur V, Korolova A. Rappor: Randomized aggregatable privacy-preserving ordinal response[C] //Proc of the 21st ACM SIGSAC Conf on Computer and Communications Security. New York: ACM, 2014: 1054−1067
|
[25] |
Zhang Jun, Xiao Xiaokui, Xie Xing. PrivTree: A differentially private algorithm for hierarchical decompositions [C] //Proc of the 2016 ACM Int Conf on Management of Data. New York: ACM, 2016: 155−170
|
[26] |
Liu Shang, Cao Yang, Murakami T, et al. A crypto-assisted approach for publishing graph statistics with node local differential privacy [J]. arXiv preprint, arXiv: 2209.02231, 2022
|
[27] |
Kairouz P, Bonawitz K, Ramage D. Discrete distribution estimation under local privacy[C] //Proc of the 33rd Int Conf on Machine Learning. New York: ACM, 2016: 2436−2444
|
[28] |
Allard A, Serrano M, Boguñá M. Geometric description of clustering in directed networks [J]. arXiv preprint, arXiv: 2302.09055, 2023
|
[29] |
Rand W M. Objective criteria for the evaluation of clustering methods [J]. Journal of the American Statistical Association, 1971, 66(336): 846−850
|
[30] |
Vinh N X, Epps J, Bailey J. Information theoretic measures for clusterings comparison: Variants, properties, aormalization and correction for chance[J]. Journal of Machine Learning Research, 2010, 11: 2837−2854
|
[1] | Chen Yewang, Shen Lianlian, Zhong Caiming, Wang Tian, Chen Yi, Du Jixiang. Survey on Density Peak Clustering Algorithm[J]. Journal of Computer Research and Development, 2020, 57(2): 378-394. DOI: 10.7544/issn1000-1239.2020.20190104 |
[2] | Zhao Huihui, Zhao Fan, Chen Renhai, Feng Zhiyong. Efficient Index and Query Algorithm Based on Geospatial Big Data[J]. Journal of Computer Research and Development, 2020, 57(2): 333-345. DOI: 10.7544/issn1000-1239.2020.20190565 |
[3] | Xu Zhengguo, Zheng Hui, He Liang, Yao Jiaqi. Self-Adaptive Clustering Based on Local Density by Descending Search[J]. Journal of Computer Research and Development, 2016, 53(8): 1719-1728. DOI: 10.7544/issn1000-1239.2016.20160136 |
[4] | Gong Shufeng, Zhang Yanfeng. EDDPC: An Efficient Distributed Density Peaks Clustering Algorithm[J]. Journal of Computer Research and Development, 2016, 53(6): 1400-1409. DOI: 10.7544/issn1000-1239.2016.20150616 |
[5] | Meng Xiaofeng, Zhang Xiaojian. Big Data Privacy Management[J]. Journal of Computer Research and Development, 2015, 52(2): 265-281. DOI: 10.7544/issn1000-1239.2015.20140073 |
[6] | Liu Yahui, Zhang Tieying, Jin Xiaolong, Cheng Xueqi. Personal Privacy Protection in the Era of Big Data[J]. Journal of Computer Research and Development, 2015, 52(1): 229-247. DOI: 10.7544/issn1000-1239.2015.20131340 |
[7] | Liu Zhuo, Yang Yue, Zhang Jianpei, Yang Jing, Chu Yan, Zhang Zebao. An Adaptive Grid-Density Based Data Stream Clustering Algorithm Based on Uncertainty Model[J]. Journal of Computer Research and Development, 2014, 51(11): 2518-2527. DOI: 10.7544/issn1000-1239.2014.20130869 |
[8] | Xu Min, Deng Zhaohong, Wang Shitong, Shi Yingzhong. MMCKDE: m-Mixed Clustering Kernel Density Estimation over Data Streams[J]. Journal of Computer Research and Development, 2014, 51(10): 2277-2294. DOI: 10.7544/issn1000-1239.2014.20130718 |
[9] | Wang Ning, Li Jie. Two-Tiered Correlation Clustering Method for Entity Resolution in Big Data[J]. Journal of Computer Research and Development, 2014, 51(9): 2108-2116. DOI: 10.7544/issn1000-1239.2014.20131345 |
[10] | Xie Kunwu, Bi Xiaoling, and Ye Bin. Clustering Algorithm of High-Dimensional Data Based on Units[J]. Journal of Computer Research and Development, 2007, 44(9): 1618-1623. |