Citation: | Wang Hanzhi, Yi Lu, Wei Zhewei, Gan Junhao, Yuan Ye, Wen Jirong, Du Xiaoyong. Random-Walk Probability Computation on Dynamic Weighted Graphs[J]. Journal of Computer Research and Development, 2024, 61(8): 1865-1881. DOI: 10.7544/issn1000-1239.202440148 |
Computing random-walk probabilities on graphs is the subject of extensive research in both graph theory and data mining research. However, existing work mainly focuses on static graphs, and cannot efficiently support dynamic weighted graphs, which are ubiquitous in real-world applications. We study the problem of computing random-walk probabilities on dynamic weighted graphs. We propose to use a sampling schema called coin flip sampling, rather than the more commonly adopted weighted sampling schema, for simulating random walks in dynamic weighted graphs. We demonstrate that simulations based on coin-flip sampling maintain the unbiasedness of the resulting random-walk probability approximations. Moreover, this approach allows us to simultaneously achieve a near-optimal query time complexity and an optimal O(1) update time overhead per edge insertion or deletion. This is a significant improvement over existing methods, which typically incur substantial sampling costs or rely on intricate auxiliary structures that are hard to maintain in a dynamic setting. We present both theoretical analysis and empirical evaluations to substantiate the superiority of our method on dynamic weighted graphs.
[1] |
Banerjee S, Lofgren P. Fast bidirectional probability estimation in Markov models[J]. arXiv:1507.05998v1
|
[2] |
Fogaras D and Rácz B. Scaling link-based similarity search[C]//Proc of the 14th Int Conf on World Wide Web (WWW '05). New York: ACM, 2005: 641–650
|
[3] |
Fogaras D, Rácz B, Csalogány K, et al. Towards scaling fully personalized pagerank: Algorithms, lower bounds, and experiments[J]. Internet Mathematics, 2005, 2(3): 333−358 doi: 10.1080/15427951.2005.10129104
|
[4] |
Kallaugher J, Kapralov M, Price E. Simulating random walks in random streams[C]//Proc of the 2022 Annual ACM-SIAM Symp on Discrete Algorithms (SODA). New York: Society for Industrial and Applied Mathematics, 2022: 3091−3126
|
[5] |
Li Ronghua, Yu J X, Lu Qin, et al. On random walk based graph sampling[C]/Proc of IEEE 31st Int Conf on Data Engineering. Piscataway, NJ: IEEE, 2015: 927−938
|
[6] |
Nazi A, Zhou Z, Thirumuruganathan S, et al. Walk, Not Wait: Faster sampling over online social networks[J]. arXiv:1410.7833v2
|
[7] |
Wang S, Yang R, Xiao X, et al. I: Simple and effective approximate single-source personalized pagerank[C]//Proc of the 23rd ACM SIGKDD Int Conf on Knowledge Discovery and Data Mining. New York: ACM, 2017: 505−514
|
[8] |
Chen M, Wei Z, Ding B, et al. Scalable graph neural networks via bidirectional propagation[J]. Advances in Neural Information Processing Systems, 2020, 33: 14556−14566
|
[9] |
Perozzi B, Al-Rfou R, and Skiena S. DeepWalk: Online learning of social representations[C]//Proc of the 20th ACM SIGKDD Int Conf on Knowledge discovery and data mining (KDD '14). New York: ACM, 2014: 701–710
|
[10] |
Wu F, Souza A, Zhang T, et al. Simplifying graph convolutional networks[C]//Proc of Int Conf on Machine Learning, Long Beach, California: PMLR, 2019: 6861–6871
|
[11] |
Wang H, He M, Wei Z, et al. Approximate graph propagation[C]//Proc of the 27th ACM SIGKDD Conf on Knowledge Discovery & Data Mining. New York: ACM, 2021: 1686−1696
|
[12] |
Yang R, Xiao X, Wei Z, et al. 2019. Efficient estimation of heat kernel PageRank for local clustering[C]//Proc of the 2019 Int Conf on Management of Data (SIGMOD’19). New York: ACM, 2019: 1339–1356
|
[13] |
郭佳雯,白淇介,林铸天,等. 基于非递减时序随机游走的动态异质网络嵌入[J]. 计算机研究与发展,2021,58(8):1624−1641 doi: 10.7544/issn1000-1239.2021.20210317
Guo Jiawen, Bai Qijie, Lin Zhutian, et al. Dynamic heterogeneous network embedding based on non-decreasing temporal random walk[J]. Journal of Computer Research and Development,2021,58(8):1624−1641 doi: 10.7544/issn1000-1239.2021.20210317
|
[14] |
张小驰,于华,宫秀军. 一种基于随机游走的迭代加权子图查询算法[J]. 计算机研究与发展,2015,52(12):2824−2833 doi: 10.7544/issn1000-1239.2015.20140801
Zhang Xiaochi, Yu Hua, Gong Xiujun. A random walk based iterative weighted algorithm for sub-graph query[J]. Journal of Computer Research and Development,2015,52(12):2824−2833 doi: 10.7544/issn1000-1239.2015.20140801
|
[15] |
Wu X, Pang H, Fan Y, et al. ProbWalk: A random walk approach in weighted graph embedding[J]. Procedia Computer Science, 2021, 183: 683−689 doi: 10.1016/j.procs.2021.02.115
|
[16] |
Jung J, Shin K, Sael L, et al. Random walk with restart on large graphs using block elimination[J]. ACM Transactions on Database Systems, 2016, 41(2): 1−43
|
[17] |
Wong C K, Easton M C. An efficient method for weighted sampling without replacement[J]. SIAM Journal on Computing, 1980, 9(1): 111−113 doi: 10.1137/0209009
|
[18] |
Walker A J. New fast method for generating discrete random numbers with arbitrary frequency distributions[J]. Electronics Letters, 1974, 10(8): 127−128 doi: 10.1049/el:19740097
|
[19] |
Walker A J. An efficient method for generating discrete random variables with general distributions[J]. ACM Transactions on Mathematical Software, 1977, 3(3): 253−256 doi: 10.1145/355744.355749
|
[20] |
Casella G, Robert C P, Wells M T. Generalized accept-reject sampling schemes[J]. Lecture notes-monograph series, 2004: 342−347
|
[21] |
Guo Q, Wang S, Wei Z, et al. Influence maximization revisited: Efficient sampling with bound tightened[J]. ACM Transactions on Database Systems, 2022, 47(3): 1−45
|
[22] |
Yi L, Wang H, Wei Z. Optimal dynamic subset sampling: Theory and applications[C]//Proc of the 29th ACM SIGKDD Conf on Knowledge Discovery & Data Mining. New York: ACM, 2023: 3116–3127
|
[23] |
Bhattacharya S, Kiss P, Sidford A, et al. Near-optimal dynamic rounding of fractional matchings in bipartite graphs[J]. arXiv preprint, arXiv: 2306.11828, 2023
|
[24] |
Mitzenmacher M, Upfal E. Probability and Computing: Randomization and Probabilistic Techniques in Algorithms and Data Analysis[M]. Cambridge, UK: Cambridge University Press, 2017
|
[25] |
Tsai M T, Wang D W, Liau C J, et al. Heterogeneous subset sampling[C]//Proc of Computing and Combinatorics: 16th Annual Int Conf. Berlin: Springer, 2010: 500−509
|
[26] |
Bringmann K, Panagiotou K. Efficient sampling methods for discrete distributions[C]//Proc of the 39th Int Colloquium (ICALP 2012). Berlin: Springer, 2012: 133−144
|
[27] |
Benson A R, Abebe R, Schaub M T, et al. Simplicial closure and higher-order link prediction[J]. Proceedings of the National Academy of Sciences, 2018, 115(48): E11221−E11230
|
[28] |
Hessel J, Tan C, Lee L. Science, askscience, and badscience: On the coexistence of highly related communities[C]//Proc of the Int AAAI Conf on web and social media. Palo Alto, CA: AAAI, 2016: 171–180
|
[29] |
Liu P, Benson A R, Charikar M. Sampling methods for counting temporal motifs[C]//Proc of the 12th ACM Int Conf on Web Search and Data Mining. New York: ACM, 2019: 294−302
|
[30] |
Kumar R, Liu P, Charikar M, et al. Retrieving top weighted triangles in graphs[C]//Proc of the 13th Int Conf on Web Search and Data Mining. New York: ACM, 2020: 295−303
|
[31] |
Boldi P and Vigna S. 2004. The webgraph framework I: Compression techniques[C]//Proc of the 13th Int Conf on World Wide Web (WWW’04). New York: ACM, 595–602
|
[32] |
Boldi P, Rosa M, Santini M, et al. 2011. Layered label propagation: A multiresolution coordinate-free ordering for compressing social networks[C]//Proc of the 20th Int Conf on World wide web (WWW’11). New York: ACM, 587–596
|
[33] |
Kunegis J. 2013. KONECT: The Koblenz network collection[C]//Proc of the 22nd Int Conf on World Wide Web (WWW’13 Companion). New York: ACM, 1343–1350
|
[34] |
Kwak H, Lee C, Park H, et al. 2010. What is Twitter, a social network or a news media? [C]//Proc of the 19th Int Conf on World wide web (WWW '10). New York: ACM, 591–600
|
[35] |
Spielman D A, Teng S H. Nearly-linear time algorithms for graph partitioning, graph sparsification, and solving linear systems[C]//Proc of the 36th Annual ACM Symp on Theory of Computing. New York: ACM, 2004: 81−90
|
[36] |
Yin H, Benson A R, Leskovec J, et al. Local higher-order graph clustering[C]//Proc of the 23rd ACM SIGKDD Int Conf on Knowledge Discovery and Data Mining. New York: ACM, 2017: 555−564
|