Processing math: 100%
  • 中国精品科技期刊
  • CCF推荐A类中文期刊
  • 计算领域高质量科技期刊T1类
Advanced Search
Wang Hanzhi, Yi Lu, Wei Zhewei, Gan Junhao, Yuan Ye, Wen Jirong, Du Xiaoyong. Random-Walk Probability Computation on Dynamic Weighted Graphs[J]. Journal of Computer Research and Development, 2024, 61(8): 1865-1881. DOI: 10.7544/issn1000-1239.202440148
Citation: Wang Hanzhi, Yi Lu, Wei Zhewei, Gan Junhao, Yuan Ye, Wen Jirong, Du Xiaoyong. Random-Walk Probability Computation on Dynamic Weighted Graphs[J]. Journal of Computer Research and Development, 2024, 61(8): 1865-1881. DOI: 10.7544/issn1000-1239.202440148

Random-Walk Probability Computation on Dynamic Weighted Graphs

Funds: This work was supported by the National Natural Science Foundation of China (U2241212, 61932001), the Beijing Natural Science Foundation (4222028), the Beijing Outstanding Young Scientist Program (BJJWZYJH012019100020098), the Huawei-Renmin University Joint Program on Information Retrieval, the Outstanding Innovative Talents Cultivation Funded Programs 2023 of Renmin Univertity of China, the Fund for Building World-class Universities (disciplines) of Renmin University of China, the Engineering Research Center of Next-Generation Intelligent Search and Recommendation Funds, Ministry of Education, and the Public Computing Cloud, Renmin University of China.
More Information
  • Author Bio:

    Wang Hanzhi: born in 1997. PhD candidate. Her main research interest includes graph analysis and learning on large graphs

    Yi Lu: born in 2000. PhD student. Her main research interests include graph analysis and learning, dynamic graph neural network,data removal for graph neural network

    Wei Zhewei: born in 1986. PhD, professor. His main research interests include big data and machine learning algorithms, graph analysis and learning, and streaming algorithms

    Gan Junhao: born in 1988. PhD, senior lecturer. His main research interests include practical algorithms with non-trivial theoretical guarantees for solving problems on massive data

    Yuan Ye: born in 1981. PhD, professor. His main research interests include big data management and analysis, artificial intelligence, and distributed computing

    Wen Jirong: born in 1972. PhD, professor. His main research interests include information retrieval, data mining, and machine learning

    Du Xiaoyong: born in 1963. PhD, professor. His main research interests include database system, and big data management and analysis

  • Received Date: March 01, 2024
  • Revised Date: April 24, 2024
  • Available Online: May 16, 2024
  • Computing random-walk probabilities on graphs is the subject of extensive research in both graph theory and data mining research. However, existing work mainly focuses on static graphs, and cannot efficiently support dynamic weighted graphs, which are ubiquitous in real-world applications. We study the problem of computing random-walk probabilities on dynamic weighted graphs. We propose to use a sampling schema called coin flip sampling, rather than the more commonly adopted weighted sampling schema, for simulating random walks in dynamic weighted graphs. We demonstrate that simulations based on coin-flip sampling maintain the unbiasedness of the resulting random-walk probability approximations. Moreover, this approach allows us to simultaneously achieve a near-optimal query time complexity and an optimal O(1) update time overhead per edge insertion or deletion. This is a significant improvement over existing methods, which typically incur substantial sampling costs or rely on intricate auxiliary structures that are hard to maintain in a dynamic setting. We present both theoretical analysis and empirical evaluations to substantiate the superiority of our method on dynamic weighted graphs.

  • [1]
    Banerjee S, Lofgren P. Fast bidirectional probability estimation in Markov models[J]. arXiv:1507.05998v1
    [2]
    Fogaras D and Rácz B. Scaling link-based similarity search[C]//Proc of the 14th Int Conf on World Wide Web (WWW '05). New York: ACM, 2005: 641–650
    [3]
    Fogaras D, Rácz B, Csalogány K, et al. Towards scaling fully personalized pagerank: Algorithms, lower bounds, and experiments[J]. Internet Mathematics, 2005, 2(3): 333−358 doi: 10.1080/15427951.2005.10129104
    [4]
    Kallaugher J, Kapralov M, Price E. Simulating random walks in random streams[C]//Proc of the 2022 Annual ACM-SIAM Symp on Discrete Algorithms (SODA). New York: Society for Industrial and Applied Mathematics, 2022: 3091−3126
    [5]
    Li Ronghua, Yu J X, Lu Qin, et al. On random walk based graph sampling[C]/Proc of IEEE 31st Int Conf on Data Engineering. Piscataway, NJ: IEEE, 2015: 927−938
    [6]
    Nazi A, Zhou Z, Thirumuruganathan S, et al. Walk, Not Wait: Faster sampling over online social networks[J]. arXiv:1410.7833v2
    [7]
    Wang S, Yang R, Xiao X, et al. I: Simple and effective approximate single-source personalized pagerank[C]//Proc of the 23rd ACM SIGKDD Int Conf on Knowledge Discovery and Data Mining. New York: ACM, 2017: 505−514
    [8]
    Chen M, Wei Z, Ding B, et al. Scalable graph neural networks via bidirectional propagation[J]. Advances in Neural Information Processing Systems, 2020, 33: 14556−14566
    [9]
    Perozzi B, Al-Rfou R, and Skiena S. DeepWalk: Online learning of social representations[C]//Proc of the 20th ACM SIGKDD Int Conf on Knowledge discovery and data mining (KDD '14). New York: ACM, 2014: 701–710
    [10]
    Wu F, Souza A, Zhang T, et al. Simplifying graph convolutional networks[C]//Proc of Int Conf on Machine Learning, Long Beach, California: PMLR, 2019: 6861–6871
    [11]
    Wang H, He M, Wei Z, et al. Approximate graph propagation[C]//Proc of the 27th ACM SIGKDD Conf on Knowledge Discovery & Data Mining. New York: ACM, 2021: 1686−1696
    [12]
    Yang R, Xiao X, Wei Z, et al. 2019. Efficient estimation of heat kernel PageRank for local clustering[C]//Proc of the 2019 Int Conf on Management of Data (SIGMOD’19). New York: ACM, 2019: 1339–1356
    [13]
    郭佳雯,白淇介,林铸天,等. 基于非递减时序随机游走的动态异质网络嵌入[J]. 计算机研究与发展,2021,58(8):1624−1641 doi: 10.7544/issn1000-1239.2021.20210317

    Guo Jiawen, Bai Qijie, Lin Zhutian, et al. Dynamic heterogeneous network embedding based on non-decreasing temporal random walk[J]. Journal of Computer Research and Development,2021,58(8):1624−1641 doi: 10.7544/issn1000-1239.2021.20210317
    [14]
    张小驰,于华,宫秀军. 一种基于随机游走的迭代加权子图查询算法[J]. 计算机研究与发展,2015,52(12):2824−2833 doi: 10.7544/issn1000-1239.2015.20140801

    Zhang Xiaochi, Yu Hua, Gong Xiujun. A random walk based iterative weighted algorithm for sub-graph query[J]. Journal of Computer Research and Development,2015,52(12):2824−2833 doi: 10.7544/issn1000-1239.2015.20140801
    [15]
    Wu X, Pang H, Fan Y, et al. ProbWalk: A random walk approach in weighted graph embedding[J]. Procedia Computer Science, 2021, 183: 683−689 doi: 10.1016/j.procs.2021.02.115
    [16]
    Jung J, Shin K, Sael L, et al. Random walk with restart on large graphs using block elimination[J]. ACM Transactions on Database Systems, 2016, 41(2): 1−43
    [17]
    Wong C K, Easton M C. An efficient method for weighted sampling without replacement[J]. SIAM Journal on Computing, 1980, 9(1): 111−113 doi: 10.1137/0209009
    [18]
    Walker A J. New fast method for generating discrete random numbers with arbitrary frequency distributions[J]. Electronics Letters, 1974, 10(8): 127−128 doi: 10.1049/el:19740097
    [19]
    Walker A J. An efficient method for generating discrete random variables with general distributions[J]. ACM Transactions on Mathematical Software, 1977, 3(3): 253−256 doi: 10.1145/355744.355749
    [20]
    Casella G, Robert C P, Wells M T. Generalized accept-reject sampling schemes[J]. Lecture notes-monograph series, 2004: 342−347
    [21]
    Guo Q, Wang S, Wei Z, et al. Influence maximization revisited: Efficient sampling with bound tightened[J]. ACM Transactions on Database Systems, 2022, 47(3): 1−45
    [22]
    Yi L, Wang H, Wei Z. Optimal dynamic subset sampling: Theory and applications[C]//Proc of the 29th ACM SIGKDD Conf on Knowledge Discovery & Data Mining. New York: ACM, 2023: 3116–3127
    [23]
    Bhattacharya S, Kiss P, Sidford A, et al. Near-optimal dynamic rounding of fractional matchings in bipartite graphs[J]. arXiv preprint, arXiv: 2306.11828, 2023
    [24]
    Mitzenmacher M, Upfal E. Probability and Computing: Randomization and Probabilistic Techniques in Algorithms and Data Analysis[M]. Cambridge, UK: Cambridge University Press, 2017
    [25]
    Tsai M T, Wang D W, Liau C J, et al. Heterogeneous subset sampling[C]//Proc of Computing and Combinatorics: 16th Annual Int Conf. Berlin: Springer, 2010: 500−509
    [26]
    Bringmann K, Panagiotou K. Efficient sampling methods for discrete distributions[C]//Proc of the 39th Int Colloquium (ICALP 2012). Berlin: Springer, 2012: 133−144
    [27]
    Benson A R, Abebe R, Schaub M T, et al. Simplicial closure and higher-order link prediction[J]. Proceedings of the National Academy of Sciences, 2018, 115(48): E11221−E11230
    [28]
    Hessel J, Tan C, Lee L. Science, askscience, and badscience: On the coexistence of highly related communities[C]//Proc of the Int AAAI Conf on web and social media. Palo Alto, CA: AAAI, 2016: 171–180
    [29]
    Liu P, Benson A R, Charikar M. Sampling methods for counting temporal motifs[C]//Proc of the 12th ACM Int Conf on Web Search and Data Mining. New York: ACM, 2019: 294−302
    [30]
    Kumar R, Liu P, Charikar M, et al. Retrieving top weighted triangles in graphs[C]//Proc of the 13th Int Conf on Web Search and Data Mining. New York: ACM, 2020: 295−303
    [31]
    Boldi P and Vigna S. 2004. The webgraph framework I: Compression techniques[C]//Proc of the 13th Int Conf on World Wide Web (WWW’04). New York: ACM, 595–602
    [32]
    Boldi P, Rosa M, Santini M, et al. 2011. Layered label propagation: A multiresolution coordinate-free ordering for compressing social networks[C]//Proc of the 20th Int Conf on World wide web (WWW’11). New York: ACM, 587–596
    [33]
    Kunegis J. 2013. KONECT: The Koblenz network collection[C]//Proc of the 22nd Int Conf on World Wide Web (WWW’13 Companion). New York: ACM, 1343–1350
    [34]
    Kwak H, Lee C, Park H, et al. 2010. What is Twitter, a social network or a news media? [C]//Proc of the 19th Int Conf on World wide web (WWW '10). New York: ACM, 591–600
    [35]
    Spielman D A, Teng S H. Nearly-linear time algorithms for graph partitioning, graph sparsification, and solving linear systems[C]//Proc of the 36th Annual ACM Symp on Theory of Computing. New York: ACM, 2004: 81−90
    [36]
    Yin H, Benson A R, Leskovec J, et al. Local higher-order graph clustering[C]//Proc of the 23rd ACM SIGKDD Int Conf on Knowledge Discovery and Data Mining. New York: ACM, 2017: 555−564
  • Cited by

    Periodical cited type(0)

    Other cited types(3)

Catalog

    Article views (405) PDF downloads (133) Cited by(3)

    /

    DownLoad:  Full-Size Img  PowerPoint
    Return
    Return