Parallel Streaming Graph Processing with Hybrid Incremental Computation
-
Graphical Abstract
-
Abstract
Streaming graphs can effectively model rapidly changing data in real life, and have been widely used in social network analysis, content recommendation, anomaly detection, and other fields. Based on the fact that the two graph snapshots before and after the graph update have a large amount of the same data, incremental computation reduces the amount of memory accesses and computation in the iterations by storing and reusing the historical results, which effectively improves the performance of streaming graph processing. However, the existing solutions for incremental computation optimization of graph algorithms are often limited by graph algorithms that satisfy specific properties, thus it is difficult to apply them to general graph algorithms. In this paper, the incremental computation is further divided into two modes of refinement and recomputation. The similarities and differences between the two modes are analyzed theoretically, and the performance differences between them are tested experimentally with different graph datasets, graph algorithms, and graph update settings. Based on the two modes, the hybrid incremental computation mode is designed to ensure the correctness of switching, and the switching time is predicted accurately by a random forest classifier. Through performance evaluation and switching effect analysis, it is proved that the general graph algorithms can effectively switch under the hybrid mode. Compared with the start-of-the-art solution DZIG, the average speedup is 1.25×.
-
-