ISSN 1000-1239 CN 11-1777/TP

• 软件技术 •

### 大图数据上顶点驱动的并行最小生成树算法

1. (东北大学信息科学与工程学院 沈阳 110819) (guyu@ise.neu.edu.cn)
• 出版日期: 2014-12-01
• 基金资助:
基金项目：国家“九七三”重点基础研究发展计划基金项目(2012CB316201)；国家自然科学基金项目(61472071,61272179,61033007,61173028)；中央高校基本科研业务费专项资金项目(N130404010,N110404006)

### Vertex-Driven Parallel Minimum Spanning Tree Algorithms on Large Graphs

Gu Yu, Yang Jiaxue, Bao Yubin, Yu Ge

1. (College of Information Science and Engineering, Northeastern University, Shenyang 110819)
• Online: 2014-12-01

Abstract: The minimum spanning tree(MST) algorithm is one of the most classic algorithms in the graph theory. Some complex graph algorithms based on MST including clustering, classification and shortest path queries have been improved significantly in terms of efficiency and quality. However, with the rapid development of Internet, the scales of graphs have been becoming larger and larger. Large scale graphs which contain millions or even billions of vertices have become more common. Therefore, how to implement query processing and data mining algorithms on large scale graphs has become a problem to be solved urgently. In addition, because of the dynamic properties of large-scale graphs, how to maintain results dynamically has also become one of the most attractive problems. However, the state of the art MST algorithms can’t handle such massive and dynamic graph data. In this paper, we propose a vertex-driven parallel MST algorithm called PB based on a partition Prim algorithm named PP, and demonstrate the correctness of PB. Moreover, we implement the whole process of PB algorithm on the MapReduce and BSP framework respectively. Taking deletion-only graphs into consideration, we put forward a maintenance algorithm for MST which can conduct efficient incremental evaluation. All the computation and maintenance costs are further analyzed and compared. Finally, experiments based on real and synthetic data sets demonstrate the reliabilty, efficiency and scalability of our algorithms.