ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development ›› 2015, Vol. 52 ›› Issue (4): 960-971. doi: 10.7544/issn1000-1239.2015.20131343


Edge Cluster Based Large Graph Partitioning and Iterative Processing in BSP

Leng Fangling1, Liu Jinpeng1, Wang Zhigang1, Chen Changning1, Bao Yubin1, Yu Ge1, Deng Chao2

  1 (College of Information Science and Engineering, Northeastern University, Shenyang 110819); 2 (Division for Business Support, China Mobile Institute, Beijing 100053)
  • Online: 2015-04-01

Abstract: With the growth of the Internet and the maturing of related techniques in recent years, large graph processing has become an active research topic. Because traditional cloud computing platforms such as Hadoop are poorly suited to iterative graph processing, researchers have proposed solutions based on the bulk synchronous parallel (BSP) model, such as Pregel, Hama and Giraph. However, graph algorithms must frequently exchange intermediate results along the graph's topological structure, and the resulting communication overhead severely degrades the performance of BSP-based systems. In this paper, we first analyze how well-known BSP-based systems reduce communication overhead. We then propose a graph partitioning strategy named edge cluster based vertically hybrid partitioning (EC-VHP) and build a cost-benefit model to analyze its effect on communication overhead. Based on EC-VHP, we propose a vertex-edge computation model and design two index structures, a plain hash index and a multi-queue parallel sequential index, to further improve the efficiency of message communication. Finally, experiments on real and synthetic data sets demonstrate the efficiency and accuracy of EC-VHP and the index mechanism.

Key words: large graph, BSP model, graph partition, vertex-edge computation model, index structure
