• 中国精品科技期刊
  • CCF推荐A类中文期刊
  • 计算领域高质量科技期刊T1类
高级检索

BSP模型下基于边聚簇的大图划分与迭代处理

冷芳玲, 刘金鹏, 王志刚, 陈昌宁, 鲍玉斌, 于戈, 邓超

冷芳玲, 刘金鹏, 王志刚, 陈昌宁, 鲍玉斌, 于戈, 邓超. BSP模型下基于边聚簇的大图划分与迭代处理[J]. 计算机研究与发展, 2015, 52(4): 960-971. DOI: 10.7544/issn1000-1239.2015.20131343
引用本文: 冷芳玲, 刘金鹏, 王志刚, 陈昌宁, 鲍玉斌, 于戈, 邓超. BSP模型下基于边聚簇的大图划分与迭代处理[J]. 计算机研究与发展, 2015, 52(4): 960-971. DOI: 10.7544/issn1000-1239.2015.20131343
Leng Fangling, Liu Jinpeng, Wang Zhigang, Chen Changning, Bao Yubin, Yu Ge, Deng Chao. Edge Cluster Based Large Graph Partitioning and Iterative Processing in BSP[J]. Journal of Computer Research and Development, 2015, 52(4): 960-971. DOI: 10.7544/issn1000-1239.2015.20131343
Citation: Leng Fangling, Liu Jinpeng, Wang Zhigang, Chen Changning, Bao Yubin, Yu Ge, Deng Chao. Edge Cluster Based Large Graph Partitioning and Iterative Processing in BSP[J]. Journal of Computer Research and Development, 2015, 52(4): 960-971. DOI: 10.7544/issn1000-1239.2015.20131343
冷芳玲, 刘金鹏, 王志刚, 陈昌宁, 鲍玉斌, 于戈, 邓超. BSP模型下基于边聚簇的大图划分与迭代处理[J]. 计算机研究与发展, 2015, 52(4): 960-971. CSTR: 32373.14.issn1000-1239.2015.20131343
引用本文: 冷芳玲, 刘金鹏, 王志刚, 陈昌宁, 鲍玉斌, 于戈, 邓超. BSP模型下基于边聚簇的大图划分与迭代处理[J]. 计算机研究与发展, 2015, 52(4): 960-971. CSTR: 32373.14.issn1000-1239.2015.20131343
Leng Fangling, Liu Jinpeng, Wang Zhigang, Chen Changning, Bao Yubin, Yu Ge, Deng Chao. Edge Cluster Based Large Graph Partitioning and Iterative Processing in BSP[J]. Journal of Computer Research and Development, 2015, 52(4): 960-971. CSTR: 32373.14.issn1000-1239.2015.20131343
Citation: Leng Fangling, Liu Jinpeng, Wang Zhigang, Chen Changning, Bao Yubin, Yu Ge, Deng Chao. Edge Cluster Based Large Graph Partitioning and Iterative Processing in BSP[J]. Journal of Computer Research and Development, 2015, 52(4): 960-971. CSTR: 32373.14.issn1000-1239.2015.20131343

BSP模型下基于边聚簇的大图划分与迭代处理

基金项目: 国家自然科学基金重点项目(61033007); 国家自然科学基金项目(61173028, 61272179);中央高校基本科研业务费专项基金项目(N100704001); 教育部-中国移动科研基金项目(MCM20125021)
详细信息
  • 中图分类号: TP311.13

Edge Cluster Based Large Graph Partitioning and Iterative Processing in BSP

  • 摘要: 近年来随着互联网的普及和相关技术的日益成熟,大规模图数据处理成为新的研究热点.由于传统的如Hadoop等通用云平台不适合迭代式地处理图数据,研究人员基于BSP模型提出了新的处理方案,如Pregel,Hama,Giraph等.然而,图处理算法需要按照图的拓扑结构频繁交换中间计算结果而导致巨大的通信开销,这严重地影响了基于BSP模型的系统的处理性能.首先从降低消息通信的角度分析当前主流BSP系统的处理方案,然后提出了一种基于边聚簇的垂直混合划分策略(EC-VHP),并建立代价收益模型分析其消息通信优化的效果.在EC-VHP的基础上,提出了一个点-边计算模型,并设计了简单Hash索引和多队列并行顺序索引机制,进一步提高消息通信的处理效率.最后,在真实数据集和模拟数据集上的大量实验,验证了EC-VHP策略和索引机制的正确性和有效性.
    Abstract: With the development of Internet and the gradual maturity of related techniques in recent years, the processing of large graphs has become a new hot research topic. Since it is not appropriate for traditional cloud computing platforms to process graph data iteratively, such as Hadoop, researchers have proposed some solutions based on the BSP model, such as Pregel, Hama and Giraph. However, since graph algorithms need to frequently exchange intermediate results in accordance with the graph’s topological structure, the tremendous communication overhead impacts the processing performance of systems based on the BSP model greatly. In this paper, we first analyze the solutions proposed by the well-known BSP-based systems in reducing communication overhead, and then propose a graph partition strategy named edge cluster based vertically hybrid partitioning (EC-VHP), building a cost benefit model to study its effectiveness to the communication overhead. Then based on EC-VHP, we propose a vertex-edge computation model, and design both a plain hash index structure and a multi-queue parallel sequential index structure to further improve the processing efficiency of message communication. Finally, our experiments on real and synthetic data sets demonstrate the efficiency and accuracy of the EC-VHP and the index mechanism.
计量
  • 文章访问数:  1371
  • HTML全文浏览量:  0
  • PDF下载量:  817
  • 被引次数: 0
出版历程
  • 发布日期:  2015-03-31

目录

    /

    返回文章
    返回