ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development, 2018, Vol. 55, Issue (3): 651-661. doi: 10.7544/issn1000-1239.2018.20160845

• Software Technology •

GPU-Based Parallel Algorithm for RDF Type-Isomorphism

冯佳颖1,3,张小旺1,3,冯志勇2,3   

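  1. 1(School of Computer Science and Technology, Tianjin University, Tianjin 300350); 2(School of Computer Software, Tianjin University, Tianjin 300350); 3(Tianjin Key Laboratory of Cognitive Computing and Application, Tianjin 300350) (fengjiaying@tju.edu.cn)
  • Published: 2018-03-01
  • Supported by: 
    National Key Research and Development Program of China (2016YFB1000603); National Natural Science Foundation of China (61672377); Key Project of the Tianjin Science and Technology Support Program (16YFZCGX00210)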

Parallel Algorithms for RDF Type-Isomorphism on GPU

Feng Jiaying1,3, Zhang Xiaowang1,3, Feng Zhiyong2,3   

  1. 1(School of Computer Science and Technology, Tianjin University, Tianjin 300350); 2(School of Computer Software, Tianjin University, Tianjin 300350); 3(Tianjin Key Laboratory of Cognitive Computing and Application, Tianjin 300350)
  • Online: 2018-03-01

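Abstract: The Resource Description Framework (RDF), a Semantic Web data specification proposed by the World Wide Web Consortium (W3C), describes resources and the relationships among them. As the volume of RDF data keeps growing, retrieving RDF data efficiently has become a major challenge. Query answering over RDF data can be reduced to the subgraph isomorphism problem. As an important class of subgraph isomorphism, type-isomorphism performs well on certain RDF queries, such as star queries and chain queries. The matching efficiency of existing type-isomorphism methods depends on the computing power of the CPU. In recent years, the development of graphics processing units (GPUs) has improved the performance of graph data processing. Compared with CPUs, GPU multiprocessors offer high concurrency, easy scalability, and low cost. Because the computing power of the CPU is limited when processing large-scale RDF data, a GPU-based RDF type-isomorphism algorithm is proposed that solves the type-isomorphism problem in parallel on the GPU architecture. Finally, the algorithm is implemented and its performance is evaluated on the LUBM benchmark dataset. The experimental results show that it significantly outperforms CPU-based algorithms.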

Keywords: resource description framework, SPARQL query processing, subgraph isomorphism, type-isomorphism, graphics processing units

Abstract: The Resource Description Framework (RDF), officially recommended by the World Wide Web Consortium (W3C), describes resources on the Web and the relationships among them. With the volume of RDF data increasing rapidly, a high-performance method is needed to process SPARQL (SPARQL Protocol and RDF Query Language) queries over RDF data efficiently, a task that can be reduced to the classical problem of subgraph isomorphism. As an important class of subgraph isomorphism, type-isomorphism achieves high performance on many interesting queries over RDF data, such as those with star or linear structures. However, existing approaches to type-isomorphism mostly depend on the computational capability of the CPU. In recent years, graphics processing units (GPUs) have been widely adopted to accelerate graph data processing, offering better computational performance, superior scalability, and more reasonable prices. Considering the limited computational capability of the CPU in handling large-scale RDF data, we propose an algorithm that solves the type-isomorphism problem over RDF datasets on a parallel GPU architecture. We implement the algorithm and evaluate it on the Lehigh University Benchmark (LUBM) datasets through extensive experiments. The experimental results show that our algorithm significantly outperforms CPU-based algorithms.

Key words: resource description framework, SPARQL query processing, subgraph isomorphism, type-isomorphism, graphics processing units

CLC number: