ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development (计算机研究与发展), 2020, Vol. 57, Issue (2): 333-345. doi: 10.7544/issn1000-1239.2020.20190565

Special topic: 2020 Special Topic on Frontier Technologies in Big Data and Intelligent Storage Systems

• System Architecture •

Efficient Index and Query Algorithm Based on Geospatial Big Data

Zhao Huihui1,2, Zhao Fan2,3, Chen Renhai1,2, and Feng Zhiyong1,2

  1. 1(College of Intelligence and Computing, Tianjin University, Tianjin 300350); 2(Shenzhen Research Institute of Tianjin University, Shenzhen, Guangdong 518000); 3(Tianjin International Engineering Institute, Tianjin University, Tianjin 300350) (1442700849@qq.com)
  • Publication date: 2020-02-01
  • Online: 2020-02-01
  • Supported by: 
    This work was supported by the National Natural Science Foundation of China (61702357, 61672377), the Shenzhen Science and Technology Foundation (JCYJ20170816093943197), the Natural Science Foundation of Tianjin (18JCQNJC00300), and the Beiyang Scholar Foundation of Tianjin University (2019XRG-0004).

Abstract: In recent years, with the rapid development of advanced technologies such as intelligent target recognition, electronic sensors, collaborative control, and computer networks, intelligent transportation systems have made a qualitative leap. Modern intelligent transportation systems can provide an intelligent transportation management platform that integrates vehicles, roads, and the cloud. However, these systems rely on the large volume of two-dimensional geospatial data generated every day, so storing and querying large-scale geospatial data efficiently is of great significance for their future adoption and development. Because urban traffic information is complex, voluminous, and rapidly updated, current spatial indexing techniques struggle to retrieve two-dimensional geospatial data efficiently. To optimize the storage organization of two-dimensional geospatial data in a spatial big data setting and to improve retrieval efficiency, this paper proposes a multi-layer slice recursive (MSR) algorithm that constructs a spatial index tree over two-dimensional geospatial data. The algorithm first sorts the map data along the first dimension and partitions it into FD (first division) slices. It then sorts the map data within each FD slice along the second dimension to generate SD (second division) slices, and within the SD slices it partitions spatial objects across the current slice and its adjacent slices. Finally, it compares the length of the spatial objects with the node capacity to perform data clustering, checks whether all slices have completed clustering, and recursively generates the MSR tree from the bottom up. Experimental results show that the two-dimensional spatial storage structure built by the MSR algorithm achieves better query performance than the most representative current spatial indexing techniques: the R-tree-based bulk-loading algorithm sort tile recursive (STR), the STR-grid hybrid algorithm (str-grid), and the efficient geometric range query algorithm (EGRQ).
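The construction steps outlined in the abstract (sort and slice along the first dimension, sort and slice along the second, group by node capacity, then recurse upward) can be illustrated with a minimal Python sketch. This is not the authors' MSR implementation: the abstract does not specify how objects are divided between adjacent slices or how clustering is decided, so the sketch falls back on plain fixed-size grouping in the style of STR bulk loading. All names (`Node`, `pack_level`, `build_tree`, `range_query`) and the default capacity are hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    mbr: tuple                                    # (xmin, ymin, xmax, ymax)
    children: list = field(default_factory=list)  # child Nodes (internal nodes)
    points: list = field(default_factory=list)    # (x, y) tuples (leaf nodes)


def mbr_of_points(points):
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


def mbr_of_nodes(nodes):
    return (min(n.mbr[0] for n in nodes), min(n.mbr[1] for n in nodes),
            max(n.mbr[2] for n in nodes), max(n.mbr[3] for n in nodes))


def center(item):
    # A point is its own centre; a node is represented by its MBR centre.
    if isinstance(item, Node):
        x0, y0, x1, y1 = item.mbr
        return ((x0 + x1) / 2, (y0 + y1) / 2)
    return item


def pack_level(items, capacity):
    """Two sort-and-slice passes, then groups of at most `capacity` items."""
    n_groups = math.ceil(len(items) / capacity)
    n_slices = math.ceil(math.sqrt(n_groups))
    slice_size = n_slices * capacity
    by_x = sorted(items, key=lambda it: center(it)[0])   # first dimension
    groups = []
    for i in range(0, len(by_x), slice_size):            # one "FD" slice each
        fd_slice = sorted(by_x[i:i + slice_size],
                          key=lambda it: center(it)[1])  # second dimension
        for j in range(0, len(fd_slice), capacity):      # one "SD" run each
            groups.append(fd_slice[j:j + capacity])
    return groups


def build_tree(points, capacity=4):
    """Build the tree bottom-up: leaves first, then parents until one root."""
    if not points or capacity < 2:
        raise ValueError("need at least one point and capacity >= 2")
    level = [Node(mbr_of_points(g), points=g)
             for g in pack_level(points, capacity)]
    while len(level) > 1:
        level = [Node(mbr_of_nodes(g), children=g)
                 for g in pack_level(level, capacity)]
    return level[0]


def range_query(node, box):
    """Return all points inside box = (xmin, ymin, xmax, ymax), inclusive."""
    qx0, qy0, qx1, qy1 = box
    x0, y0, x1, y1 = node.mbr
    if x1 < qx0 or x0 > qx1 or y1 < qy0 or y0 > qy1:
        return []                                        # prune this subtree
    hits = [p for p in node.points
            if qx0 <= p[0] <= qx1 and qy0 <= p[1] <= qy1]
    for child in node.children:
        hits.extend(range_query(child, box))
    return hits
```

Calling `build_tree(points, capacity=4)` on a list of `(x, y)` tuples returns the root node, and `range_query(root, (xmin, ymin, xmax, ymax))` walks only the subtrees whose bounding rectangles intersect the query box. The adjacent-slice handling that distinguishes MSR from STR is deliberately left out, since the abstract gives no detail on it.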

Key words: two-dimensional geospatial information, spatial indexing technology, big geo-data, MSR algorithm, clustering

CLC number: