Efficient Dimensionality Reduction and Query Algorithm of Trajectory Data Based on Spatial Position Relation

巢成; 蒲非凡; 许建秋; 高云君

doi:10.7544/issn1000-1239.202330609


Abstract: With the rapid development of information technology, society is at a critical stage of digital and information-based transformation, and demand for database-backed information systems is growing across industries. Location-based services depend on massive volumes of trajectory data generated in real time. When processing hundreds of millions of trajectory records that change continuously over time, dimensionality reduction algorithms and query techniques have long been central research topics: reducing the scale of trajectory data shortens the time spent processing data during queries and so improves query performance, and the ability to answer queries with high quality and efficiency is critical for a database. This paper proposes a uniform grid code (UGC) for trajectory data and, after further optimization, a non-uniform grid dimensionality reduction algorithm (NGDR). The approach converts trajectory coordinates into one-dimensional strings for storage and merges grids that do not meet the requirements. Spatial position mapping preserves the complex relationships among trajectories, and range queries and nearest neighbor queries are used to test performance on the reduced data. The experiments use real trajectory data from different cities together with synthetically generated trajectory data, and compare the proposed uniform grid and non-uniform grid algorithms with three baseline methods. The results show that, after reduction with the optimized NGDR, the spatial position relationship similarity of the data reaches up to 82.50%; range query time improves by at least 73.86% and nearest neighbor query time by at least 52.26% relative to the other methods, outperforming the baselines.
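The abstract describes mapping trajectory coordinates onto grid cells and storing each trajectory as a one-dimensional string. As a rough illustration only, the sketch below shows one way a uniform grid encoding and a cell-level range filter could look. The code layout (zero-padded row and column indices joined by `-`), the function names, and the parameters `origin`, `cell_size`, and `digits` are all assumptions made for this example and are not taken from the paper; the non-uniform merging step of NGDR is not covered.

```python
from typing import Dict, List, Set, Tuple

Point = Tuple[float, float]  # (x, y), in the same units as cell_size


def cell_index(p: Point, origin: Point, cell_size: float) -> Tuple[int, int]:
    """Return the (row, col) index of the uniform grid cell containing p."""
    col = int((p[0] - origin[0]) // cell_size)
    row = int((p[1] - origin[1]) // cell_size)
    return row, col


def cell_code(row: int, col: int, digits: int = 3) -> str:
    """Hypothetical code layout: zero-padded row followed by zero-padded column."""
    limit = 10 ** digits
    if not (0 <= row < limit and 0 <= col < limit):
        raise ValueError(f"cell ({row}, {col}) falls outside the grid")
    return f"{row:0{digits}d}{col:0{digits}d}"


def encode_trajectory(points: List[Point], origin: Point, cell_size: float) -> str:
    """Encode a trajectory as one string of cell codes, dropping consecutive repeats."""
    codes: List[str] = []
    for p in points:
        code = cell_code(*cell_index(p, origin, cell_size))
        if not codes or codes[-1] != code:
            codes.append(code)
    return "-".join(codes)


def range_query(
    encoded: Dict[str, str],
    rect: Tuple[float, float, float, float],  # (xmin, ymin, xmax, ymax)
    origin: Point,
    cell_size: float,
) -> List[str]:
    """Return ids of trajectories that visit any cell overlapping rect.

    This is a cell-level filter, so it can return trajectories that touch an
    overlapping cell without actually entering the rectangle.
    """
    xmin, ymin, xmax, ymax = rect
    row_lo, col_lo = cell_index((xmin, ymin), origin, cell_size)
    row_hi, col_hi = cell_index((xmax, ymax), origin, cell_size)
    wanted: Set[str] = {
        cell_code(r, c)
        for r in range(row_lo, row_hi + 1)
        for c in range(col_lo, col_hi + 1)
    }
    return sorted(
        tid for tid, code_str in encoded.items()
        if wanted.intersection(code_str.split("-"))
    )


origin, cell_size = (0.0, 0.0), 10.0
encoded = {
    "a": encode_trajectory([(1, 1), (5, 5), (15, 5), (25, 25)], origin, cell_size),
    "b": encode_trajectory([(95, 95), (85, 95)], origin, cell_size),
}
print(encoded["a"])
print(range_query(encoded, (0, 0, 12, 8), origin, cell_size))
```

Fixed-width codes keep every cell the same length in the string, which makes splitting and comparing cells straightforward. A real system would likely refine the candidates returned by the cell-level filter against the original coordinates.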
