ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development, 2022, Vol. 59, Issue (2): 264-281. doi: 10.7544/issn1000-1239.20210913

Special Issue: 2022 Special Topic on Spatial Data Intelligence

Spatial Occupancy-Based Dominant Co-Location Patterns Mining

Fang Yuan (1, 2), Wang Lizhen (3), Wang Xiaoxuan (3), Yang Peizhong (3)

  1. School of Mathematics and Statistics, Yunnan University, Kunming 650500
  2. South-Western Institute for Astronomy Research, Yunnan University, Kunming 650500
  3. School of Information Science and Engineering, Yunnan University, Kunming 650500
  • Online: 2022-02-01
  • Supported by: 
    This work was supported by the National Natural Science Foundation of China (61966036, 61662086), the Project of Innovative Team of Yunnan Province (2018HC019), and the Post Doctor Foundation of Yunnan University (C176220200).

Abstract: Traditional spatial co-location pattern mining aims to discover subsets of spatial features whose instances are frequently located together in geographic neighborhoods. Most previous studies use the prevalence of a pattern as the interestingness measure. However, users may be interested not only in the prevalence of a feature set but also in its completeness, namely the portion of co-location instances that a pattern occupies in their neighborhoods. By combining the prevalence and completeness of co-location patterns, we can provide users with a set of higher-quality co-location patterns, called dominant spatial co-location patterns (DSCPs). In this paper, we first introduce an occupancy measure into the spatial co-location pattern mining task to measure the completeness of co-location patterns. Second, we formulate the DSCP mining problem by considering both completeness and prevalence. Third, we present a basic algorithm for discovering DSCPs and, to reduce its high computational cost, give a series of pruning strategies that improve its efficiency. Finally, experiments on synthetic and real-world data sets evaluate the efficiency and effectiveness of the proposed algorithms. The running times on synthetic data sets show that the pruning strategies are efficient, and the mining results in two real-world applications demonstrate that DSCPs are reasonable and acceptable.
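
To make the two measures named in the abstract concrete, the following is a minimal Python sketch, not the paper's algorithm. The prevalence function uses the participation index that is standard in the co-location literature. The abstract does not give the occupancy formula, so `occupancy` is an assumed stand-in: for each row instance it takes the share of the combined neighborhood of its members that the row instance itself covers, then averages over row instances. All function names, the `neighbors` map, and the threshold parameters `min_prev` and `min_occ` are hypothetical.

```python
def participation_index(pattern, row_instances, feature_counts):
    """Prevalence: minimum, over the pattern's features, of the fraction
    of that feature's instances that appear in some row instance."""
    ratios = []
    for i, feature in enumerate(pattern):
        participating = {row[i] for row in row_instances}
        ratios.append(len(participating) / feature_counts[feature])
    return min(ratios)


def occupancy(row_instances, neighbors):
    """Completeness (ASSUMED definition, for illustration only): average
    share of each row instance within the union of its members'
    neighborhoods."""
    if not row_instances:
        return 0.0
    ratios = []
    for row in row_instances:
        neighborhood = set(row)
        for inst in row:
            neighborhood |= neighbors.get(inst, set())
        ratios.append(len(row) / len(neighborhood))
    return sum(ratios) / len(ratios)


def is_dominant(prevalence, occ, min_prev, min_occ):
    """A pattern is kept only if it passes both thresholds."""
    return prevalence >= min_prev and occ >= min_occ
```

For a pattern `("A", "B")` with row instances `[("A1", "B1"), ("A2", "B1")]`, the pattern would be reported only when both scores reach their thresholds, which is the combined prevalence and completeness test the abstract describes.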

Key words: spatial data mining, dominant spatial co-location patterns (DSCPs), occupancy metrics, prevalence metrics, spatial association rules
