Abstract:
Traditional spatial co-location pattern mining aims to discover subsets of spatial features whose instances are prevalently located together in geographic neighborhoods. Most previous studies use the prevalence of a pattern as the interestingness measure. However, users may be interested not only in the prevalence of a feature set but also in its completeness, namely the portion of co-location instances that a pattern occupies in their neighborhoods. By combining the prevalence and completeness of co-location patterns, we can provide users with a set of higher-quality co-location patterns, called dominant spatial co-location patterns (DSCPs). In this paper, we first introduce an occupancy measure into the spatial co-location pattern mining task to measure the completeness of co-location patterns. Second, we formulate the DSCP mining problem by considering both completeness and prevalence. Third, we present a basic algorithm for discovering DSCPs and, to reduce its high computational cost, a series of pruning strategies that improve its efficiency. Finally, we conduct experiments on both synthetic and real-world data sets to evaluate the efficiency and effectiveness of the proposed algorithms. The running times on synthetic data sets show that the pruning strategies are efficient, and the mining results in two real-world applications demonstrate that DSCPs are reasonable and acceptable.