Abstract:
The popularity of mobile positioning terminals makes users’ locations be easily accessible, which contributes huge amount of trajectory data. Universal companion pattern mining aims at discovering those highly overlapping behavior paths between moving objects in spatio-temporal dimensions, and it is very valuable and challenging to provide effective and efficient pattern mining methods on large-scale trajectories. Obviously, the mining strategy on a centralized environment is incompetent for the consideration of scalability caused by huge and growing trajectory data. Existing distributed mining frameworks are weak in both providing effective input for efficient pattern mining and the processing ability on a large number of loose connections in massive trajectories, which should be covered to improve mining performance. In this study, we propose a distributed two-stage mining framework, DMFUCP, which embeds optimization on data preprocessing and loose connection analysis to provide more efficient and effective universal companion pattern mining. In the data preprocessing stage of DMFUCP, we design both a density clustering algorithm DBSCANCD and a clustering balance algorithm TCB to input high-quality trajectory data with less noisy for mining tasks. In the mining stage of DMFUCP, we propose both a G pruning repartition algorithm GSPR and a segmented enumeration algorithm SAE. GSPR introduces a parameter G to segment long trajectories and then repartitions all segments to improve the processing effectiveness on loose connections. SAE guarantees the mining performance through multithreading and forward closure. Compared with those existing companion pattern mining frameworks on real datasets, DMFUCP reduces the time required to mine each set of universal companion pattern by 20% to 40% while providing better universal companion pattern discovery capabilities.