• 中国精品科技期刊
  • CCF推荐A类中文期刊
  • 计算领域高质量科技期刊T1类
Advanced Search
He Honghui, Wang Lizhen, and Zhou Lihua. pgi-distance: An Efficient Method Supporting Parallel KNN-join Process[J]. Journal of Computer Research and Development, 2007, 44(10): 1774-1781.
Citation: He Honghui, Wang Lizhen, and Zhou Lihua. pgi-distance: An Efficient Method Supporting Parallel KNN-join Process[J]. Journal of Computer Research and Development, 2007, 44(10): 1774-1781.

pgi-distance: An Efficient Method Supporting Parallel KNN-join Process

More Information
  • Published Date: October 14, 2007
  • The KNN-join has extensive applications where high-dimensional data is involved, because it is powerful and can provide more meaningful results than range-join. With its set-a-time nature, the KNN-join is able to facilitate many data mining tasks such as classification, outlier detection and clustering. MuX and Goreder are two algorithms specially designed for KNN-join. Since the MuX makes use of R-tree to reduce CPU cost, it suffers like all of the algorithms based on R-tree. In order to optimize both I/O and CPU cost, the Goreder applies a grid on data space. With the difference of data distribution, however, it is difficult to determine the cell's appropriate granularity. In the consequence, the Goreder tends to perform poorly. In this paper, a novel parallel method—pgi-distance (parallel grid index-distance)—is proposed, which combines the advantages of MuX and Goreder. On the one hand, the pgi-distance utilizes a two-tier structure to optimize I/O and CPU time separately. On the other hand, the index based on distance makes the pgi-distance adapted well to data dimensionality and data distribution. Especially the index in the pgi-distance is B\++-tree, which is supported by all commercial DBMS. So, this method becomes more applicable than that based on R-tree. Extensive experiments on both synthetic datasets and real datasets are conducted, and the results illustrate that pgi-distance is efficient.
  • Related Articles

    [1]Zhang Wei, Han Linyu, Zhang Dianlei, Ren Pengjie, Ma Jun, Chen Zhumin. GeoPMF: A Distance-Aware Tour Recommendation Model[J]. Journal of Computer Research and Development, 2017, 54(2): 405-414. DOI: 10.7544/issn1000-1239.2017.20150822
    [2]Xin Wei, Sun Huiping, Chen Zhong. Analysis and Design of Distance-Bounding Protocols for RFID[J]. Journal of Computer Research and Development, 2013, 50(11): 2358-2366.
    [3]Yang Liu, Yu Jian, Jing Liping. An Adaptive Large Margin Nearest Neighbor Classification Algorithm[J]. Journal of Computer Research and Development, 2013, 50(11): 2269-2277.
    [4]Zhao Xiaoming, Ye Xijian. A New Approach to Ridgelet Transform[J]. Journal of Computer Research and Development, 2008, 45(5): 915-922.
    [5]He Honghui, Wang Lizhen, and Zhou Lihua. pgi-distance: An Efficient Method Supporting Parallel KNN-join Process[J]. Journal of Computer Research and Development, 2007, 44(10): 1774-1781.
    [6]Liu Bing, Wang Wei, and Shi Baile. The Tight Estimation Distance Using Wavelet[J]. Journal of Computer Research and Development, 2006, 43(10): 1732-1737.
    [7]Ming Xing, Liu Yuanning, Zhu Xiaodong, Xu Tao. Iris Recognition Based on Wavelet Transform with Shift Invariance Preprocessing[J]. Journal of Computer Research and Development, 2006, 43(7): 1186-1193.
    [8]Song Chuanming, Wang Xianghai. A Shape Adaptive Integer Wavelet Coding Algorithm Based on New Quantization Scheme[J]. Journal of Computer Research and Development, 2006, 43(4): 695-701.
    [9]Ji Qijin and Dong Yongqiang. Self-Similar Traffic Synthesizing Using Gaussian Mixture Model in Wavelet Domain[J]. Journal of Computer Research and Development, 2006, 43(3): 389-394.
    [10]Shang Zhaowei, Zhang Mingxin, Zhao Ping, Shen Junyi. Different Complex Wavelet Transforms for Texture Retrieval and Similarity Measure[J]. Journal of Computer Research and Development, 2005, 42(10): 1746-1751.

Catalog

    Article views (928) PDF downloads (706) Cited by()

    /

    DownLoad:  Full-Size Img  PowerPoint
    Return
    Return