ISSN 1000-1239 CN 11-1777/TP

k-LDCHD—A Local Density Based k-Neighborhood Clustering Algorithm for High Dimensional Space

Ni Weiwei, Sun Zhihui, and Lu Jieping   

  1. (Department of Computer Science and Engineering, Southeast University, Nanjing 210096)
  • Online:2005-05-15

Abstract: Clustering is an important research topic in data mining. Clustering in high dimensional space is especially difficult because of the spatial distribution of the data, the large number of noise points, and the phenomenon that the difference between the distances to the nearest and farthest neighbors of a data point goes to zero. After analyzing the limitations of existing algorithms, this paper introduces definitions such as the k-neighborhood set and the k-radius, and proposes a local density based k-neighborhood clustering algorithm, k-PCLDHD, to address these problems. To improve efficiency, an optimized algorithm, k-LDCHD, is then proposed. It applies the notion of reference distance to preprocess the data set with double reference points, which avoids a large number of scans of the data set and greatly improves efficiency. Theoretical analysis and experimental results indicate that the algorithm can solve the problem of clustering in high dimensional space effectively and efficiently.

Key words: k-neighbor radius, double reference point, reference radius, high dimensional space
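
The abstract does not give the formal definitions, so the following is only an illustrative sketch and not the authors' k-LDCHD. It assumes that the k-radius of a point is the distance to its k-th nearest neighbor, and that "reference distances" are distances from every point to two fixed reference points, computed once in a preprocessing pass. Under those assumptions, the triangle inequality gives a lower bound on each pairwise distance, so many candidates can be skipped without computing the full high dimensional distance. All function names here are hypothetical.

```python
import heapq
import math


def k_radius_bruteforce(points, i, k):
    """Assumed definition: distance from points[i] to its k-th nearest other point."""
    dists = sorted(math.dist(points[i], q) for j, q in enumerate(points) if j != i)
    return dists[k - 1]


def precompute_reference_distances(points, ref_a, ref_b):
    """One preprocessing pass: distance of every point to two reference points."""
    return [(math.dist(p, ref_a), math.dist(p, ref_b)) for p in points]


def k_neighborhood_pruned(points, ref_d, i, k):
    """Return (k_radius, neighbor_indices, skipped) for points[i].

    For any reference point r, |d(p, r) - d(q, r)| <= d(p, q), so the larger
    of the two bounds is a valid lower bound on d(p, q). A candidate whose
    lower bound is not below the current k-th best distance cannot improve
    the result and is skipped.
    """
    best = []  # max-heap (negated distances) holding the k closest so far
    da, db = ref_d[i]
    skipped = 0
    for j in range(len(points)):
        if j == i:
            continue
        lower = max(abs(da - ref_d[j][0]), abs(db - ref_d[j][1]))
        if len(best) == k and lower >= -best[0][0]:
            skipped += 1
            continue
        d = math.dist(points[i], points[j])
        if len(best) < k:
            heapq.heappush(best, (-d, j))
        elif d < -best[0][0]:
            heapq.heapreplace(best, (-d, j))
    radius = -best[0][0]
    return radius, sorted(j for _, j in best), skipped


# Small deterministic data set in 6 dimensions (illustrative values only).
points = [tuple(((i * 7 + d * 13) % 11) / 10.0 for d in range(6)) for i in range(30)]
ref_d = precompute_reference_distances(points, points[0], points[1])
radius, neighbors, skipped = k_neighborhood_pruned(points, ref_d, 5, 3)
```

The pruned search returns the same k-radius as the brute-force version; only the number of full distance computations changes. How the reference points are chosen, and how local density is derived from the k-neighborhood, are left open here because the abstract does not specify them.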