Abstract:
Cluster analysis is an important technique in data mining. A key problem for any clustering algorithm is the choice of dissimilarity or similarity measure, because the clustering results depend directly on it. This is especially true for algorithms that operate on a similarity matrix, such as spectral clustering. Spectral clustering is a relatively recent clustering algorithm. Unlike traditional partitioning algorithms, it is not limited to spherical clusters and can successfully discover irregularly shaped clusters. The Gaussian kernel is the similarity measure most commonly used by spectral clustering methods in the literature. In this paper, building on the Gaussian kernel similarity measure and its modified variants, we propose a weighted self-adaptive similarity measure. The proposed measure not only describes similarity in data sets whose clusters have different densities, but also reduces the similarity between outliers (noise) and other data points. Experimental results show that the proposed measure describes the similarities between data points better across various types of data sets, leading to better clustering results.
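
For orientation, the two kinds of baseline mentioned above can be sketched as follows. This is a minimal illustration, not the weighted self-adaptive measure proposed in the paper, whose definition is not given in the abstract. The first function is the standard fixed-width Gaussian kernel, w_ij = exp(-d_ij^2 / (2 sigma^2)). The second is one well-known modification, local scaling, where w_ij = exp(-d_ij^2 / (sigma_i sigma_j)) and sigma_i is the distance from point i to its k-th nearest neighbour; treating it as one of the "modified" measures the paper builds on is an assumption. The function names and the zero diagonal are illustrative choices.

```python
import math


def squared_distances(points):
    """Return the n x n matrix of squared Euclidean distances."""
    return [
        [sum((a - b) ** 2 for a, b in zip(p, q)) for q in points]
        for p in points
    ]


def gaussian_similarity(points, sigma):
    """Fixed-width Gaussian kernel: w_ij = exp(-d_ij^2 / (2 * sigma^2)).

    The diagonal is set to 0, a common convention in spectral clustering.
    """
    d2 = squared_distances(points)
    n = len(points)
    return [
        [0.0 if i == j else math.exp(-d2[i][j] / (2.0 * sigma ** 2))
         for j in range(n)]
        for i in range(n)
    ]


def locally_scaled_similarity(points, k):
    """Locally scaled kernel: w_ij = exp(-d_ij^2 / (sigma_i * sigma_j)).

    sigma_i is the distance from point i to its k-th nearest neighbour
    (the point itself excluded). Assumes 1 <= k < n and that no point
    coincides with its k-th nearest neighbour, so every sigma_i > 0.
    """
    d2 = squared_distances(points)
    n = len(points)
    scales = []
    for i in range(n):
        neighbour_d2 = sorted(d2[i][j] for j in range(n) if j != i)
        scales.append(math.sqrt(neighbour_d2[k - 1]))
    return [
        [0.0 if i == j else math.exp(-d2[i][j] / (scales[i] * scales[j]))
         for j in range(n)]
        for i in range(n)
    ]
```

In the fixed-width version a single sigma has to suit every cluster, whereas in the locally scaled version each point's scale adapts to the density of its own neighbourhood.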