K-Modes Clustering Algorithm Based on a New Distance Measure

Liang Jiye; Bai Liang; Cao Fuyuan

Liang Jiye, Bai Liang, Cao Fuyuan. K-Modes Clustering Algorithm Based on a New Distance Measure[J]. Journal of Computer Research and Development, 2010, 47(10): 1749-1755.

Abstract

K-Modes, a leading partitional clustering technique, is one of the most computationally efficient clustering methods for categorical data. The traditional K-Modes algorithm uses the simple matching dissimilarity measure to compute the distance between two values of the same categorical attribute. This measure compares the two values directly and yields a distance of zero when they are identical and one otherwise, so it ignores any degree of similarity between distinct categorical values. This paper proposes a new distance measure based on rough set theory that overcomes this shortcoming of the simple matching dissimilarity measure and can be used with the traditional K-Modes clustering algorithm. When computing the distance between two values of the same categorical attribute, the new measure takes into account not only whether the values differ but also how well the other related categorical attributes discern them. The time complexity of the modified K-Modes algorithm is linear in the number of data objects, so the algorithm can be applied to large data sets. The performance of the K-Modes algorithm with the new distance measure is tested on real-world data sets. Comparisons with the K-Modes algorithm based on many different distance measures illustrate the effectiveness of the new distance measure.
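
For orientation, the traditional baseline described in the abstract can be sketched as follows. This is a minimal illustration of K-Modes with the simple matching dissimilarity only; it does not implement the rough-set-based measure, whose definition is not given in the abstract. The function names (`simple_matching`, `mode_of`, `k_modes`), the tuple representation of objects, and the choice to pass initial modes explicitly are all assumptions made for the example.

```python
from collections import Counter


def simple_matching(a, b):
    """Simple matching dissimilarity: count of attributes whose values differ.

    Each attribute contributes 0 if the two values are identical, 1 otherwise.
    """
    return sum(x != y for x, y in zip(a, b))


def mode_of(cluster):
    """Attribute-wise most frequent value of a non-empty list of objects."""
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*cluster))


def k_modes(data, initial_modes, max_iter=100):
    """Baseline K-Modes: assign each object to its nearest mode, then update modes.

    `data` is a list of equal-length tuples of categorical values and
    `initial_modes` is a list of such tuples (hypothetical interface).
    """
    modes = list(initial_modes)
    labels = None
    for _ in range(max_iter):
        # Assignment step: one distance evaluation per object per mode.
        new_labels = [
            min(range(len(modes)), key=lambda j: simple_matching(x, modes[j]))
            for x in data
        ]
        if new_labels == labels:
            break  # assignments are stable, so the modes will not change
        labels = new_labels
        # Update step: recompute each mode from its current members.
        for j in range(len(modes)):
            members = [x for x, label in zip(data, labels) if label == j]
            if members:  # keep the old mode if a cluster becomes empty
                modes[j] = mode_of(members)
    return labels, modes


data = [("a", "x"), ("a", "y"), ("b", "z"), ("b", "z")]
labels, modes = k_modes(data, initial_modes=[("a", "x"), ("b", "z")])
```

Each pass visits every object once per mode, which is where the linear dependence on the number of objects comes from. A different distance measure would replace `simple_matching` in the assignment step; how the paper's measure interacts with the mode update is not stated in the abstract.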
