Citation: Zhou Zhiping, Zhu Shuwei, Zhang Daowen. Multiobjective Clustering Algorithm with Fuzzy Centroids for Categorical Data[J]. Journal of Computer Research and Development, 2016, 53(11): 2594-2606.

## Multiobjective Clustering Algorithm with Fuzzy Centroids for Categorical Data

Abstract: Most traditional clustering algorithms for categorical data optimize a single criterion, which limits their performance. To address this, a novel multiobjective fuzzy clustering algorithm is proposed that brings both within-cluster and between-cluster information into the optimization by combining multiobjective optimization with fuzzy centroids clustering. Recently reported multiobjective algorithms are all based on K-modes; the proposed method instead builds on the more accurate fuzzy centroids algorithm. Unlike traditional genetic-algorithm-based hybrid clustering methods, chromosomes are encoded with fuzzy memberships, and two conflicting clustering objectives are optimized simultaneously to produce a set of optimal solutions. An early termination criterion judges whether the algorithm has reached a steady state and stops it, which avoids unnecessary computation. To further improve efficiency, fuzzy centroids are computed from a sampled subset of the data set as cluster representatives, and the membership matrix of all samples is then computed from these centroids to obtain the final clustering result. Experimental results on 10 data sets show that the proposed method outperforms the state-of-the-art multiobjective clustering algorithm in clustering accuracy and stability, and that its computational efficiency is also substantially improved.
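The sampling step mentioned in the abstract (centroids from a subset, then memberships for every sample) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the fuzzifier `m = 1.5`, and the dissimilarity (one minus the centroid weight of the sample's category, summed over attributes, as in the commonly cited fuzzy centroids formulation) are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import defaultdict

def fuzzy_centroids(subset, memberships, m=1.5):
    """Build one fuzzy centroid per cluster from a sampled subset.

    subset: list of tuples of categorical values.
    memberships: list of per-sample membership lists (len(subset) x k).
    Each centroid holds, per attribute, a dict mapping category -> weight,
    where weights are membership**m sums normalized to 1 (assumed form).
    """
    k = len(memberships[0])
    n_attrs = len(subset[0])
    centroids = []
    for c in range(k):
        per_attr = []
        for a in range(n_attrs):
            weights = defaultdict(float)
            for row, u in zip(subset, memberships):
                weights[row[a]] += u[c] ** m
            total = sum(weights.values())
            per_attr.append(
                {v: w / total for v, w in weights.items()} if total > 0 else {}
            )
        centroids.append(per_attr)
    return centroids

def distance(row, centroid):
    """Assumed dissimilarity: sum over attributes of (1 - weight of row's category)."""
    return sum(1.0 - centroid[a].get(v, 0.0) for a, v in enumerate(row))

def membership_row(row, centroids, m=1.5):
    """Fuzzy c-means style membership of one sample with respect to all centroids."""
    d = [distance(row, c) for c in centroids]
    zeros = [i for i, x in enumerate(d) if x == 0.0]
    if zeros:  # sample coincides with one or more centroids
        return [1.0 / len(zeros) if i in zeros else 0.0 for i in range(len(d))]
    inv = [x ** (-1.0 / (m - 1.0)) for x in d]
    s = sum(inv)
    return [x / s for x in inv]

def membership_matrix(data, centroids, m=1.5):
    """Memberships of the full data set, using centroids built from the subset."""
    return [membership_row(row, centroids, m) for row in data]

# Hypothetical toy data: centroids come from two sampled rows only.
data = [("a", "x"), ("a", "y"), ("b", "z"), ("b", "z")]
subset = [data[0], data[2]]
subset_memberships = [[0.9, 0.1], [0.1, 0.9]]
centroids = fuzzy_centroids(subset, subset_memberships)
U = membership_matrix(data, centroids)
labels = [row.index(max(row)) for row in U]
```

In this toy setup the memberships of the subset are given by hand; in the method described above they would come from the chromosomes produced by the multiobjective search. The point of the sketch is only that the full membership matrix needs one pass over the data once the centroids are fixed.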
