Shi Qianyu, Liang Jiye, Zhao Xingwang. A Clustering Ensemble Algorithm for Incomplete Mixed Data[J]. Journal of Computer Research and Development, 2016, 53(9): 1979-1989. DOI: 10.7544/issn1000-1239.2016.20150592

A Clustering Ensemble Algorithm for Incomplete Mixed Data

  • Published Date: August 31, 2016
  • Cluster ensembles have recently emerged as a powerful clustering analysis technique and have attracted considerable attention from researchers because of their good generalization ability. Existing work shows great promise, but most of these techniques produce final results only for complete data sets with numerical attributes. Real-life data sets, however, are often incomplete mixed data, described by both numerical and categorical attributes, and existing algorithms are not very effective on them. To overcome this deficiency, this paper proposes a new clustering ensemble algorithm that combines clustering results for incomplete data with mixed numerical and categorical attributes. First, the algorithm completes the incomplete mixed data using three different missing-value filling methods. Then, a set of clustering solutions is produced by running the K-Prototypes clustering algorithm multiple times on each of the three resulting complete data sets. Next, a similarity matrix is constructed from all the clustering solutions. Finally, the final clustering result is obtained by applying hierarchical clustering to the similarity matrix. The effectiveness of the proposed algorithm is demonstrated empirically on several real UCI data sets using three benchmark evaluation measures. The experimental results show that the proposed algorithm achieves higher clustering quality than several traditional clustering algorithms.
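
The last two steps of the abstract (building a similarity matrix from all clustering solutions, then clustering it hierarchically) can be illustrated with a minimal Python sketch. The abstract does not specify how the similarity matrix is defined or which linkage is used, so the sketch assumes a co-association matrix (the fraction of base clusterings that place two objects in the same cluster) and average linkage. The function names `co_association` and `average_linkage` and the toy label vectors are hypothetical; in the paper the base clusterings would come from K-Prototypes runs on the imputed data sets.

```python
from itertools import combinations


def co_association(partitions):
    """Similarity of objects i and j = fraction of base clusterings that
    put them in the same cluster (an assumed definition, not the paper's)."""
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / len(partitions) for c in row] for row in counts]


def average_linkage(sim, k):
    """Agglomerative clustering on a similarity matrix: repeatedly merge the
    two clusters with the highest mean pairwise similarity until k remain."""
    clusters = [[i] for i in range(len(sim))]
    while len(clusters) > k:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            total = sum(sim[i][j] for i in clusters[a] for j in clusters[b])
            score = total / (len(clusters[a]) * len(clusters[b]))
            if best is None or score > best[0]:
                best = (score, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # a < b, so index a stays valid
        del clusters[b]
    labels = [0] * len(sim)
    for cluster_id, members in enumerate(clusters):
        for i in members:
            labels[i] = cluster_id
    return labels


# Toy base clusterings of six objects (stand-ins for K-Prototypes outputs).
partitions = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 0, 0, 0, 0],  # label names are arbitrary; only co-membership matters
    [0, 0, 0, 1, 1, 1],
]
sim = co_association(partitions)
consensus = average_linkage(sim, k=2)
print(consensus)  # [0, 0, 0, 1, 1, 1]
```

Because the matrix only records co-membership, the base clusterings do not need aligned label names, which is why the second toy partition can use a different labeling without affecting the consensus.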
