A MD fuzzy k-modes Algorithm for Clustering Categorical Matrix-Object Data

Li Shunyong, Zhang Miaomiao, Cao Fuyuan

Citation: Li Shunyong, Zhang Miaomiao, Cao Fuyuan. A MD fuzzy k-modes Algorithm for Clustering Categorical Matrix-Object Data[J]. Journal of Computer Research and Development, 2019, 56(6): 1325-1337. DOI: 10.7544/issn1000-1239.2019.20180737. CSTR: 32373.14.issn1000-1239.2019.20180737

  • CLC number: TP391


Funds: This work was supported by the National Natural Science Foundation of China (61573229), the Shanxi Provincial Basic Research Foundation of China (201701D121004), the Shanxi Scholarship Council of China (2017-020), and the Shanxi Provincial Teaching Reform and Innovation Program in Higher Education (J2017002).
  • Abstract: Traditional clustering algorithms generally cluster single-valued attribute data. In many practical applications, however, each object is described by multiple feature vectors; for example, a customer may purchase several products in a single shopping trip. An object described by multiple feature vectors is called a matrix object, and a data set composed of matrix objects is called a matrix-object data set. Research on clustering algorithms for categorical matrix-object data is still relatively scarce, and many issues remain open. Building on the clustering process of the fuzzy k-modes algorithm, we propose a matrix-object data fuzzy k-modes (MD fuzzy k-modes) algorithm for clustering categorical matrix-object data. The algorithm draws on the concept of fuzzy sets to introduce a fuzzy factor β, redefines the dissimilarity measure between two categorical matrix objects, and provides a heuristic algorithm for updating the cluster centers. The effectiveness of the MD fuzzy k-modes algorithm is verified on five real-world data sets, and the relationship between the fuzzy factor β and the membership w is analyzed. In the era of big data, clustering multiple records with the MD fuzzy k-modes algorithm makes it easier to discover customers' spending preferences and thus to make more targeted recommendations.
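The abstract does not give the paper's formulas, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a matrix object is a list of categorical records, substitutes a simple frequency-based dissimilarity (a hypothetical stand-in for the redefined measure), and assumes β plays the role of the fuzziness exponent in the standard fuzzy k-modes membership update. All function names are illustrative.

```python
from collections import Counter


def attribute_dissimilarity(values_a, values_b):
    """Illustrative dissimilarity between two collections of categorical values.

    Half the L1 distance between relative category frequencies:
    0 for identical distributions, 1 for disjoint ones.
    Assumed stand-in, NOT the measure defined in the paper.
    """
    freq_a, freq_b = Counter(values_a), Counter(values_b)
    n_a, n_b = len(values_a), len(values_b)
    categories = set(freq_a) | set(freq_b)
    return 0.5 * sum(abs(freq_a[c] / n_a - freq_b[c] / n_b) for c in categories)


def matrix_object_dissimilarity(obj_a, obj_b):
    """Sum of attribute-wise dissimilarities between two matrix objects.

    Each matrix object is a non-empty list of records (tuples) that share
    the same attributes in the same order.
    """
    columns_a = zip(*obj_a)  # one tuple of values per attribute
    columns_b = zip(*obj_b)
    return sum(attribute_dissimilarity(a, b) for a, b in zip(columns_a, columns_b))


def fuzzy_memberships(obj, centers, beta=2.0):
    """Membership of obj in each cluster (standard fuzzy k-modes style update).

    beta > 1 is assumed to act as the fuzziness exponent; larger values
    give softer memberships.
    """
    distances = [matrix_object_dissimilarity(obj, c) for c in centers]
    # An object at distance 0 from a center belongs to it fully
    # (split evenly if several centers coincide with the object).
    zero = [i for i, d in enumerate(distances) if d == 0.0]
    if zero:
        return [1.0 / len(zero) if i in zero else 0.0 for i in range(len(centers))]
    exponent = 1.0 / (beta - 1.0)
    return [
        1.0 / sum((d_l / d_h) ** exponent for d_h in distances)
        for d_l in distances
    ]


# Usage: one customer with two purchase records, compared against two centers.
customer = [("milk", "am"), ("bread", "am")]
center_1 = [("milk", "am"), ("milk", "am")]
center_2 = [("tea", "pm")]
memberships = fuzzy_memberships(customer, [center_1, center_2], beta=2.0)
```

In this toy example the customer is closer to the first center, so it receives the larger membership there; the memberships across clusters sum to 1.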
Publication history
  • Published: 2019-05-31
