
A MD fuzzy k-modes Algorithm for Clustering Categorical Matrix-Object Data

Li Shunyong, Zhang Miaomiao, Cao Fuyuan

Citation: Li Shunyong, Zhang Miaomiao, Cao Fuyuan. A MD fuzzy k-modes Algorithm for Clustering Categorical Matrix-Object Data[J]. Journal of Computer Research and Development, 2019, 56(6): 1325-1337. DOI: 10.7544/issn1000-1239.2019.20180737
Citation: Li Shunyong, Zhang Miaomiao, Cao Fuyuan. A MD fuzzy k-modes Algorithm for Clustering Categorical Matrix-Object Data[J]. Journal of Computer Research and Development, 2019, 56(6): 1325-1337. CSTR: 32373.14.issn1000-1239.2019.20180737


Details
  • CLC number: TP391

A MD fuzzy k-modes Algorithm for Clustering Categorical Matrix-Object Data

Funds: This work was supported by the National Natural Science Foundation of China (61573229), the Shanxi Provincial Basic Research Foundation of China (201701D121004), the Shanxi Scholarship Council of China (2017-020), and the Shanxi Provincial Teaching Reform and Innovation Program in Higher Education (J2017002).
  • Abstract: Traditional clustering algorithms generally cluster single-valued attribute data. However, in many practical applications, each object is described by more than one feature vector. For example, a customer may purchase several products in a single shopping trip. An object described by multiple feature vectors is called a matrix object, and a data set composed of matrix objects is called a matrix-object data set. At present, research on clustering algorithms for categorical matrix-object data is relatively scarce, and many issues remain open. In this paper, we propose a matrix-object data fuzzy k-modes (MD fuzzy k-modes) algorithm that uses the fuzzy k-modes clustering process to cluster categorical matrix-object data. The algorithm introduces a fuzzy factor β based on the concept of fuzzy sets, redefines the dissimilarity measure between two categorical matrix objects, and provides a heuristic algorithm for updating the cluster centers. Finally, the effectiveness of the MD fuzzy k-modes algorithm is verified on five real-world data sets, and the relationship between the fuzzy factor β and the membership w is analyzed. In the era of big data, clustering multiple records with the MD fuzzy k-modes algorithm makes it easier to discover customers' spending preferences and thus to make more targeted recommendations.
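The abstract names three ingredients: a dissimilarity between matrix objects, fuzzy memberships w, and cluster centers. The Python sketch below illustrates only the general idea and is not the paper's method. The dissimilarity (a sum of per-attribute Jaccard distances) is an assumed stand-in, since the abstract does not give the redefined measure. The membership rule is the one from the standard fuzzy k-modes algorithm, with a hypothetical `fuzziness` exponent; how the paper's fuzzy factor β enters is not stated in the abstract and is not reproduced here. All function and type names are illustrative.

```python
from typing import List, Tuple

Record = Tuple[str, ...]      # one feature vector of categorical values
MatrixObject = List[Record]   # an object described by several records


def attribute_sets(obj: MatrixObject) -> List[frozenset]:
    """Collect, per attribute, the set of values seen across the object's records."""
    return [frozenset(column) for column in zip(*obj)]


def dissimilarity(a: MatrixObject, b: MatrixObject) -> float:
    """Assumed stand-in measure: sum of per-attribute Jaccard distances.

    This is NOT the measure defined in the paper; it only makes the sketch runnable.
    """
    total = 0.0
    for set_a, set_b in zip(attribute_sets(a), attribute_sets(b)):
        total += 1.0 - len(set_a & set_b) / len(set_a | set_b)
    return total


def memberships(obj: MatrixObject, centers: List[MatrixObject],
                fuzziness: float = 2.0) -> List[float]:
    """Membership of `obj` in each cluster, following standard fuzzy k-modes.

    `fuzziness` (> 1) is the usual fuzziness exponent, an assumption of this sketch.
    """
    dists = [dissimilarity(obj, center) for center in centers]
    # An object identical to a center belongs fully to that cluster.
    for index, dist in enumerate(dists):
        if dist == 0.0:
            return [1.0 if i == index else 0.0 for i in range(len(centers))]
    power = 1.0 / (fuzziness - 1.0)
    return [1.0 / sum((d_l / d_h) ** power for d_h in dists) for d_l in dists]


# A customer with one purchase record, compared against two candidate centers.
customer = [("milk", "A")]
centers = [[("milk", "B")], [("tea", "B")]]
w = memberships(customer, centers)
print(w)  # the memberships sum to 1; the closer center gets the larger share
```

In this sketch the memberships always sum to 1, and an object that coincides with a center is assigned to it with membership 1, which matches the behaviour of the standard fuzzy k-modes update.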
Publication history
  • Published: 2019-05-31
