ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development ›› 2022, Vol. 59 ›› Issue (4): 907-921. doi: 10.7544/issn1000-1239.20200897

• Artificial Intelligence •

Multi-View Clustering Based on Two-Level Weights

Du Guowang, Zhou Lihua, Wang Lizhen, Du Jingwei

  1. (School of Information Science & Engineering, Yunnan University, Kunming 650500) (dugking@mail.ynu.edu.cn)
  • Publication date: 2022-04-01
  • Online: 2022-04-01
  • Supported by: 
    This work was supported by the National Natural Science Foundation of China (62062066, 61762090, 61966036), a 2022 Key Project of the Yunnan Fundamental Research Program, the Project of the University Key Laboratory of Internet of Things Technology and Application of Yunnan Province, the National Social Science Foundation of China (18XZZ005), the Program for Innovation Research Team (in Science and Technology) in University of Yunnan Province (IRTSTYN), and the Scientific Research Fund Project of Yunnan Provincial Department of Education (2021Y026).

Abstract: In multi-view clustering, the high dimensionality and sparsity of the data mean that the features describing a sample within a view affect the clustering result to different degrees, and that the same sample contributes differently to the clustering in different views. Distinguishing hierarchically between the weights of different features within a view and the weights of the same sample across views is therefore an important factor in improving multi-view clustering performance. In this paper, we propose a multi-view clustering algorithm based on two-level weights, namely feature-level and sample-level weights, called MVC2W. The algorithm introduces a feature-level and a sample-level attention mechanism to learn the weights of different features within each view and the weights of each sample in different views. The two-level attention mechanism lets the algorithm pay more attention to important features and important samples during training and fuse information from different views more reasonably, thereby effectively alleviating the effects of high dimensionality and sparsity on clustering quality. In addition, MVC2W integrates representation learning and clustering into a unified framework in which the two are trained jointly and reinforce each other, further improving clustering performance. Experimental results on 5 datasets with different degrees of sparseness show that MVC2W outperforms 11 baseline algorithms, and that the improvement is more significant on datasets with a high degree of sparseness.

Key words: multi-view clustering, feature-level weights, sample-level weights, attention mechanism, sparseness
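
The two-level weighting idea described in the abstract can be illustrated with a minimal sketch. This is not the MVC2W implementation: the abstract does not specify the attention networks, the loss, or the training procedure, so the sketch assumes that attention scores are simply given (in the actual method they would be learned), that weights are obtained by a softmax, and that each view has already been embedded in a space of the same dimension. All function names (`softmax`, `reweight_features`, `fuse_views`) are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def reweight_features(x, feature_scores):
    """Feature level: scale each feature of one view by its softmax weight.

    Assumption: feature_scores are supplied by hand here; a real model
    would learn them with a feature-level attention module.
    """
    weights = softmax(feature_scores)
    return [w * v for w, v in zip(weights, x)]


def fuse_views(view_embeddings, view_scores):
    """Sample level: weighted sum of one sample's per-view embeddings.

    Assumption: all embeddings share the same dimension, and view_scores
    holds one (hand-set) attention score per view for this sample.
    """
    weights = softmax(view_scores)
    dim = len(view_embeddings[0])
    return [
        sum(w * emb[d] for w, emb in zip(weights, view_embeddings))
        for d in range(dim)
    ]


# Hypothetical toy sample observed in two views, each embedded in 2-D.
view1 = [1.0, 0.0]
view2 = [0.0, 1.0]

# Equal scores give each view the same weight, so the fusion is the mean.
fused_equal = fuse_views([view1, view2], [0.0, 0.0])

# A much higher score for view 1 pulls the fused vector toward view 1.
fused_skewed = fuse_views([view1, view2], [5.0, 0.0])

# Feature-level weights within a single view sum to 1.
scaled = reweight_features([2.0, 2.0, 2.0], [0.1, 2.0, -1.0])
```

In this sketch a sample whose representation in one view is uninformative (for example, very sparse) would receive a low score for that view and so contribute little from it to the fused representation, which is the intuition the abstract gives for sample-level weights.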

CLC Number: