

    Multi-View Clustering Based on Two-Level Weights


       

      Abstract: In clustering, the high dimensionality and sparsity of multi-view data mean that the features describing a sample within a view affect the clustering result to different degrees, and that the same sample contributes differently to clustering in different views. Hierarchically distinguishing the weights of different features within a view from the weights of the same sample across views is an important factor in improving multi-view clustering performance. We propose a multi-view clustering algorithm based on two-level weights (MVC2W), namely feature-level and sample-level weights. The algorithm introduces feature-level and sample-level attention mechanisms to learn the weight of each feature within each view and the weight of each sample in each view. The two-level attention mechanism lets the algorithm focus on important features and important samples during training and fuse information from different views more reasonably, thereby effectively mitigating the effects of high dimensionality and sparsity on clustering results. In addition, MVC2W integrates representation learning and clustering into a unified framework in which the two are trained jointly and reinforce each other, further improving clustering performance. Experiments on 5 datasets with different degrees of sparsity show that MVC2W outperforms all 11 baseline algorithms, and that the improvement is more pronounced on datasets with a high degree of sparsity.
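
The two-level weighting idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the MVC2W implementation: the abstract does not give the network architecture or loss, so the function name `two_level_fusion`, the linear projections, the `tanh` scoring, and all parameter names below are assumptions chosen for illustration. Feature-level weights are a softmax over the features of one view; sample-level weights are a softmax over views for each sample.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def two_level_fusion(views, feature_scores, projections, view_queries):
    """Hypothetical two-level weighted fusion (illustrative only).

    views          : list of (n, d_v) arrays, one per view
    feature_scores : list of (d_v,) arrays, assumed learnable feature scores
    projections    : list of (d_v, k) arrays mapping each view to a shared space
    view_queries   : list of (k,) arrays used to score each sample in each view
    """
    embeddings, logits, feature_weights = [], [], []
    for X, s, W, q in zip(views, feature_scores, projections, view_queries):
        w_feat = softmax(s, axis=0)          # feature-level weights, sum to 1
        Z = (X * w_feat) @ W                 # reweight features, then project: (n, k)
        embeddings.append(Z)
        logits.append(np.tanh(Z) @ q)        # one score per sample for this view: (n,)
        feature_weights.append(w_feat)
    Z_all = np.stack(embeddings, axis=1)     # (n, V, k)
    w_sample = softmax(np.stack(logits, axis=1), axis=1)   # (n, V), rows sum to 1
    fused = (w_sample[:, :, None] * Z_all).sum(axis=1)     # (n, k)
    return fused, w_sample, feature_weights

# Toy usage with random data: 6 samples, two views with 4 and 3 features.
rng = np.random.default_rng(0)
views = [rng.random((6, 4)), rng.random((6, 3))]
feature_scores = [rng.normal(size=4), rng.normal(size=3)]
projections = [rng.normal(size=(4, 2)), rng.normal(size=(3, 2))]
view_queries = [rng.normal(size=2), rng.normal(size=2)]
fused, w_sample, w_feat = two_level_fusion(views, feature_scores, projections, view_queries)
```

In a trained model the scores and projections would be learned jointly with the clustering objective, as the abstract describes; here they are random, so only the shapes and the normalization of the weights are meaningful.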

       
