A Multi-View Unified Representation Learning Network for Subspace Clustering

林毓秀; 刘慧; 于晓; 张彩明

doi:10.7544/issn1000-1239.202440431

A Multi-View Unified Representation Learning Network for Subspace Clustering

Abstract: Multi-view subspace clustering aims to exploit the rich information across multiple views to guide the clustering of high-dimensional data. The key challenge lies in effectively learning a unified multi-view representation and a subspace representation. Recently, deep clustering methods have achieved strong performance thanks to the powerful representation capability of neural networks. However, because multi-view data are inherently multi-source and heterogeneous, most existing methods encode each view independently with its own unimodal encoder, which increases the number of model parameters and limits the model's generalization capability. In addition, low-rank subspace representations have been shown to improve clustering performance, but traditional nuclear norm regularization ignores the differences in the information carried by different singular values and is therefore a biased estimate of the matrix rank. To address these two problems, we propose a multi-view unified representation learning network (MURLN) for subspace clustering. First, MURLN builds its encoder on the Transformer and, by sharing parameters, projects heterogeneous views into a low-dimensional feature space under the same mapping rule. Second, since each sample may behave differently in different views, a weighted fusion strategy over intra-view samples is adopted to learn the unified multi-view representation. Finally, the weighted Schatten p-norm is introduced to impose a low-rank constraint on the subspace representation matrix. Extensive experiments on seven multi-view datasets verify the effectiveness and superiority of the proposed method.
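To make the low-rank regularizer concrete, the following is a minimal sketch of the weighted Schatten p-norm penalty in its standard form, sum_i w_i * sigma_i(Z)^p, where sigma_i(Z) are the singular values of the subspace representation matrix Z. It is not the authors' implementation: the function names, the example matrix, and the particular weights are assumptions chosen for illustration, and the abstract does not state how MURLN sets the weights or p. With all weights equal to 1 and p = 1, the penalty reduces to the nuclear norm, which treats every singular value equally.

```python
import numpy as np


def weighted_schatten_p(Z, weights, p):
    """Weighted Schatten p-norm penalty: sum_i w_i * sigma_i(Z)**p.

    Hypothetical helper for illustration only; `weights` must hold one
    non-negative weight per singular value of Z (in descending order).
    """
    # Singular values are returned in descending order.
    sigma = np.linalg.svd(Z, compute_uv=False)
    weights = np.asarray(weights, dtype=float)
    if weights.shape != sigma.shape:
        raise ValueError("need exactly one weight per singular value")
    return float(np.sum(weights * sigma ** p))


def nuclear_norm(Z):
    """Nuclear norm: the unweighted sum of singular values."""
    return float(np.linalg.svd(Z, compute_uv=False).sum())


# Toy matrix with singular values 3, 2, 1 (assumed example data).
Z = np.diag([3.0, 2.0, 1.0])

# Uniform weights with p = 1 recover the nuclear norm.
uniform = weighted_schatten_p(Z, np.ones(3), p=1.0)

# Assumed weighting: smaller weights on larger singular values, so the
# dominant structure is penalized less than the small, noise-like ones.
reweighted = weighted_schatten_p(Z, [0.5, 1.0, 2.0], p=0.5)
```

In this sketch the ascending weights shrink large singular values less than small ones, which is the usual motivation for replacing the nuclear norm with a weighted variant.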
