Abstract:
Multi-view clustering, which learns consensus representations in an unsupervised manner to partition data samples into distinct categories, has gained widespread attention. In recent years, contrastive learning has shown significant potential for multi-view clustering because of its strong capability for cross-view feature alignment. However, existing contrastive multi-view clustering methods suffer from false negatives and false positives when partitioning samples into positive and negative pairs, which limits further improvements in clustering performance. To address these issues, we propose Multi-layer Contrastive Multi-view Clustering via Spectral Embedding Similarity (MCSES). By introducing a multi-layer contrastive learning mechanism grounded in spectral embedding similarity, MCSES mitigates the performance bottleneck caused by inaccurate sample partitioning. Specifically, a Laplacian matrix is first constructed from the latent representations of each view, and its eigenvectors are used to generate a spectral embedding similarity matrix for that view. Positive and negative sample pairs are then partitioned according to the global prior information provided by these spectral embedding similarity matrices, which reduces the interference of partitioning errors with latent representation learning. Next, a cross-view contrastive loss is designed to strengthen inter-view consistency and align shared semantics. Finally, the latent representations of all views are adaptively fused through learnable weights to obtain a global multi-view representation, on which a global contrastive loss is defined to emphasize multi-view consensus information. Extensive experiments on public multi-view datasets validate the effectiveness of the proposed MCSES method.
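The steps described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The Gaussian-kernel affinity, the symmetric normalized Laplacian, the cosine similarity between spectral-embedding rows, the fixed threshold for positive pairs, the form of the contrastive loss, and the softmax fusion weights are all assumptions chosen for illustration, as are all function and parameter names.

```python
import numpy as np


def spectral_embedding_similarity(z, n_components=2, sigma=1.0):
    """Cosine similarity between rows of a spectral embedding of z (n x d)."""
    # Gaussian-kernel affinity between latent representations (assumed kernel).
    sq = np.sum(z ** 2, axis=1)
    dist2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * z @ z.T, 0.0)
    w = np.exp(-dist2 / (2.0 * sigma ** 2))
    np.fill_diagonal(w, 0.0)
    # Symmetric normalized Laplacian: L = I - D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(w.sum(axis=1), 1e-12))
    lap = np.eye(len(z)) - d_inv_sqrt[:, None] * w * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order; keep the smallest ones.
    _, vecs = np.linalg.eigh(lap)
    emb = vecs[:, :n_components]
    emb = emb / np.maximum(np.linalg.norm(emb, axis=1, keepdims=True), 1e-12)
    return emb @ emb.T


def positive_mask(sim, threshold=0.9):
    """Mark pairs whose spectral similarity reaches a (hypothetical) threshold."""
    mask = sim >= threshold
    np.fill_diagonal(mask, True)  # a sample is always its own positive
    return mask


def masked_contrastive_loss(za, zb, pos_mask, temperature=0.5):
    """Cross-view contrastive loss with positives taken from pos_mask."""
    za = za / np.maximum(np.linalg.norm(za, axis=1, keepdims=True), 1e-12)
    zb = zb / np.maximum(np.linalg.norm(zb, axis=1, keepdims=True), 1e-12)
    logits = za @ zb.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    pos = pos_mask.astype(float)
    # Average log-probability of the positives for each anchor, then negate.
    return float(-np.mean((log_prob * pos).sum(axis=1) / pos.sum(axis=1)))


def fuse_views(views, weight_logits):
    """Weighted sum of views; weights are a softmax over the given logits."""
    w = np.exp(weight_logits - np.max(weight_logits))
    w = w / w.sum()
    return sum(wi * v for wi, v in zip(w, views))


# Toy data: two well-separated clusters observed in two noisy views.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 5)
centers = np.array([[5.0, 0.0, 0.0], [0.0, 5.0, 0.0]])
view_a = centers[labels] + 0.05 * rng.standard_normal((10, 3))
view_b = centers[labels] + 0.05 * rng.standard_normal((10, 3))

mask = positive_mask(spectral_embedding_similarity(view_a))
cross_view_loss = masked_contrastive_loss(view_a, view_b, mask)
fused = fuse_views([view_a, view_b], np.zeros(2))  # equal weights here
global_loss = masked_contrastive_loss(fused, view_a, mask)
```

In an actual training loop the fusion logits and the encoders producing the latent representations would be learned by gradient descent, so an automatic-differentiation framework would replace the plain NumPy arrays used here.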