Sample-Weighted Multi-View Clustering Algorithm

洪敏; 贾彩燕; 李亚芳; 于剑

doi:10.7544/issn1000-1239.2019.20190150

Sample-Weighted Multi-View Clustering

Abstract

In the era of big data, our ability to collect, store, transmit, and manage data keeps improving, and nearly every industry has accumulated large data resources that are often multi-source and heterogeneous. How to cluster such multi-source data effectively (a task known as multi-view clustering) has become one of the focal points of current machine learning research. Existing multi-view clustering algorithms mainly consider, from a “global” perspective, how different views and features contribute to the cluster structure, and they ignore the “local” differences in information among individual samples. This paper therefore proposes a new sample-weighted multi-view clustering algorithm (SWMVC). The algorithm weights the different views of each sample and uses the alternating direction method of multipliers (ADMM) to learn the sample weights adaptively. It can thus learn the “local” differences in view weights across sample points and, from these learned local differences, also reflect the “global” differences in how much each view contributes to the cluster structure, which makes it flexible. Experiments on multiple datasets show that SWMVC achieves good clustering performance on heterogeneous-view data.
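The idea of giving every sample its own weight for each view can be illustrated with a small sketch. The abstract does not state the SWMVC objective, so everything below is an assumption made for illustration: the function name `sample_weighted_mv_kmeans`, the k-means-style objective sum_i sum_v w_iv^gamma * ||x_i^v - c_v(z_i)||^2 with the weights of each sample summing to 1, and the exponent `gamma`. The weights are updated with a simple closed-form rule rather than ADMM, so this is not the paper's optimization procedure.

```python
import numpy as np

def sample_weighted_mv_kmeans(views, k, gamma=2.0, n_iter=20, seed=0):
    """Toy sample-weighted multi-view k-means (illustrative only, not SWMVC).

    views: list of arrays, each of shape (n_samples, d_v).
    Returns (labels, weights); weights[i, v] is the weight of view v for sample i.
    """
    rng = np.random.default_rng(seed)
    n, n_views = views[0].shape[0], len(views)
    weights = np.full((n, n_views), 1.0 / n_views)  # start from uniform weights
    labels = rng.integers(0, k, size=n)              # random initial assignment
    eps = 1e-12                                      # avoids division by zero

    for _ in range(n_iter):
        wg = weights ** gamma
        # dist[i, v, c]: squared distance of sample i to centroid c in view v
        dist = np.empty((n, n_views, k))
        for v, X in enumerate(views):
            for c in range(k):
                mask = labels == c
                if mask.any():
                    # Centroid = mean of the members, weighted by w_iv^gamma
                    center = np.average(X[mask], axis=0, weights=wg[mask, v])
                else:
                    # Re-seed an empty cluster with a random sample
                    center = X[rng.integers(n)]
                dist[:, v, c] = ((X - center) ** 2).sum(axis=1)

        # Assign each sample to the cluster with the smallest weighted distance
        labels = (wg[:, :, None] * dist).sum(axis=1).argmin(axis=1)

        # Closed-form weight update (assumes gamma > 1): views in which the
        # sample lies close to its own centroid receive a larger weight
        d_own = dist[np.arange(n), :, labels] + eps   # shape (n, n_views)
        inv = d_own ** (-1.0 / (gamma - 1.0))
        weights = inv / inv.sum(axis=1, keepdims=True)

    return labels, weights
```

Each row of the returned `weights` is one sample's "local" view weighting. Averaging the rows gives a rough "global" view importance, which mirrors the local-to-global reading described in the abstract.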
