
Semi-Supervised Classification Based on Transformed Learning

Kang Zhao, Liu Liang, Han Meng

Citation: Kang Zhao, Liu Liang, Han Meng. Semi-Supervised Classification Based on Transformed Learning[J]. Journal of Computer Research and Development, 2023, 60(1): 103-111. DOI: 10.7544/issn1000-1239.202110811. CSTR: 32373.14.issn1000-1239.202110811


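    About the authors:

    Kang Zhao: born in 1983. PhD, associate professor, master's supervisor. Member of CCF. Main research interests include unsupervised machine learning, deep representation learning, graph signal processing, social media analysis, and knowledge graphs.

    Liu Liang: born in 1997. Master's degree. Main research interests include deep learning and graph signal processing.

    Han Meng: born in 1963. PhD, senior engineer, master's supervisor. Main research interests include data mining and machine learning.

    Corresponding author:

    Han Meng (hmuestc@126.com)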

  • Chinese Library Classification (CLC) number: TP391

Semi-Supervised Classification Based on Transformed Learning

Funds: This work was supported by the National Natural Science Foundation of China (61806045).
  • Abstract:

    In recent years, graph-based semi-supervised classification has been one of the active research topics in machine learning and pattern recognition. These methods typically construct a graph to uncover information hidden in the data and use the structure of the graph to classify unlabeled samples. The performance of semi-supervised classification therefore depends heavily on the quality of the graph, in particular on the graph construction method and the quality of the data. To address this problem, we propose semi-supervised classification based on transformed learning (TLSSC). Unlike most existing semi-supervised classification algorithms, which learn the graph from raw features, TLSSC seeks a transformed representation (transform coefficients) and performs graph learning and label propagation on that representation. Specifically, it builds a unified joint optimization framework with three parts: 1) transformed learning maps the raw data into the transformed space; 2) a graph is learned in the transformed space by self-expression; 3) labels are propagated on the graph. The three steps are updated alternately and reinforce one another, which avoids the sub-optimal solution caused by a low-quality graph. Extensive experiments on face and object data sets show that the proposed TLSSC algorithm outperforms other state-of-the-art algorithms in most cases.
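    The three ingredients named in the abstract (a transformed representation, a self-expression graph, and label propagation) can be illustrated with a minimal sequential sketch. This is not the authors' joint optimization, whose objective is not given on this page. It assumes a fixed linear transform `T` supplied by the caller (identity by default) in place of a learned one, a ridge-regularized least-squares self-expression model, and the harmonic-function solution for propagation. All function names and the parameter `lam` are hypothetical.

```python
import numpy as np


def self_expression_graph(Z, lam=1.0):
    """Build an affinity matrix from ridge-regularized self-expression.

    Solves min_C ||Z - Z C||_F^2 + lam * ||C||_F^2 in closed form,
    where Z has shape (d, n) with one sample per column.
    """
    n = Z.shape[1]
    gram = Z.T @ Z
    # Closed form: (Z^T Z + lam I) C = Z^T Z
    C = np.linalg.solve(gram + lam * np.eye(n), gram)
    np.fill_diagonal(C, 0.0)  # drop self-loops
    # Symmetrize the coefficient magnitudes to get a non-negative affinity.
    return 0.5 * (np.abs(C) + np.abs(C).T)


def harmonic_label_propagation(W, y_labeled, n_classes):
    """Propagate labels with the harmonic-function solution.

    Assumes the first len(y_labeled) nodes of W are the labeled ones.
    Returns predicted class indices for the remaining (unlabeled) nodes.
    """
    y_labeled = np.asarray(y_labeled, dtype=int)
    l = len(y_labeled)
    L = np.diag(W.sum(axis=1)) - W        # unnormalized graph Laplacian
    Y_l = np.eye(n_classes)[y_labeled]    # one-hot labels, shape (l, c)
    L_uu = L[l:, l:]
    L_ul = L[l:, :l]
    # Small ridge term keeps the solve stable if some node is isolated.
    F_u = np.linalg.solve(L_uu + 1e-8 * np.eye(L_uu.shape[0]), -L_ul @ Y_l)
    return F_u.argmax(axis=1)


def tlssc_like_pipeline(X, y_labeled, n_classes, T=None, lam=1.0):
    """Sequential sketch: transform -> graph -> propagation (no alternation)."""
    if T is None:
        T = np.eye(X.shape[0])            # assumption: identity transform
    Z = T @ X                             # step 1: map data to transformed space
    W = self_expression_graph(Z, lam)     # step 2: learn graph by self-expression
    return harmonic_label_propagation(W, y_labeled, n_classes)  # step 3
```

    In the method described by the abstract the three steps are updated alternately, so the graph and the representation can correct each other; the sketch runs each step once and is meant only to make the data flow concrete.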

  • Figure 1. Sample images of three datasets

    Figure 2. Distribution of the adjacency matrix $\boldsymbol{C}$ on three datasets

    Figure 3. Influence of α and β on Acc on the JAFFE dataset

    Figure 4. Influence of λ and μ on Acc on the JAFFE dataset

    Table 1   Classification accuracy (Acc, %) of each algorithm on benchmark data sets

    Dataset   Labeled (%)   GFHF          LGC           SCAN          S2LRR         TLSSC
    YALE      10            38.00±11.91   47.33±13.96   45.07±1.30    28.77±9.59    50.00±12.01
              30            54.13±9.47    63.08±2.20    60.92±4.03    42.58±5.93    72.88±2.72
              50            60.28±5.16    69.56±5.42    68.94±4.57    51.22±6.78    80.11±3.73
    JAFFE     10            92.85±7.76    96.68±2.76    96.92±1.68    94.38±6.23    83.83±12.73
              30            98.50±1.01    98.86±1.14    98.20±1.22    98.82±1.05    98.98±1.29
              50            98.94±1.11    99.29±0.94    99.25±5.79    99.47±0.59    99.51±0.67
    COIL20    10            87.74±2.26    85.43±1.40    90.09±1.15    81.10±1.69    87.65±2.0
              30            95.48±1.40    87.82±1.03    95.27±0.93    87.69±1.39    96.56±2.04
              50            98.62±0.71    88.47±0.45    97.53±0.82    90.92±1.19    97.68±1.69
    COIL100   10            51.27±0.73    69.41±1.51    78.95±2.23    44.30±1.56    80.52±2.04
              30            64.85±0.49    80.16±1.32    88.39±1.38    58.63±1.44    90.84±1.26
              50            72.10±0.70    84.93±1.26    91.98±1.17    62.84±2.49    93.57±1.03
    YALEB     10            11.19±1.67    23.76±1.53    55.15±2.49    64.14±3.47    66.83±4.35
              30            29.45±2.20    39.69±2.82    69.21±2.55    84.69±0.74    86.91±3.63
              50            44.63±1.83    48.74±2.06    73.66±1.80    89.84±0.73    88.59±1.47

    Note: In the original table the best results are set in bold; ± denotes the standard deviation.

Publication history
  • Received: 2021-08-12
  • Revised: 2022-04-23
  • Published online: 2023-02-10
  • Issue date: 2022-12-31
