ISSN 1000-1239 CN 11-1777/TP

计算机研究与发展 (Journal of Computer Research and Development), 2017, Vol. 54, Issue (3): 537-548. doi: 10.7544/issn1000-1239.2017.20150963

• Artificial Intelligence •

Multi-Task Feature Learning Algorithm Based on Preserving Classification Information

Wang Jun, Wei Jinmao, Zhang Lu

  1. (College of Computer and Control Engineering, Nankai University, Tianjin 300071) (College of Software, Nankai University, Tianjin 300071) (weijm@nankai.edu.cn)
  • Online: 2017-03-01
  • Supported by: National Natural Science Foundation of China (61772288, 61070089); Natural Science Foundation of Tianjin (14JCYBJC15700)


Abstract: In pattern recognition, feature selection is an effective dimension-reduction technique. Feature evaluation criteria are used to assess the importance of features, but currently available criteria have several shortcomings. First, they focus solely on class separability and ignore class correlation information during selection. Second, they are hardly able to reduce feature redundancy specific to classification. Third, they are mostly univariate measures and therefore cannot achieve global optimality for a feature subset. In this work, we introduce a novel feature evaluation criterion called CIP (classification information preserving). CIP is based on preserving classification information, and multi-task learning techniques are adopted to formulate and realize it. CIP is a feature subset selection method: it employs the Frobenius norm to minimize the difference in classification information between the selected feature subset and the original data, and an l2,1-norm constraint to control the number of selected features. The optimal solution of CIP is obtained within the framework of the proximal alternating direction method. Both theoretical analysis and experimental results demonstrate that the optimal feature subset selected by CIP maximally preserves the original class correlation information while effectively reducing feature redundancy for classification.
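The abstract's core ingredients — a Frobenius-norm data term, an l2,1-norm penalty whose row-sparsity performs feature selection, and a proximal solver — can be illustrated with a minimal sketch. This is an assumption-laden stand-in, not the paper's CIP method: it uses a plain least-squares term in place of the paper's classification-information term, plain proximal gradient descent instead of the proximal alternating direction method, and a made-up toy dataset. The key mechanism it does show is that the proximal operator of the l2,1 norm shrinks whole rows of the weight matrix W to zero, so discarded features drop out across all tasks at once.

```python
import numpy as np

def prox_l21(W, t):
    # Proximal operator of t * ||W||_{2,1}: row-wise soft-thresholding.
    # Rows whose l2 norm falls below t are set exactly to zero.
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - t / np.maximum(norms, 1e-12))
    return W * scale

def l21_feature_select(X, Y, lam=1.0, n_iter=500):
    """Minimize ||XW - Y||_F^2 + lam * ||W||_{2,1} by proximal gradient.

    Rows of W driven to (near) zero correspond to unselected features;
    each column of Y is one task, so selection is shared across tasks.
    """
    d = X.shape[1]
    W = np.zeros((d, Y.shape[1]))
    L = 2.0 * np.linalg.norm(X, 2) ** 2   # Lipschitz constant of the gradient
    step = 1.0 / L
    for _ in range(n_iter):
        grad = 2.0 * X.T @ (X @ W - Y)    # gradient of the Frobenius data term
        W = prox_l21(W - step * grad, step * lam)
    return W

# Toy usage: only features 0 and 1 carry information for the two tasks.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 5))
Y = np.column_stack([X[:, 0] + 0.1 * rng.standard_normal(100),
                     X[:, 1] + 0.1 * rng.standard_normal(100)])
W = l21_feature_select(X, Y, lam=1.0)
row_norms = np.linalg.norm(W, axis=1)   # large for informative features only
```

On the toy data, the row norms of W for the two informative features stay large while the noise features are shrunk toward zero, which is the joint (subset-level) selection behavior the abstract contrasts with univariate criteria.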

Key words: feature selection, multi-task learning, classification information preserving, feature redundancy, proximal alternating direction method
