

    Multi-Task Feature Learning Algorithm Based on Preserving Classification Information



      Abstract: In pattern recognition, feature selection is an effective technique for dimensionality reduction. Feature evaluation criteria are used to measure the importance of features during selection, but existing criteria suffer from several shortcomings. First, they concentrate on class separability while ignoring class correlation information. Second, they can hardly remove the redundancy among features that is specific to classification. Third, they are mostly univariate measures and therefore cannot achieve global optimality for a feature subset. This work introduces a novel feature evaluation criterion, classification information preserving (CIP), formulated and realized with multi-task learning. CIP is a feature subset selection method: it employs the Frobenius norm to minimize the difference between the classification information of the selected feature subset and that of the original data, and an l2,1-norm constraint to control the number of selected features. The optimal solution of CIP is obtained within the framework of the proximal alternating direction method. Both theoretical analysis and experimental results demonstrate that the optimal feature subset selected by CIP maximally preserves the original class correlation information and effectively reduces the classification redundancy among features.
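      The page does not reproduce the CIP objective or its proximal alternating direction solver. As a rough, hedged illustration of the l2,1-norm machinery the abstract describes, the sketch below solves a common multi-task surrogate, min_W 1/2 ||XW - Y||_F^2 + lambda ||W||_{2,1}, by plain proximal gradient descent; the surrogate objective, the function names, and all parameter values are assumptions made for illustration, not the authors' CIP formulation.

```python
import numpy as np

def prox_l21(W, tau):
    """Row-wise soft thresholding: the proximal operator of tau * ||W||_{2,1}.
    Rows with small l2 norm are zeroed, which deselects the corresponding
    feature across all tasks (classes) at once."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - tau / np.maximum(norms, 1e-12))
    return W * scale

def l21_feature_selection(X, Y, lam=1.0, n_iter=500):
    """Proximal-gradient sketch for min_W 0.5*||X W - Y||_F^2 + lam*||W||_{2,1}.
    This is a generic multi-task surrogate, not the paper's CIP objective.

    X : (n_samples, n_features) data matrix
    Y : (n_samples, n_classes) one-hot label matrix carrying the
        classification information to be preserved
    Returns W; its nonzero rows mark the selected features.
    """
    d, k = X.shape[1], Y.shape[1]
    W = np.zeros((d, k))
    # Lipschitz constant of the gradient of the smooth term: sigma_max(X)^2
    L = np.linalg.norm(X, 2) ** 2
    for _ in range(n_iter):
        grad = X.T @ (X @ W - Y)
        W = prox_l21(W - grad / L, lam / L)
    return W

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((100, 20))
    labels = (X[:, 0] + X[:, 3] > 0).astype(int)  # only features 0 and 3 drive the labels
    Y = np.eye(2)[labels]
    W = l21_feature_selection(X, Y, lam=5.0)
    selected = np.flatnonzero(np.linalg.norm(W, axis=1) > 1e-6)
    print("selected feature indices:", selected)
```

      The row-wise l2,1 proximal step is what couples the tasks: a feature is kept or dropped for all classes simultaneously, which is the mechanism the abstract invokes for constraining the number of selected features.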

