ISSN 1000-1239 CN 11-1777/TP

• Artificial Intelligence •

### Feature Selection Based on the Measurement of Correlation Information Entropy

Dong Hongbin, Teng Xuyang, Yang Xue

1. (College of Computer Science and Technology, Harbin Engineering University, Harbin 150001) (donghongbin@hrbeu.edu.cn)
• Online: 2016-08-01
• Supported by: National Natural Science Foundation of China (61472095, 61502116); the Open Fund of the Key Laboratory of Intelligent Education and Information Engineering, Heilongjiang Provincial Department of Education

Abstract: Feature selection aims to select a smaller subset from the original feature set such that the subset provides approximate or better performance in data mining and machine learning. Because it does not transform the physical characteristics of the features, a smaller feature set also yields a more interpretable model. Traditional information-theoretic methods tend to measure feature relevance and redundancy separately, ignoring the combined effect of the feature subset as a whole. In this paper, correlation information entropy, a technique from data fusion, is applied to feature selection to measure the degree of independence and redundancy among features. A correlation matrix is constructed from the mutual information between features and their class labels and between pairs of features. The eigenvalues of this matrix are then computed, capturing the multivariate correlation among the features in the subset. On this basis, a feature ranking algorithm and a parameterized adaptive feature subset selection algorithm are proposed. Experimental results demonstrate the effectiveness and efficiency of the proposed algorithms on classification tasks.
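The abstract does not give the exact formula, but correlation information entropy in the data-fusion literature is commonly defined from the eigenvalues of a correlation matrix: for an n×n matrix R with eigenvalues λ_i, H_R = -Σ_i (λ_i/n) log_n (λ_i/n), which is near 1 when the variables are mutually independent and near 0 when they are fully redundant. A minimal sketch under that assumed definition (function and variable names are illustrative, not from the paper):

```python
import numpy as np

def correlation_information_entropy(R):
    """Correlation information entropy of a symmetric correlation matrix R.

    Assumed standard data-fusion definition:
        H_R = -sum_i (lam_i / n) * log_n (lam_i / n)
    where lam_i are the eigenvalues of the n x n matrix R.
    Close to 1 for independent features, close to 0 for redundant ones.
    """
    n = R.shape[0]
    lam = np.linalg.eigvalsh(R)      # eigenvalues of the symmetric matrix
    p = lam / n
    p = p[p > 1e-12]                 # drop (numerically) zero terms: 0*log 0 = 0
    return float(-np.sum(p * (np.log(p) / np.log(n))))

# Identity matrix: fully independent features -> entropy 1
H_indep = correlation_information_entropy(np.eye(4))
# All-ones matrix: fully redundant features -> entropy 0
H_redund = correlation_information_entropy(np.ones((4, 4)))
```

In the method described above, the entries of R would come from mutual-information estimates between feature pairs and between features and the class label, rather than from linear correlation coefficients.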