ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development, 2019, Vol. 56, Issue (12): 2578-2588. doi: 10.7544/issn1000-1239.2019.20180541


EasiFFRA: A Fast Feature Reduction Algorithm Based on Neighborhood Rough Set

Wang Nian1,2, Peng Zhenghong1,2, Cui Li1   

  1. Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190; 2. University of Chinese Academy of Sciences, Beijing 100190
  • Online:2019-12-01

Abstract: Extracting effective features from a high-dimensional, heterogeneous feature set is the basis for prediction and classification in Internet of Things (IoT) applications. Such systems usually deploy multiple sensors and extract a large number of features to make full use of environmental information. High-dimensional feature sets often contain redundant and irrelevant features, which reduce both the speed of the system and the performance of classification, so these features need to be identified and removed. Neighborhood rough set (NRS) is a popular dimensionality reduction method that removes irrelevant and redundant features while preserving the separability of the dataset. However, NRS has not been widely applied because of its high computational cost. This paper proposes an Easi fast feature reduction algorithm (EasiFFRA) based on the symmetry of neighborhood relations and a decision attribute filtering mechanism. EasiFFRA reduces redundant computation by preferentially traversing the buckets in which neighbor samples are relatively concentrated, and it stores in a hash table the samples that cannot belong to the positive region under the current feature subset. Furthermore, it significantly reduces the number of distance calculations by filtering out samples that have the same label as the current sample. The validity of the algorithm is verified on a real-world dataset and 12 open datasets. The results show that, compared with FHARA, EasiFFRA reduces computing time by 75.45%. EasiFFRA reduces the effect of irrelevant and redundant features on classification and prediction results and improves the real-time performance of NRS-based feature reduction, which gives it important application value.
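To make the two mechanisms named in the abstract more concrete, the following is a minimal sketch of a positive-region computation, not the authors' EasiFFRA implementation. It assumes the standard NRS definition (a sample belongs to the positive region if every sample within distance delta shares its label), uses Euclidean distance, and omits the bucket traversal entirely. The function name `positive_region` and its parameters are hypothetical.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def positive_region(samples, labels, delta):
    """Return indices of samples whose delta-neighborhood contains one label only.

    Illustrative sketch only: the names and the brute-force scan are assumptions,
    and the hash-bucket traversal described in the abstract is not modeled.
    """
    # Hash-based set of samples already known to lie outside the positive region.
    excluded = set()
    n = len(samples)
    for i in range(n):
        if i in excluded:
            continue  # already ruled out through an earlier symmetric match
        for j in range(n):
            if labels[j] == labels[i]:
                # Label filtering: a same-label neighbor cannot break purity,
                # so its distance never needs to be computed.
                continue
            if dist(samples[i], samples[j]) <= delta:
                # Symmetry: if j is a neighbor of i, then i is a neighbor of j,
                # so both samples leave the positive region at once.
                excluded.add(i)
                excluded.add(j)
                break
    return [i for i in range(n) if i not in excluded]


# Hypothetical toy data: two classes in two dimensions.
samples = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.05, 1.0), (0.15, 0.0)]
labels = ["a", "a", "b", "b", "b"]
print(positive_region(samples, labels, delta=0.1))  # [0, 2, 3]
```

In this toy data, samples 1 and 4 lie within 0.1 of each other but carry different labels, so a single distance check removes both from the positive region and sample 4 is never scanned on its own.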

Key words: neighborhood rough set, feature reduction, symmetry mechanism, filtration mechanism, Hash buckets
