
EasiFFRA: A Fast Feature Reduction Algorithm Based on Neighborhood Rough Set

Wang Nian (王念), Peng Zhenghong (彭政红), Cui Li (崔莉)

Citation: Wang Nian, Peng Zhenghong, Cui Li. EasiFFRA: A Fast Feature Reduction Algorithm Based on Neighborhood Rough Set[J]. Journal of Computer Research and Development, 2019, 56(12): 2578-2588. DOI: 10.7544/issn1000-1239.2019.20180541. CSTR: 32373.14.issn1000-1239.2019.20180541


Funding: National Natural Science Foundation of China (61672498); National Key Research and Development Program of China (2016YFC0302300)
  • Chinese Library Classification (CLC) number: TP391; TP18

EasiFFRA: A Fast Feature Reduction Algorithm Based on Neighborhood Rough Set

  • Abstract: Extracting effective features from high-dimensional, heterogeneous sensing data is the basis for prediction and recognition in Internet of Things (IoT) systems. An IoT deployment usually includes many sensing nodes of several types, and the system typically extracts a large number of features from their data, some of which are irrelevant or redundant. Such features slow the system down, introduce redundant computation, and degrade the performance of subsequent machine learning tasks such as classification and prediction. Efficiently identifying a low-dimensional, effective feature subset is therefore a major challenge in IoT data analysis. The neighborhood rough set (NRS) method can identify and remove irrelevant and redundant features while preserving the separability of the dataset, thereby reducing dimensionality. However, existing NRS-based feature reduction algorithms have not been widely applied because of their high computational cost and long running time. This paper proposes EasiFFRA, a fast feature reduction algorithm based on the symmetry of the neighborhood relation and a decision-value filtering strategy. EasiFFRA accelerates the computation of positive-region samples with an improved hash-bucketing method: it preferentially traverses the buckets in which neighboring samples are relatively concentrated, and it stores in a hash table the samples that cannot belong to the positive region under the current feature subset. It also filters out samples that have the same decision value (label) as the current sample, which significantly reduces the number of distance calculations and the redundant computation caused by repeated distance evaluation in existing methods. The algorithm is evaluated on a water quality dataset collected in the field and on 12 public datasets of varying sample sizes and dimensionality. The results show that, compared with FHARA, EasiFFRA reduces feature reduction time by 75.45% on average, and that its reduction results are equivalent to those of existing NRS-based feature reduction algorithms. EasiFFRA thus mitigates the loss of classification and prediction accuracy caused by redundant and irrelevant features in IoT data analysis and improves the real-time performance of NRS-based feature reduction, which gives it significant practical value.
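The abstract names three ideas without giving their details: bucketing samples to limit neighbor search, exploiting the symmetry of the neighborhood relation, and skipping samples that share the current sample's decision value. The Python sketch below illustrates how these ideas can combine when computing the positive region. It is not the authors' implementation. It assumes Euclidean distance, a neighborhood radius `delta` greater than zero, and a bucketing reference point chosen as the per-feature minimum; the function names `positive_region` and `dependency` are hypothetical.

```python
import math
from collections import defaultdict


def positive_region(samples, labels, features, delta):
    """Indices of samples whose delta-neighborhood on `features` is label-pure."""
    if not samples:
        return set()
    points = [[row[f] for f in features] for row in samples]
    # Assumed reference point for bucketing: the per-feature minimum.
    ref = [min(p[k] for p in points) for k in range(len(features))]

    # Bucket samples by their distance to the reference point. By the triangle
    # inequality, two samples within `delta` of each other fall into the same
    # bucket or into adjacent buckets, so only three buckets need scanning.
    bucket_of = [int(math.dist(p, ref) // delta) for p in points]
    buckets = defaultdict(list)
    for i, b in enumerate(bucket_of):
        buckets[b].append(i)

    excluded = set()  # samples known to lie outside the positive region
    for i, p in enumerate(points):
        if i in excluded:
            continue  # already ruled out through symmetry
        for b in (bucket_of[i] - 1, bucket_of[i], bucket_of[i] + 1):
            for j in buckets.get(b, ()):
                if labels[j] == labels[i]:
                    continue  # decision-value filter: no distance is computed
                if math.dist(p, points[j]) <= delta:
                    # The neighborhood relation is symmetric, so one distance
                    # evaluation rules out both samples.
                    excluded.update((i, j))
                    break
            if i in excluded:
                break
    return set(range(len(samples))) - excluded


def dependency(samples, labels, features, delta):
    """Fraction of samples that fall in the positive region."""
    if not samples:
        return 0.0
    return len(positive_region(samples, labels, features, delta)) / len(samples)
```

A feature reduction procedure would call `dependency` repeatedly on candidate feature subsets, which is why cutting the number of distance evaluations inside `positive_region` matters for total running time.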

Publication history
  • Published: 2019-11-30
