
EasiFFRA: A Fast Feature Reduction Algorithm Based on Neighborhood Rough Set

Wang Nian, Peng Zhenghong, Cui Li

Citation: Wang Nian, Peng Zhenghong, Cui Li. EasiFFRA: A Fast Feature Reduction Algorithm Based on Neighborhood Rough Set[J]. Journal of Computer Research and Development, 2019, 56(12): 2578-2588. DOI: 10.7544/issn1000-1239.2019.20180541. CSTR: 32373.14.issn1000-1239.2019.20180541

Funding: National Natural Science Foundation of China (61672498); National Key Research and Development Program of China (2016YFC0302300)
  • CLC number: TP391; TP18

  • Abstract: Extracting effective features from high-dimensional, heterogeneous sensing data is the basis for prediction and classification in Internet of things (IoT) applications. An IoT system usually deploys multiple sensors of several kinds and extracts a large number of features from their data, many of which are irrelevant or redundant. Such features slow the system down, introduce redundant computation, and degrade the performance of subsequent classification and prediction, so efficiently identifying a low-dimensional, effective feature subset is a major challenge in IoT data analysis. Neighborhood rough set (NRS) is a popular dimensionality reduction method that removes irrelevant and redundant features while preserving the separability of the dataset. However, existing NRS-based feature reduction algorithms have not been widely applied because of their high computing cost and long running time. This paper proposes EasiFFRA, a fast feature reduction algorithm based on the symmetry of the neighborhood relation and a decision-value filtering mechanism. EasiFFRA speeds up positive-region computation with an improved hash bucketing method: it preferentially traverses the buckets in which neighbor samples are relatively concentrated, and it stores in a hash table the samples that cannot belong to the positive region under the current feature subset. It also significantly reduces the number of distance calculations by filtering out samples that have the same label as the current sample. The algorithm is evaluated on a real-world water quality dataset and 12 open datasets of different sample sizes and dimensions. The results show that, compared with FHARA, EasiFFRA reduces feature reduction time by 75.45% on average, while its reduction results are equivalent to those of existing NRS-based feature reduction algorithms. EasiFFRA thus mitigates the effect of irrelevant and redundant features on classification and prediction and improves the real-time performance of NRS-based feature reduction, which gives it important application value.
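
The two pruning ideas named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `positive_region` and `dependency`, the Euclidean distance, and the radius parameter `delta` are assumptions made for illustration, and the hash bucketing step is omitted entirely. The sketch relies only on the standard neighborhood rough set definition, under which a sample belongs to the positive region when every sample within distance `delta` shares its label.

```python
import numpy as np

def positive_region(X, y, delta):
    """Boolean mask: True where a sample's delta-neighborhood is label-pure.

    Hypothetical helper for illustration; Euclidean distance is an assumption.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n = len(y)
    in_pos = np.ones(n, dtype=bool)
    for i in range(n):
        # Symmetry: d(i, j) == d(j, i), so each unordered pair is visited once.
        for j in range(i + 1, n):
            if y[i] == y[j]:
                # Label filtering: a same-label neighbor can never make a
                # neighborhood impure, so its distance is not computed.
                continue
            if not in_pos[i] and not in_pos[j]:
                # Both samples are already excluded; nothing left to learn.
                continue
            if np.linalg.norm(X[i] - X[j]) <= delta:
                # One mixed-label pair excludes both endpoints at once.
                in_pos[i] = False
                in_pos[j] = False
    return in_pos

def dependency(X, y, delta):
    """Fraction of samples in the positive region (hypothetical helper)."""
    return float(positive_region(X, y, delta).mean())

# Toy data: samples 2 and 3 lie close together but carry different labels.
X_demo = [[0.0], [0.1], [1.0], [1.05], [2.0]]
y_demo = [0, 0, 1, 0, 1]
mask_demo = positive_region(X_demo, y_demo, delta=0.2)
```

In this toy case only the pair of samples at 1.0 and 1.05 falls outside the positive region. A feature reduction loop would typically call such a routine once per candidate feature subset, which is why avoiding repeated distance evaluations matters for running time.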
Publication history
  • Publication date: 2019-11-30
