    An Approach for Reduction of Continuous-Valued Attributes (ReCA)

    • Abstract: Attribute reduction is one of the main applications and research topics of rough set theory. Most existing attribute reduction methods apply to discrete-valued attributes. Continuous-valued attributes are usually discretized first, and this preprocessing discards some information, which can easily lead to erroneous reducts. This paper proposes ReCA, a new attribute reduction method for continuous-valued information systems. ReCA integrates discretization and attribute reduction into a single process: it uses an information-entropy-based uncertainty measure as the fitness function and applies evolutionary computation to obtain the reduced attribute set and the set of discretization cut points simultaneously. Experiments show that ReCA reduces attributes effectively and, compared with the rough set and C4.5 methods, yields fewer attributes and higher test accuracy.
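The abstract does not give the exact uncertainty measure or the encoding used by the evolutionary search, so the following is only a minimal sketch of the general idea, under stated assumptions. It scores one candidate solution (a chosen attribute subset together with cut points for each chosen attribute) by the conditional entropy of the decision given the discretized attributes. The names `discretize`, `conditional_entropy`, `fitness`, and the `penalty` size term are all hypothetical and not taken from the paper; lower scores are treated as better here, and the evolutionary search itself is not shown.

```python
import math
from bisect import bisect_right
from collections import Counter


def discretize(value, cuts):
    """Return the index of the interval that `value` falls into, given sorted cut points."""
    return bisect_right(cuts, value)


def conditional_entropy(rows, labels, candidate):
    """Entropy of the decision given the selected, discretized attributes (in bits).

    `candidate` maps an attribute index to its sorted list of cut points.
    Attributes absent from `candidate` are treated as removed by the reduction.
    """
    n = len(rows)
    block_sizes = Counter()  # size of each equivalence block of discretized rows
    joint = Counter()        # size of each (block, decision label) pair
    for row, label in zip(rows, labels):
        key = tuple(discretize(row[a], cuts) for a, cuts in sorted(candidate.items()))
        block_sizes[key] += 1
        joint[(key, label)] += 1
    h = 0.0
    for (key, _label), count in joint.items():
        h -= (count / n) * math.log2(count / block_sizes[key])
    return h


def fitness(rows, labels, candidate, penalty=0.01):
    """Hypothetical fitness: uncertainty plus an assumed penalty on solution size."""
    size = len(candidate) + sum(len(cuts) for cuts in candidate.values())
    return conditional_entropy(rows, labels, candidate) + penalty * size


# Toy data: attribute 0 separates the two classes, attribute 1 does not.
rows = [(0.1, 5.0), (0.2, 1.0), (0.8, 5.0), (0.9, 1.0)]
labels = [0, 0, 1, 1]
print(fitness(rows, labels, {0: [0.5]}))  # one attribute, one cut point
print(fitness(rows, labels, {1: [3.0]}))  # uninformative attribute
```

In a search of the kind the abstract describes, an evolutionary algorithm would mutate and recombine such candidates, so that attribute selection and cut-point placement are optimized together rather than in two separate stages.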

       
