    An Active Collective Classification Method Combining Feature Selection and Link Filtering

    • Abstract: With the rapid development of information technology, represented by the Internet, network data have become pervasive, and classification is now one of the important research topics in network data mining. Collective classification exploits the dependencies between network nodes to classify a set of linked nodes jointly, and it achieves higher accuracy than traditional classification methods. It has attracted wide attention and has been applied in many domains, including hyperlinked document classification, protein structure prediction, protein interaction and gene expression data classification, image processing, and social network analysis. We present an active collective classification method that combines feature selection with link filtering. The algorithm first selects important attributes using the minimum redundancy-maximum relevance feature selection method and constructs implicit links. It then filters the original links to obtain explicit links. Finally, it integrates the implicit and explicit links into a new network structure and applies a collective classification method to it. On three public datasets, we compare the method with several typical traditional classification methods and several typical collective classification methods. The results show that it obtains higher classification accuracy, and that its advantage is more pronounced on sparsely labeled networks.
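The pipeline outlined in the abstract can be illustrated with a minimal sketch. The abstract does not give the details, so everything concrete here is an assumption made for illustration: features are discrete and scored by mutual information in a greedy minimum redundancy-maximum relevance loop, implicit links connect each node to its `k_implicit` most similar nodes, link filtering keeps an original edge only when its endpoints agree on at least `min_sim` of the selected features, and the collective step is a simple iterative neighbor majority vote. The active step (choosing which nodes to label) is omitted. All function names and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log

def mutual_info(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(c / n * log(c * n / (px[x] * py[y])) for (x, y), c in pxy.items())

def mrmr(X, y, k):
    """Greedy mRMR: pick features maximizing relevance minus mean redundancy."""
    cols = [list(col) for col in zip(*X)]
    relevance = [mutual_info(col, y) for col in cols]
    selected = []
    while len(selected) < min(k, len(cols)):
        def score(j):
            if not selected:
                return relevance[j]
            red = sum(mutual_info(cols[j], cols[s]) for s in selected) / len(selected)
            return relevance[j] - red
        best = max((j for j in range(len(cols)) if j not in selected), key=score)
        selected.append(best)
    return selected

def similarity(a, b, feats):
    """Fraction of the selected features on which two nodes agree."""
    return sum(a[f] == b[f] for f in feats) / len(feats)

def build_network(X, edges, feats, k_implicit=1, min_sim=0.5):
    """Union of filtered original links and similarity-based implicit links."""
    links = set()
    # Explicit links: keep original edges whose endpoints are similar enough.
    for u, v in edges:
        if similarity(X[u], X[v], feats) >= min_sim:
            links.add((min(u, v), max(u, v)))
    # Implicit links: connect each node to its k most similar other nodes.
    for u in range(len(X)):
        others = sorted((v for v in range(len(X)) if v != u),
                        key=lambda v: -similarity(X[u], X[v], feats))
        for v in others[:k_implicit]:
            links.add((min(u, v), max(u, v)))
    return links

def collective_classify(n, links, known, rounds=10):
    """Iterative neighbor majority vote; known labels stay fixed."""
    nbrs = {u: set() for u in range(n)}
    for u, v in links:
        nbrs[u].add(v)
        nbrs[v].add(u)
    labels = dict(known)
    for _ in range(rounds):
        updated = dict(labels)
        for u in range(n):
            if u in known:
                continue
            votes = Counter(labels[v] for v in nbrs[u] if v in labels)
            if votes:
                updated[u] = votes.most_common(1)[0][0]
        if updated == labels:  # no label changed: converged
            break
        labels = updated
    return labels

# Toy network (hypothetical data): six nodes, three binary features.
X = [[1, 0, 1], [1, 1, 0], [1, 0, 0], [0, 1, 0], [0, 0, 0], [0, 1, 1]]
edges = [(0, 1), (1, 2), (2, 3), (4, 5)]
known = {0: "A", 1: "A", 3: "B", 4: "B"}  # nodes 2 and 5 are unlabeled

ids = sorted(known)
feats = mrmr([X[i] for i in ids], [known[i] for i in ids], k=1)
links = build_network(X, edges, feats)
labels = collective_classify(len(X), links, known)
print(feats, sorted(links), labels)
```

In this toy run the cross-class edge between nodes 2 and 3 is filtered out, and the unlabeled nodes take their labels from neighbors in the combined network. A real implementation would replace each stage with the paper's actual choices, which the abstract does not specify.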

       
