
An Adaptive Large Margin Nearest Neighbor Classification Algorithm

    • Abstract: Although the kNN classification algorithm has been widely applied across many areas of pattern recognition, how to improve kNN remains an active research topic. Among the various improvements, large margin nearest neighbor (LMNN) classification has achieved good results, but it still has shortcomings; for example, it uses the same neighborhood size (i.e., the same value of k) for all testing examples. To address this drawback, an adaptive large margin nearest neighbor classification algorithm (ALMNN) is proposed, which incorporates the adaptive selection of k into the objective function. The main steps of the algorithm are as follows: first, compute a value of k for each testing example; then, select k target neighbors within each class, compute the value of the loss function for assigning the example to each class, and assign the testing example to the class with the smallest loss. An algorithmic description of ALMNN is given, and experiments on multiple data sets show that, compared with traditional kNN and LMNN, the proposed algorithm improves classification performance to some extent, reduces the influence of the choice of k on classification performance, and is less sensitive to the random selection of the training set.


      Abstract: Although kNN has been successfully applied to pattern recognition in many areas, achieving good performance still hinges on the parameter k. Existing kNN-type methods fix k for all testing examples, which is inappropriate because the data density varies across real applications. To address this drawback, an adaptive large margin nearest neighbor classification algorithm (ALMNN) is proposed in this paper, which avoids predefining a single value of k for all data points. The new method adaptively selects an optimal k for each testing example by solving an optimization problem, and then assigns a label to the testing point based on the loss function. A series of experiments on real-world data sets (including UCI benchmark, image, and text data sets) show that the new algorithm outperforms existing methods. Meanwhile, ALMNN makes kNN insensitive to the choice of k and to the random selection of the training set.
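      The per-test-point classification step described above can be sketched in code. The following is only an illustrative sketch: the per-class loss (sum of squared distances to the k nearest training points of each class) and the rule for picking k from a small candidate set are assumptions made here for demonstration, and the almnn_predict helper is hypothetical; the paper's actual objective function and the optimization problem it solves to select k are not spelled out in the abstract.

```python
import numpy as np

def almnn_predict(X_train, y_train, x_test, k_candidates=(3, 5, 7)):
    """Illustrative sketch of the ALMNN classification step: pick a k for
    this test point, select k target neighbors in each class, compute a
    per-class loss, and return the class with the smallest loss.

    The surrogate loss and the k-selection rule below are assumptions,
    not the paper's objective function.
    """
    classes = np.unique(y_train)

    def per_class_losses(k):
        losses = {}
        for c in classes:
            Xc = X_train[y_train == c]
            d2 = np.sum((Xc - x_test) ** 2, axis=1)   # squared distances to class c
            k_eff = min(k, len(d2))
            losses[c] = np.sort(d2)[:k_eff].sum()     # loss of assigning x_test to class c
        return losses

    # Adaptive k (assumption): keep the candidate whose best per-class loss is smallest.
    _, k_star = min((min(per_class_losses(k).values()), k) for k in k_candidates)
    losses = per_class_losses(k_star)
    return min(losses, key=losses.get)                # label with the minimum loss

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0, 1, (30, 2)), rng.normal(3, 1, (30, 2))])
    y = np.array([0] * 30 + [1] * 30)
    print(almnn_predict(X, y, np.array([2.8, 3.1])))  # toy point near class 1; should print 1
```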

