    Citation: Xu Lei, Xiao Baihua, Dai Ruwei, and Wang Chunheng. A Fast Classification Strategy for Large Class Sets[J]. Journal of Computer Research and Development, 2008, 45(4): 588-595.

    A Fast Classification Strategy for Large Class Sets

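    • Summary: A new fast classification method is proposed for classification problems with large class sets. A group-based candidate selection rule is introduced: through redundant grouping, the large class set is divided into several independent subsets. Because both the number of groups and the number of classes per group are limited, all available information can be exploited to design an optimized classifier for each group separately. Taking handwritten Chinese character recognition as an example, multi-level learning vector quantization is used to train the global classifier, the group centers, and the fine classifier of each group. A risk-zone criterion is provided and combined with other candidate selection rules to improve the recognition rate for samples near group boundaries.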

       

      Abstract: This paper proposes a fast multi-stage classification strategy for large class sets, such as handwritten Chinese character recognition. The key issue in multi-stage classification is how to select the candidate subset for fine classification, so a group-based candidate selection rule is introduced. The whole class set is first divided into several groups by clustering. The nearest neighbors of each class are then added to that class's group, so that adjacent groups overlap. As a result, for any unlabeled sample, the classes it is likely to be confused with are all contained in its nearest group, and that group can be taken as the candidate set. Because the number of groups is limited and the average group size is small, it is feasible to design a dedicated fine classifier for every group using complementary features and classifiers. Hierarchical learning vector quantization is used to optimize the global prototypes, local prototypes, and group centroids. Furthermore, a risk-zone criterion is introduced to improve the hit rate for samples located near group boundaries. Experimental results on a handwritten Chinese character database show that the proposed method achieves a reasonable tradeoff between efficiency and accuracy.
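
The group-based candidate selection described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several choices here are assumptions made only for illustration: class means stand in for the trained prototypes, plain k-means stands in for the unspecified clustering step, nearest-mean matching stands in for the per-group fine classifier, and the risk-zone test is modeled as a distance ratio between the two nearest group centroids. The names `build_overlapping_groups`, `classify`, `n_neighbors`, and `risk_ratio` are hypothetical.

```python
import numpy as np

def kmeans(points, n_groups, n_iter=20, seed=0):
    """Plain Lloyd's k-means (stand-in for the paper's clustering step)."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(points), n_groups, replace=False)
    centroids = points[start].copy()
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        for g in range(n_groups):
            members = points[assign == g]
            if len(members):  # keep the old centroid if a group empties
                centroids[g] = members.mean(axis=0)
    # Final assignment, consistent with the returned centroids.
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return centroids, dists.argmin(axis=1)

def build_overlapping_groups(class_means, n_groups, n_neighbors, seed=0):
    """Cluster classes, then add each class's nearest neighbors to its group."""
    class_means = np.asarray(class_means, dtype=float)
    centroids, assign = kmeans(class_means, n_groups, seed=seed)
    pair = np.linalg.norm(class_means[:, None, :] - class_means[None, :, :], axis=2)
    # Column 0 of the sorted order is normally the class itself, so skip it.
    neighbors = np.argsort(pair, axis=1)[:, 1:n_neighbors + 1]
    groups = []
    for g in range(n_groups):
        core = np.flatnonzero(assign == g)
        members = set(core.tolist())
        for c in core:
            members.update(neighbors[c].tolist())  # redundancy: groups overlap
        groups.append(sorted(members))
    return centroids, groups

def classify(x, class_means, centroids, groups, risk_ratio=0.8):
    """Pick the nearest group as candidate set, then match within it."""
    x = np.asarray(x, dtype=float)
    class_means = np.asarray(class_means, dtype=float)
    d = np.linalg.norm(centroids - x, axis=1)
    order = np.argsort(d)
    candidates = set(groups[order[0]])
    # Assumed risk-zone test: if the two nearest groups are almost equally
    # close, the sample is near a boundary, so search both groups.
    if len(order) > 1 and d[order[0]] > risk_ratio * d[order[1]]:
        candidates.update(groups[order[1]])
    if not candidates:  # degenerate case: fall back to the full class set
        candidates = set(range(len(class_means)))
    cand = np.array(sorted(candidates), dtype=int)
    fine = np.linalg.norm(class_means[cand] - x, axis=1)
    return int(cand[fine.argmin()])

# Toy usage: six classes in two well-separated clusters.
means = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
centroids, groups = build_overlapping_groups(means, n_groups=2, n_neighbors=1)
label = classify([0.1, 0.9], means, centroids, groups)
```

In this sketch the cost of classifying a sample is one comparison per group centroid plus one per class in the candidate set, rather than one per class in the whole set, which is where the speedup of a multi-stage scheme comes from. Larger `n_neighbors` makes groups overlap more, trading speed for a lower chance that the correct class falls outside the candidate set.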

       
