Abstract:
This paper proposes a fast multi-stage classification strategy for large class sets, such as handwritten Chinese character recognition. The key issue in multi-stage classification is how to select the candidate subset for fine classification, and a group-based candidate selection rule is proposed to address it. The whole class set is first divided into several groups by a clustering algorithm. The nearest neighbors of each class are then added to that class's group, so that adjacent groups overlap. As a result, for any unlabeled sample, its confusable classes will be entirely contained in its nearest group, and that group can therefore be taken as the candidate set. Because the number of groups is fixed and the average group size is small, it is feasible to design a dedicated fine classifier for every group, using complementary features and classifiers. Hierarchical learning vector quantization is also used to optimize the global prototypes, local prototypes, and group centroids. Furthermore, a risk-zone criterion is introduced to improve the hit rate for samples located near group boundaries. Experimental results on a handwritten Chinese character database show that the proposed method achieves a reasonable tradeoff between efficiency and accuracy.
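
For illustration only, the group-based candidate selection described above can be sketched in Python. This is a minimal sketch under stated assumptions, not the authors' implementation: each class is represented by a single mean vector, plain k-means stands in for the unspecified clustering algorithm, and a nearest-class-mean rule stands in for the per-group fine classifier. The hierarchical learning vector quantization and the risk-zone criterion are not modeled. All function and parameter names (`kmeans`, `build_groups`, `classify`, `n_neighbors`) are hypothetical.

```python
import math


def nearest(x, points):
    """Index of the point in `points` closest to x (Euclidean distance)."""
    return min(range(len(points)), key=lambda j: math.dist(x, points[j]))


def kmeans(points, k, iters=10):
    """Plain k-means with a deterministic, evenly spaced initialization.

    Assumption: stands in for whatever clustering algorithm is actually used.
    """
    centroids = [points[i * len(points) // k] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        for j, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(v) / len(members) for v in zip(*members))
    return centroids


def build_groups(class_means, k, n_neighbors):
    """Cluster the classes into k groups, then make the groups overlap.

    class_means: dict mapping class label -> mean feature vector (tuple).
    Returns (group centroids, list of label sets, one per group).
    """
    labels = list(class_means)
    centroids = kmeans([class_means[c] for c in labels], k)

    # Step 1: disjoint groups, each class assigned to its nearest centroid.
    base = [set() for _ in range(k)]
    for c in labels:
        base[nearest(class_means[c], centroids)].add(c)

    # Step 2: add each class's nearest neighbor classes to that class's group,
    # so that adjacent groups overlap.
    groups = []
    for members in base:
        expanded = set(members)
        for c in members:
            others = sorted(
                (o for o in labels if o != c),
                key=lambda o: math.dist(class_means[c], class_means[o]),
            )
            expanded.update(others[:n_neighbors])
        groups.append(expanded)
    return centroids, groups


def classify(x, centroids, groups, class_means):
    """Coarse stage: pick the nearest group. Fine stage: search only that group.

    Assumption: nearest class mean replaces the dedicated fine classifier,
    and the selected group is non-empty.
    """
    candidates = groups[nearest(x, centroids)]
    return min(candidates, key=lambda c: math.dist(x, class_means[c]))
```

In this sketch, setting `n_neighbors=0` yields disjoint groups, in which case a sample lying near a group boundary can be routed to a group that does not contain its closest class. Adding the neighbor classes enlarges each candidate set slightly in exchange for a higher chance that the correct class survives the coarse stage.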