Abstract:
The problem of selecting the best combination of classifiers from an ensemble has been shown to be NP-complete. Several strategies have been used to reduce the number of units in an ensemble classifier. The main objective of selective ensemble is to find a fast pruning method for bagging that reduces storage requirements, speeds up classification, and potentially improves classification accuracy. Traditional selective ensemble methods focus on the diversity of base learners. Diversity implies a many-to-many relationship, whereas agreement implies a one-to-many relationship, so pruning a bagging ensemble based on agreement may be a simpler route to selective ensemble. This paper proposes a new selective ensemble algorithm (HDW-bagging) based on the agreement of base learners. The idea is to find the worst base learner, that is, the one whose deletion reduces the generalization error of the ensemble formed by the remaining base learners. Hierarchical pruning is used to speed up the new algorithm. Its running time is close to that of bagging, and its performance is superior to that of bagging. It trains more efficiently than GASEN, and its performance is close to that of GASEN. The new algorithm also supports parallel computing.
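The idea of deleting the worst base learner can be illustrated with a minimal Python sketch of greedy backward elimination. This is not the HDW-bagging procedure itself: the abstract does not specify the agreement measure or the hierarchical pruning step, so the sketch assumes plurality voting and uses error on a held-out validation set as a stand-in for generalization error. All function names (`majority_vote`, `ensemble_error`, `prune_worst_learners`) and the `min_size` parameter are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-learner label lists into one list by plurality vote."""
    # zip(*predictions) yields, for each sample, the tuple of all learners' votes.
    # Ties go to the label that appears first among the votes (Counter ordering).
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


def ensemble_error(predictions, labels):
    """Fraction of samples that the voted ensemble misclassifies."""
    voted = majority_vote(predictions)
    return sum(v != y for v, y in zip(voted, labels)) / len(labels)


def prune_worst_learners(predictions, labels, min_size=1):
    """Greedy backward elimination (assumed stand-in, not the paper's method).

    predictions: one list of predicted labels per base learner, all computed
    on the same held-out validation set. Returns indices of the kept learners.
    """
    kept = list(range(len(predictions)))
    current = ensemble_error([predictions[i] for i in kept], labels)
    while len(kept) > min_size:
        # Validation error of the remaining ensemble after deleting each candidate.
        trials = []
        for i in kept:
            rest = [predictions[j] for j in kept if j != i]
            trials.append((ensemble_error(rest, labels), i))
        best_error, worst = min(trials)
        if best_error >= current:
            break  # no single deletion lowers the error any further
        kept.remove(worst)
        current = best_error
    return kept
```

Calling `prune_worst_learners(predictions, labels)` returns the indices of the base learners that survive pruning. The candidate deletions inside each pass are independent of one another, so that inner loop is a natural place to parallelize.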