Abstract:
To reduce the redundancy of the original Bootstrap example-selection algorithm, an embedded Bootstrap (E-Bootstrap) strategy is proposed, which elaborately sieves extremely large training data sets for examples more typical of and relevant to the learning problem. The E-Bootstrap and Bootstrap algorithms are formulated and compared from two aspects, which indicate that, at almost the same training cost, E-Bootstrap selects more useful examples to represent a potentially overwhelming quantity of training data. Computational-resource constraints on the size of the training example set can thus be handled to some degree, and a more effective predictor can be trained. Furthermore, both algorithms are applied to negative-example selection in an AdaBoost-based face detection system. Two experiments are conducted, one for each aspect of the theoretical analysis, and the results show that E-Bootstrap outperforms Bootstrap in discarding the redundant examples introduced by Bootstrap sampling, yielding a more representative training example set. Moreover, the E-Bootstrap algorithm is applicable to many other example-based active learning methods.
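The Bootstrap loop the abstract contrasts against can be sketched as below. This is a minimal illustration only: the 1-D threshold "detector", the synthetic data, and the `min_gap` redundancy sieve in `e_bootstrap` are all assumptions of this sketch, since the abstract does not specify E-Bootstrap's actual sieving rule (the real system trains an AdaBoost face detector on image patches).

```python
import random

random.seed(0)

# Toy 1-D stand-in for a detector: a threshold classifier. The paper's system
# uses AdaBoost; a threshold keeps the sketch self-contained.
def train_threshold(pos, neg):
    # Place the decision threshold midway between the two class means.
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def false_positives(t, pool):
    # Negatives that the current detector wrongly accepts (score above threshold).
    return [x for x in pool if x > t]

def bootstrap(pos, neg_pool, rounds=3, batch=50):
    # Classic Bootstrap selection: retrain, then add current false positives.
    # Near-duplicate (redundant) examples can be re-collected round after round.
    neg = random.sample(neg_pool, batch)
    for _ in range(rounds):
        t = train_threshold(pos, neg)
        neg += false_positives(t, neg_pool)[:batch]
    return neg, t

def e_bootstrap(pos, neg_pool, rounds=3, batch=50, min_gap=0.01):
    # Illustrative "embedded" variant: sieve out candidates too close to
    # examples already kept. The min_gap test is a hypothetical stand-in
    # for the paper's actual redundancy criterion.
    neg = random.sample(neg_pool, batch)
    for _ in range(rounds):
        t = train_threshold(pos, neg)
        fresh = []
        for x in false_positives(t, neg_pool):
            if all(abs(x - y) >= min_gap for y in neg + fresh):
                fresh.append(x)
        neg += fresh[:batch]
    return neg, t

# Synthetic 1-D data: "faces" cluster near 1.5, "non-faces" near 0.0.
pos = [random.gauss(1.5, 0.3) for _ in range(200)]
neg_pool = [random.gauss(0.0, 0.5) for _ in range(2000)]

b_neg, _ = bootstrap(pos, neg_pool)
e_neg, _ = e_bootstrap(pos, neg_pool)
```

In this sketch the sieve guarantees that every negative E-Bootstrap adds after the initial sample is at least `min_gap` away from all previously kept negatives, which is one concrete way the "more representative, less redundant" training set of the abstract could be realized.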