Abstract:
Model selection is very critical to support vector machines (SVMs). Standard SVMs typically suffer from cubic time complexity in data size since they solve the convex quadratic programming problems. However, it usually needs to train hundreds/thousands of SVMs for model selection, which is prohibitively time-consuming for medium-scale datasets and very difficult to scale up to large-scale problems. In this paper, by using random Fourier features to approximate Gaussian kernel, a novel and efficient approach to model selection of kernel SVMs is proposed. Firstly, the random Fourier feature mapping is used to embed the infinite-dimensional implicit feature space into an explicit random feature space. An error bound between the accurate model obtained by training kernel SVM and the approximate one returned by the linear SVM in the random feature space is derived. Then, in the random feature space, a model selection approach to kernel SVM is presented. Under the guarantee of the model error upper bound, by applying the linear SVMs in the random feature space to approximate the corresponding kernel SVMs, the approximate model selection criterion can be efficiently calculated and used to assess the relative goodness of the corresponding kernel SVMs. Finally, comparison experiments on benchmark datasets for cross validation model selection show the proposed approach can significantly improve the efficiency of model selection for kernel SVMs while guaranteeing test accuracy.