Abstract:
The regularization path algorithm is an efficient method for numerical solution to the support vector machine (SVM) classification problem, which can fit the entire path of SVM solutions for every value of the regularization parameter, with essentially the same computational cost as fitting one SVM model. Existing SVM regularization path algorithms can neither deal with the datasets having duplicate data points, nearly duplicate points, or points that are linearly dependent efficiently, nor have efficient numerical solution. To address these issues, an improved regularization path algorithm via positive definite matrix positive definite SVM path (PDSVMP) is proposed in this paper, which provides the accurate path of SVM solutions. The coefficient matrix of the system of iteration equations is transformed into a positive definite matrix, then the Lagrange multiplier increment vector is computed by Cholesky decomposition, and the increment of regularization parameter is derived according to the changes of the active set, which is used to compute the regularization parameter on each inflection point. Such treatment is able to guarantee the theoretical correctness and numerical stability, and reduce the computational complexity. Experimental results on instance dataset and benchmark datasets show that the PDSVMP algorithm can effectively and efficiently handle datasets having duplicate data points, nearly duplicate points, or points that are linearly dependent.