Abstract:
Stochastic gradient descent (SGD) is a simple but efficient method for large-scale optimization problems. Recent researches have shown that its convergence rate can be effectively improved by using the so-called α-suffix averaging technique in solving the strongly convex problems. However, SGD is purely a black-box method, so it is difficult to obtain the expected effect on the learning structure while solving the regularized optimization problems. On the other hand, composite objective mirror descent (COMID) in the stochastic setting is a scalable algorithm which can effectively keep the sparsity imposed by L1 regularization;But it can only obtain an O(logT/T) convergence rate for the strongly convex optimization problems. In this paper, we focus on the generally convex optimization problem in the form of “L1 + Hinge”. To make full use of the α-suffix averaging technique, we first change it into a strongly convex optimization problem by adding an L2 strongly convex term. Then, we present an L1MD-α algorithm which combines the COMID algorithm and the α-suffix averaging technique. We prove that the L1MD-α algorithm can achieve an O(1/T) convergence rate. As a result, our L1MD-α algorithm not only obtains faster convergence rate but also better sparsity than COMID. Through extensive experiments on some typical large-scale datasets we finally verify the correctness of the theoretical analysis and effectiveness of the proposed algorithm.