ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development, 2014, Vol. 51, Issue (9): 1901-1910. doi: 10.7544/issn1000-1239.2014.20140161

Special Issue: 2014 Deep Learning


A Sparse Stochastic Algorithm with O(1/T) Convergence Rate

Jiang Jiyuan, Xia Liang, Zhang Xian, and Tao Qing   

  (Army Officer Academy of PLA, Hefei 230031)
  Online: 2014-09-01

Abstract: Stochastic gradient descent (SGD) is a simple but efficient method for large-scale optimization problems. Recent research has shown that its convergence rate can be effectively improved by the so-called α-suffix averaging technique when solving strongly convex problems. However, SGD is a purely black-box method, so it is difficult to impose the desired structure on the solution when solving regularized optimization problems. On the other hand, composite objective mirror descent (COMID) in the stochastic setting is a scalable algorithm that effectively preserves the sparsity induced by L1 regularization, but it only achieves an O(logT/T) convergence rate for strongly convex optimization problems. In this paper, we focus on the general convex optimization problem of the “L1 + Hinge” form. To make full use of the α-suffix averaging technique, we first turn it into a strongly convex optimization problem by adding an L2 strongly convex term. We then present the L1MD-α algorithm, which combines the COMID algorithm with α-suffix averaging, and prove that L1MD-α achieves an O(1/T) convergence rate. As a result, our L1MD-α algorithm obtains not only a faster convergence rate but also better sparsity than COMID. Finally, extensive experiments on typical large-scale datasets verify the correctness of the theoretical analysis and the effectiveness of the proposed algorithm.
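To make the procedure sketched in the abstract concrete, the following is a minimal Python sketch of the underlying idea: COMID-style updates for a "hinge + L2 + L1" objective (a stochastic subgradient step on the hinge loss plus the added L2 term, followed by an L1 soft-thresholding proximal step), combined with α-suffix averaging over the last α·T iterates. The function names, step-size schedule eta_t = 1/(λ2·t), and default parameter values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def soft_threshold(w, tau):
    """Proximal operator of tau * ||w||_1 (produces exact zeros, hence sparsity)."""
    return np.sign(w) * np.maximum(np.abs(w) - tau, 0.0)

def l1md_alpha_sketch(X, y, T=10000, lam1=1e-4, lam2=1e-2, alpha=0.5, seed=0):
    """Hypothetical sketch: COMID-style updates for 'hinge + L2 + L1'
    with alpha-suffix averaging of the last alpha*T iterates.
    All parameter names and defaults are illustrative."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    suffix_start = int((1.0 - alpha) * T)
    w_avg, count = np.zeros(d), 0
    for t in range(1, T + 1):
        i = rng.integers(n)
        margin = y[i] * X[i].dot(w)
        # Stochastic subgradient of the hinge loss plus the added L2 term.
        g = lam2 * w
        if margin < 1.0:
            g -= y[i] * X[i]
        eta = 1.0 / (lam2 * t)  # usual step size for strongly convex objectives
        # COMID step: gradient step on the smooth part, prox step on the L1 term.
        w = soft_threshold(w - eta * g, eta * lam1)
        # alpha-suffix averaging: average only the last alpha*T iterates.
        if t > suffix_start:
            count += 1
            w_avg += (w - w_avg) / count
    return w_avg
```

The proximal (soft-thresholding) step is what keeps exact zeros in the iterates, while averaging only the later iterates, rather than all of them, is the ingredient associated with the improved O(1/T) rate for the strongly convex objective.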

Key words: machine learning, stochastic optimization, sparsity, L1 regularization, COMID (composite objective mirror descent)
