ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development, 2019, Vol. 56, Issue (8): 1686-1694. doi: 10.7544/issn1000-1239.2019.20190167

Special Issue: 2019 Special Topic on Frontier Advances in Artificial Intelligence


Optimal Individual Convergence Rate of the Heavy-Ball-Based Momentum Methods

Cheng Yujia1, Tao Wei2, Liu Yuxiang1, Tao Qing1   

  1. Department of Information Engineering, Army Academy of Artillery and Air Defense of PLA, Hefei 230031; 2. College of Command and Control Engineering, Army Engineering University of PLA, Nanjing 210007
  • Online: 2019-08-01

Abstract: The momentum method is widely used as an acceleration technique to improve the convergence of first-order gradient algorithms. So far, however, most of the literature on momentum methods has been limited to the accelerated method proposed by Nesterov, while the Heavy-ball momentum method proposed by Polyak has seldom been studied. In particular, for non-smooth objective functions, the optimal individual convergence of Nesterov's accelerated methods has been derived, and these methods perform well on sparse optimization problems. In contrast, although the Heavy-ball momentum method has been proved to attain an optimal convergence rate, this guarantee holds only for the averaged outputs; to the best of our knowledge, whether it achieves optimal individual convergence remains an open question. In this paper, we focus on non-smooth optimization. We prove that the Heavy-ball momentum method achieves the optimal individual convergence rate by carefully selecting a time-varying step size, which shows that Heavy-ball momentum is an efficient acceleration strategy for the individual convergence of projected subgradient methods. As an application, we consider hinge loss optimization problems constrained to an l1-norm ball. In comparison with other optimization algorithms, the experiments confirm our theoretical analysis and demonstrate the effectiveness of the proposed algorithms in preserving sparsity.
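To make the setting concrete, the following is a minimal sketch of a Heavy-ball projected subgradient method for the averaged hinge loss over an l1-norm ball, as described in the abstract. Note the step-size schedule `alpha`, the momentum schedule `beta`, and all function names here are illustrative assumptions, not the schedule or notation from the paper itself; the l1-ball projection follows the standard sort-and-threshold construction.

```python
import numpy as np

def project_l1_ball(v, radius=1.0):
    """Euclidean projection of v onto the l1-ball {w : ||w||_1 <= radius}
    (standard sort-and-threshold construction)."""
    if np.abs(v).sum() <= radius:
        return v
    u = np.sort(np.abs(v))[::-1]          # sorted magnitudes, descending
    css = np.cumsum(u)
    # largest index rho with u[rho] * (rho+1) > css[rho] - radius
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > (css - radius))[0][-1]
    theta = (css[rho] - radius) / (rho + 1.0)
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def heavy_ball_projected_subgradient(X, y, radius=1.0, T=500):
    """Heavy-ball projected subgradient for the averaged hinge loss
    (1/n) * sum_i max(0, 1 - y_i <x_i, w>) over an l1-ball.
    The alpha/beta schedules below are illustrative placeholders,
    not the time-varying step size analyzed in the paper."""
    n, d = X.shape
    w_prev = np.zeros(d)
    w = np.zeros(d)
    for t in range(1, T + 1):
        margins = y * (X @ w)
        active = margins < 1                       # samples with nonzero hinge loss
        g = -(X[active].T @ y[active]) / n         # a subgradient of the averaged loss
        alpha = 1.0 / np.sqrt(t)                   # time-varying step size (assumed)
        beta = t / (t + 2.0)                       # time-varying momentum weight (assumed)
        # Heavy-ball update: gradient step plus momentum term, then projection
        w_next = project_l1_ball(w + beta * (w - w_prev) - alpha * g, radius)
        w_prev, w = w, w_next
    return w                                       # individual (last) iterate, not an average
```

The update keeps every iterate inside the l1-ball, so the last iterate itself is sparse-friendly; this is the practical point of individual (rather than averaged-output) convergence, since averaging iterates typically destroys sparsity.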

Key words: first-order gradient methods, momentum methods, individual convergence rate, heavy-ball methods, sparsity
