ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development, 2019, Vol. 56, Issue 8: 1686-1694. doi: 10.7544/issn1000-1239.2019.20190167

Special Topic: 2019 Advances in Artificial Intelligence

• Artificial Intelligence •




Optimal Individual Convergence Rate of the Heavy-Ball-Based Momentum Methods

Cheng Yujia1, Tao Wei2, Liu Yuxiang1, Tao Qing1   

  1. Department of Information Engineering, Army Academy of Artillery and Air Defense of PLA, Hefei 230031; 2. College of Command and Control Engineering, Army Engineering University of PLA, Nanjing 210007
  • Online: 2019-08-01



Abstract: Momentum methods are widely used as an acceleration technique to improve the convergence rate of first-order gradient algorithms. So far, the momentum methods discussed in most of the literature are limited to the accelerated method proposed by Nesterov, while the heavy-ball momentum method proposed by Polyak has received much less attention. In particular, in the case of non-smooth objective functions, the Nesterov accelerated method attains the optimal individual convergence rate and performs well in solving sparse optimization problems. In contrast, while the heavy-ball momentum method has been proved to attain the optimal convergence rate, this holds only for the averaged outputs; to the best of our knowledge, whether its individual convergence is optimal remains unknown. In this paper, we focus on non-smooth optimization problems. We prove that the heavy-ball momentum method achieves the optimal individual convergence rate by skillfully selecting a time-varying step size, which shows that heavy-ball momentum can accelerate the individual convergence of the projected subgradient method to optimality. As an application, we consider hinge loss optimization problems constrained to an l1-norm ball. In comparison with related optimization algorithms, the experiments verify the correctness of our theoretical analysis and the good performance of the proposed algorithms in preserving sparsity.
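The application described above can be sketched in code: a projected subgradient method with heavy-ball momentum minimizing the hinge loss over an l1-norm ball. This is a minimal illustration, not the paper's implementation; the step-size schedule `alpha_t = 1/sqrt(t)` and momentum weight `beta_t = t/(t+2)` are illustrative assumptions rather than the specific schedule proved optimal in the paper, and the function names `project_l1_ball` and `heavy_ball_subgradient` are hypothetical.

```python
import numpy as np

def project_l1_ball(v, radius=1.0):
    """Euclidean projection of v onto the l1 ball {w : ||w||_1 <= radius},
    using the standard sorting-based algorithm."""
    if np.abs(v).sum() <= radius:
        return v.copy()
    u = np.sort(np.abs(v))[::-1]          # |v| sorted in decreasing order
    css = np.cumsum(u)
    idx = np.arange(1, len(u) + 1)
    rho = np.nonzero(u - (css - radius) / idx > 0)[0][-1]
    theta = (css[rho] - radius) / (rho + 1)
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def heavy_ball_subgradient(X, y, radius=1.0, n_iters=500):
    """Projected subgradient method with heavy-ball momentum for the mean
    hinge loss, constrained to an l1-norm ball.

    Update: w_{t+1} = P_C( w_t - alpha_t * g_t + beta_t * (w_t - w_{t-1}) ),
    where P_C is the projection onto the l1 ball and g_t a subgradient.
    The schedules alpha_t = 1/sqrt(t), beta_t = t/(t+2) are illustrative
    choices, not the schedule analyzed in the paper."""
    n, d = X.shape
    w_prev = np.zeros(d)
    w = np.zeros(d)
    for t in range(1, n_iters + 1):
        margins = y * (X @ w)
        # Subgradient of the mean hinge loss: -y_i x_i on violated margins.
        active = margins < 1.0
        g = -(X[active].T @ y[active]) / n
        alpha = 1.0 / np.sqrt(t)
        beta = t / (t + 2.0)
        w_next = project_l1_ball(w - alpha * g + beta * (w - w_prev), radius)
        w_prev, w = w, w_next
    return w
```

The momentum term `beta_t * (w_t - w_{t-1})` is what distinguishes the heavy-ball update from the plain projected subgradient step; the projection keeps every individual iterate (not just an average) inside the l1 ball, which is what preserves sparsity in the solution.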

Key words: first-order gradient methods, momentum methods, individual convergence rate, heavy-ball methods, sparsity