Heavy-Ball型动量方法的最优个体收敛速率

程禹嘉; 陶蔚; 刘宇翔; 陶卿

doi:10.7544/issn1000-1239.2019.20190167

Heavy-Ball型动量方法的最优个体收敛速率

Optimal Individual Convergence Rate of the Heavy-Ball-Based Momentum Methods

摘要

摘要: 动量方法作为一种加速技巧被广泛用于提高一阶梯度优化算法的收敛速率.目前，大多数文献所讨论的动量方法仅限于Nesterov提出的加速方法，而对Polyak提出的Heavy-ball型动量方法的研究却较少.特别，在目标函数非光滑的情形下，Nesterov加速方法具有最优的个体收敛性，并在稀疏优化问题的求解中具有很好的效果.但对于Heavy-ball型动量方法，目前仅仅获得了平均输出形式的最优收敛速率，个体收敛是否具有最优性仍然未知.对于非光滑优化问题，通过巧妙地设置步长，证明了Heavy-ball型动量方法具有最优的个体收敛速率，从而说明了Heavy-ball型动量方法可以将投影次梯度方法的个体收敛速率加速至最优.作为应用，考虑了l\-1范数约束的hinge损失函数优化问题.通过与同类的优化算法相比，实验验证了该理论分析的正确性以及所提算法在保持稀疏性方面的良好性能.

Abstract: The momentum method is widely used as an acceleration technique to improve the convergence of the first-order gradient algorithms. So far, the momentum methods discussed in most literatures are only limited to the accelerated method proposed by Nesterov, but the Heavy-ball momentum method proposed by Polyak is seldom studied. In particular, in the case of non-smooth objective functions, the individual optimal convergence of Nesterov accelerated methods has been derived, and it has high performance in solving sparse optimization problems. In contrast, while it has been proved that the Heavy-ball momentum method has an optimal convergence rate,it is only in terms of the averaged outputs. To our best knowledge, whether it has optimal individual convergence or not still remains unknown. In this paper, we focus on the non-smooth optimizations. We prove that the Heavy-ball momentum method achieves the optimal individual convergence by skillfully selecting the time-varying step-size, which indicates that Heavy-ball momentum is an efficient acceleration strategy for the individual convergence of the projected subgradient methods. As an application, the constrained hinge loss function optimization problems within an l\-1-norm ball are considered. In comparison with other optimization algorithms, the experiments demonstrate the correctness of our theoretical analysis and performance of the proposed algorithms in keeping the sparsity.

HTML全文

参考文献(0)

施引文献

资源附件(0)