Ding Chengcheng, Tao Wei, Tao Qing. A Unified Momentum Method with Triple-Parameters and Its Optimal Convergence Rate[J]. Journal of Computer Research and Development, 2020, 57(8): 1571-1580. DOI: 10.7544/issn1000-1239.2020.20200194

A Unified Momentum Method with Triple-Parameters and Its Optimal Convergence Rate

Funds: This work was supported by the National Natural Science Foundation of China (61673394) and the Natural Science Foundation of Anhui Province (1908085MF193).
  • Published Date: July 31, 2020
  • Momentum methods have received much attention in the machine learning community because they can improve the performance of SGD. Following their successful application in deep learning, many formulations of momentum methods have been proposed, including two unified frameworks, SUM (stochastic unified momentum) and QHM (quasi-hyperbolic momentum). However, even for nonsmooth convex problems, existing derivations of the optimal average convergence rate rely on unreasonable restrictions, such as fixing the number of iterations in advance and requiring the optimization problem to be unconstrained. In this paper, we present a more general three-parameter framework for momentum methods, TPUM (triple-parameters unified momentum), which includes SUM and QHM as special cases. For constrained nonsmooth convex optimization problems with a time-varying step size, we prove that TPUM attains the optimal average convergence rate. This indicates that adding momentum does not impair the convergence of SGD, and it provides a theoretical guarantee for the applicability of momentum methods to machine learning problems. Experiments on hinge loss problems constrained to an L1-ball support the theoretical analysis.
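
To make the setting in the abstract concrete, the following is a minimal sketch and not the paper's TPUM algorithm, whose three-parameter update is not given on this page. It assumes the commonly stated QHM update (named in the abstract as a special case of TPUM), combined with Euclidean projection onto an L1-ball, a time-varying step size of c/sqrt(t), and an averaged output. All function names, parameter values, and the toy data are assumptions chosen for illustration.

```python
import math
import random


def project_l1_ball(w, radius=1.0):
    """Euclidean projection of w onto {x : ||x||_1 <= radius} (sort-based)."""
    if sum(abs(x) for x in w) <= radius:
        return list(w)
    u = sorted((abs(x) for x in w), reverse=True)
    cumsum, theta = 0.0, 0.0
    for k, uk in enumerate(u, start=1):
        cumsum += uk
        t = (cumsum - radius) / k
        if uk > t:  # keep the threshold from the last index where this holds
            theta = t
    # Soft-threshold each coordinate by theta, preserving its sign.
    return [math.copysign(max(abs(x) - theta, 0.0), x) for x in w]


def hinge_loss(w, data):
    """Average hinge loss max(0, 1 - y * <w, x>) over (x, y) pairs."""
    total = 0.0
    for x, y in data:
        margin = y * sum(wi * xi for wi, xi in zip(w, x))
        total += max(0.0, 1.0 - margin)
    return total / len(data)


def qhm_projected_hinge(data, steps, beta=0.9, nu=0.7, c=0.5,
                        radius=1.0, seed=0):
    """Projected stochastic subgradient method with a QHM-style momentum step.

    Illustrative assumptions: beta is the momentum decay, nu mixes the raw
    subgradient with the momentum buffer, and the step size is c / sqrt(t).
    Returns the running average of the iterates.
    """
    rng = random.Random(seed)
    dim = len(data[0][0])
    w = [0.0] * dim      # current iterate (feasible: inside the L1-ball)
    g = [0.0] * dim      # momentum buffer
    w_avg = [0.0] * dim  # running average of iterates
    for t in range(1, steps + 1):
        x, y = rng.choice(data)
        margin = y * sum(wi * xi for wi, xi in zip(w, x))
        # A subgradient of the hinge loss at w for the sampled example.
        grad = [-y * xi for xi in x] if margin < 1.0 else [0.0] * dim
        g = [beta * gi + (1.0 - beta) * di for gi, di in zip(g, grad)]
        alpha = c / math.sqrt(t)  # time-varying step size
        step = [(1.0 - nu) * di + nu * gi for di, gi in zip(grad, g)]
        w = project_l1_ball([wi - alpha * si for wi, si in zip(w, step)],
                            radius)
        w_avg = [a + (wi - a) / t for a, wi in zip(w_avg, w)]
    return w_avg


if __name__ == "__main__":
    toy = [([1.0, 0.0], 1), ([-1.0, 0.0], -1),
           ([0.8, 0.1], 1), ([-0.9, -0.2], -1)]
    w_bar = qhm_projected_hinge(toy, steps=500)
    print("averaged iterate:", w_bar)
    print("average hinge loss:", hinge_loss(w_bar, toy))
```

Because the L1-ball is convex, the averaged iterate stays feasible whenever every individual iterate is projected back onto the ball, which is why the sketch can return the average directly.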