
A Unified Momentum Method with Triple-Parameters and Its Optimal Convergence Rate

Ding Chengcheng, Tao Wei, Tao Qing

Citation: Ding Chengcheng, Tao Wei, Tao Qing. A Unified Momentum Method with Triple-Parameters and Its Optimal Convergence Rate[J]. Journal of Computer Research and Development, 2020, 57(8): 1571-1580. DOI: 10.7544/issn1000-1239.2020.20200194


  • CLC number: TP181


Funds: This work was supported by the National Natural Science Foundation of China (61673394) and the Natural Science Foundation of Anhui Province (1908085MF193).
  • Abstract: Momentum methods have attracted considerable attention in the machine learning community because they can improve the convergence performance of SGD (stochastic gradient descent). Following their success in deep learning, many variants of momentum methods have appeared. In particular, two unified frameworks have been proposed: SUM (stochastic unified momentum) and QHM (quasi-hyperbolic momentum). However, even for nonsmooth convex optimization problems, existing derivations of the optimal average convergence rate still rely on unreasonable restrictions, such as fixing the number of iterations in advance and requiring the problem to be unconstrained. To address this, we propose TPUM (triple-parameters unified momentum), a more general unified momentum method with three parameters that includes both SUM and QHM as special cases. For constrained nonsmooth convex optimization problems, we prove that TPUM with time-varying step sizes achieves the optimal average convergence rate, and we extend the result to the stochastic setting. This guarantees that adding momentum does not degrade the convergence performance of standard gradient descent, and it supports the applicability of momentum methods to machine learning problems. Experiments on L1-norm-constrained hinge loss optimization problems confirm the theoretical analysis.
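    The setting described in the abstract (a momentum step, a time-varying step size, a constraint set, and an averaged output) can be illustrated with a small sketch. The abstract does not state TPUM's three-parameter update, so the code below does not reproduce it. Instead it uses the two-parameter QHM update, which the abstract names as a special case, and adds a Euclidean projection onto an L1 ball. The step-size schedule `alpha0 / sqrt(t)`, the placement of the projection, the uniform averaging of iterates, the function names, and the toy data are all assumptions made for illustration, not the authors' method or experimental setup.

    ```python
    import math


    def project_l1_ball(v, radius=1.0):
        """Euclidean projection of v onto {w : ||w||_1 <= radius} (sort-based)."""
        if sum(abs(x) for x in v) <= radius:
            return list(v)
        u = sorted((abs(x) for x in v), reverse=True)
        cumsum, theta = 0.0, 0.0
        for k, uk in enumerate(u, start=1):
            cumsum += uk
            candidate = (cumsum - radius) / k
            if uk - candidate > 0:
                theta = candidate  # threshold from the largest k that qualifies
        return [math.copysign(max(abs(x) - theta, 0.0), x) for x in v]


    def hinge_loss(w, X, y):
        """Average hinge loss max(0, 1 - y * <w, x>) over the data set."""
        return sum(
            max(0.0, 1.0 - yi * sum(wj * xj for wj, xj in zip(w, xi)))
            for xi, yi in zip(X, y)
        ) / len(X)


    def hinge_subgradient(w, X, y):
        """One subgradient of the average hinge loss at w."""
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            if margin < 1.0:
                for j, xj in enumerate(xi):
                    grad[j] -= yi * xj / len(X)
        return grad


    def projected_qhm(X, y, radius=1.0, alpha0=0.1, beta=0.9, nu=0.7, steps=200):
        """Projected QHM-style subgradient method; returns the averaged iterate.

        Assumptions (not taken from the paper): step size alpha0 / sqrt(t),
        projection applied after every step, uniform average of the iterates.
        """
        dim = len(X[0])
        w = [0.0] * dim
        g = [0.0] * dim      # momentum buffer
        w_avg = [0.0] * dim  # running average of w_1, ..., w_t
        for t in range(1, steps + 1):
            grad = hinge_subgradient(w, X, y)
            g = [beta * gj + (1.0 - beta) * dj for gj, dj in zip(g, grad)]
            alpha_t = alpha0 / math.sqrt(t)
            # QHM direction: blend of the plain subgradient and the momentum buffer.
            direction = [(1.0 - nu) * dj + nu * gj for dj, gj in zip(grad, g)]
            w = project_l1_ball(
                [wj - alpha_t * sj for wj, sj in zip(w, direction)], radius
            )
            w_avg = [aj + (wj - aj) / t for aj, wj in zip(w_avg, w)]
        return w_avg


    # Hypothetical toy data: the label is the sign of the first feature.
    X_TOY = [(2.0, 1.0), (3.0, -1.0), (-2.0, 0.5), (-3.0, -0.5)]
    Y_TOY = [1.0, 1.0, -1.0, -1.0]

    if __name__ == "__main__":
        w_bar = projected_qhm(X_TOY, Y_TOY)
        print("averaged iterate:", w_bar)
        print("hinge loss:", hinge_loss(w_bar, X_TOY, Y_TOY))
    ```

    In this sketch, setting `nu = 0` reduces the step to a plain projected subgradient step, and `nu = 1` uses only the momentum buffer, which is one way to see how a single parameterized rule can cover several momentum variants.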
Publication history
  • Publication date: 2020-07-31
