
Static Restart Stochastic Gradient Descent Algorithm Based on Image Question Answering

Li Shengdong, Lü Xueqiang

Citation: Li Shengdong, Lü Xueqiang. Static Restart Stochastic Gradient Descent Algorithm Based on Image Question Answering[J]. Journal of Computer Research and Development, 2019, 56(5): 1092-1100. DOI: 10.7544/issn1000-1239.2019.20180472. CSTR: 32373.14.issn1000-1239.2019.20180472

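Funding: National Natural Science Foundation of China (61671070); Key Project of the 13th Five-Year Scientific Research Plan of the State Language Commission, 2017 (ZDI135-53); Open Project of the Beijing Key Laboratory of Internet Culture and Digital Dissemination Research (ICDD201505)
  • Chinese Library Classification (CLC) number: TP391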

  • Abstract: Image question answering is a multimodal learning task at the intersection of computer vision and natural language processing. With the breakthroughs in deep neural networks, it has become a focus of research attention, and researchers have proposed numerous models to solve it. Stacked attention networks (SANs) are among the most typical of these models and achieved state-of-the-art results on four public visual question answering datasets. Despite this performance, the diversity of questions and the sparsity of answers prevent SANs from fully learning the general regularities of the corpus, so the model easily falls into poor local optima, which leads to a higher question answering error rate. By analyzing the causes of the errors and observing in detail how the model processes image question answering, we find that momentum-based stochastic gradient descent (the baseline) has some defects in optimizing SANs. To address this, we propose a static restart stochastic gradient descent algorithm based on image question answering. The experimental results show that its accuracy is 0.29% higher than that of the baseline, but that it converges more slowly than the baseline. To verify the significance of the improvement, we conduct a statistical hypothesis test on the experimental results. The t-test results show that the improvement is extremely significant in the process of converging to the global optimal solution. To verify its effectiveness among algorithms of the same kind, we compare it experimentally with the state-of-the-art first-order optimization algorithms. The experimental results and analysis show that it is more effective for image question answering. To verify its generalization performance and its value for wider adoption, we conduct an image recognition experiment on the classic CIFAR-10 dataset. The experimental results and the t-test results show that it generalizes well and has good value for wider adoption in the process of converging to the global optimal solution.
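
The abstract names the method but does not spell out its update rule, so the following is only a minimal sketch of one possible reading of "static restart": momentum SGD whose learning rate decays within a cycle and is reset to its initial value after a fixed, pre-set number of steps. The function names, the exponential in-cycle decay, the clearing of the velocity at each restart, and the toy quadratic objective are all assumptions made for illustration; none of them is taken from the paper.

```python
import random


def static_restart_lr(step, base_lr=0.1, period=50, decay=0.97):
    """Hypothetical schedule: decay inside a cycle, reset every `period` steps."""
    return base_lr * decay ** (step % period)


def make_noisy_grad(target=3.0, noise=0.0, seed=0):
    """Gradient of the toy objective (w - target)^2 with optional Gaussian noise."""
    rng = random.Random(seed)

    def grad(w):
        return 2.0 * (w - target) + rng.gauss(0.0, noise)

    return grad


def sgd_momentum_static_restart(grad_fn, w0, steps=200, base_lr=0.1,
                                period=50, decay=0.97, momentum=0.9):
    """Momentum SGD on a scalar parameter with restarts at fixed intervals."""
    w, v = float(w0), 0.0
    for t in range(steps):
        if t > 0 and t % period == 0:
            v = 0.0  # assumption: the velocity is cleared at each restart
        lr = static_restart_lr(t, base_lr, period, decay)
        v = momentum * v - lr * grad_fn(w)  # classical momentum update
        w += v
    return w


if __name__ == "__main__":
    w_final = sgd_momentum_static_restart(make_noisy_grad(noise=0.1), w0=0.0)
    print(f"final parameter: {w_final:.4f}")
```

The restart points here depend only on the step counter, which is what makes the schedule "static" in this sketch; a restart triggered by the observed loss would instead be adaptive. Reproducing the accuracy or significance figures reported in the abstract would require the SANs model and the original datasets, which this toy example does not attempt.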
Publication history
  • Published: 2019-04-30
