Static Restart Stochastic Gradient Descent Algorithm Based on Image Question Answering

Li Shengdong, Lü Xueqiang

Citation: Li Shengdong, Lü Xueqiang. Static Restart Stochastic Gradient Descent Algorithm Based on Image Question Answering[J]. Journal of Computer Research and Development, 2019, 56(5): 1092-1100. DOI: 10.7544/issn1000-1239.2019.20180472. CSTR: 32373.14.issn1000-1239.2019.20180472

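Funding: National Natural Science Foundation of China (61671070); 2017 Key Project of the State Language Commission's 13th Five-Year Research Plan (ZDI135-53); Open Project of the Beijing Key Laboratory of Internet Culture and Digital Dissemination Research (ICDD201505)
  • CLC number: TP391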

  • Abstract: Image question answering is a multimodal learning task at the intersection of computer vision and natural language processing. With the breakthroughs in deep neural networks, it has become a focus of research attention, and researchers have proposed many strong models for it. Stacked attention networks (SANs) are among the most typical, achieving state-of-the-art results on four public visual question answering datasets. Despite this performance, the diversity of questions and the sparsity of answers keep the model from fully learning the general regularities of the corpus, so it easily falls into poor local optima, which raises the question answering error rate. By analyzing the causes of these errors and observing how the model processes image question answering, we find that momentum-based stochastic gradient descent (the baseline) has defects when optimizing SANs. To address this, we propose a static restart stochastic gradient descent algorithm based on image question answering. Experimental results show that its accuracy is 0.29% higher than the baseline's, although it converges more slowly than the baseline. To verify that the improvement is significant, we conduct a statistical hypothesis test on the experimental results; the t-test results show that the improvement is extremely significant. To verify its effectiveness among algorithms of the same kind, we compare it experimentally with state-of-the-art first-order optimization algorithms; the results and analysis show that it is more effective for image question answering. To verify its generalization performance and its value for wider application, we run an image recognition experiment on the classic CIFAR-10 dataset. The experimental results and t-test results show that it generalizes well and has good value for wider application.
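The abstract does not spell out the restart rule, so the following is only an illustrative sketch of the general idea of momentum SGD with restarts at a fixed ("static") interval. It assumes that a restart resets a cosine-annealed learning rate to its base value every `period` steps, a schedule borrowed from the well-known SGDR warm-restart scheme; the paper's actual rule may differ. The names `restart_lr` and `sgd_momentum_static_restart`, and all default hyperparameters, are hypothetical.

```python
import math


def restart_lr(step, base_lr, period, min_lr=0.0):
    """Cosine-annealed learning rate that jumps back to base_lr every `period` steps.

    Assumption: a fixed restart period stands in for the paper's "static restart";
    the source does not give the exact schedule.
    """
    t = step % period  # position inside the current restart cycle
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * t / period))


def sgd_momentum_static_restart(grad, w0, steps, base_lr=0.1, momentum=0.9, period=50):
    """Momentum SGD whose learning rate follows restart_lr.

    grad: function mapping a list of floats (parameters) to a list of gradients.
    The velocity is carried across restarts here; resetting it is another
    plausible variant that this sketch does not implement.
    """
    w = list(w0)
    v = [0.0] * len(w)
    for step in range(steps):
        lr = restart_lr(step, base_lr, period)
        g = grad(w)
        v = [momentum * vi - lr * gi for vi, gi in zip(v, g)]
        w = [wi + vi for wi, vi in zip(w, v)]
    return w


if __name__ == "__main__":
    # Toy objective f(w) = sum(w_i^2), whose gradient is 2 * w.
    quad_grad = lambda w: [2.0 * wi for wi in w]
    print(sgd_momentum_static_restart(quad_grad, [1.0, -2.0], steps=200))
```

Each restart briefly raises the step size again, which is the usual motivation for restart schedules: a larger step may help the iterate leave a poor local optimum, at the cost of slower convergence within a fixed step budget.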
Publication history
  • Published: 2019-04-30