• China Fine-Quality Sci-Tech Journal
  • CCF-recommended Class A Chinese journal
  • T1-class high-quality sci-tech journal in the computing field

Accelerated Multi-Task Online Learning Algorithm for Big Data Stream

Li Zhijie, Li Yuanxiang, Wang Feng, Kuang Li

Li Zhijie, Li Yuanxiang, Wang Feng, Kuang Li. Accelerated Multi-Task Online Learning Algorithm for Big Data Stream[J]. Journal of Computer Research and Development, 2015, 52(11): 2545-2554. DOI: 10.7544/issn1000-1239.2015.20148280
Li Zhijie, Li Yuanxiang, Wang Feng, Kuang Li. Accelerated Multi-Task Online Learning Algorithm for Big Data Stream[J]. Journal of Computer Research and Development, 2015, 52(11): 2545-2554. CSTR: 32373.14.issn1000-1239.2015.20148280


Funding: National Natural Science Foundation of China (61070009, 61103125); National High-Tech Research and Development Program of China (863 Program) (2007AA01Z290)
Details
  • CLC number: TP181

Accelerated Multi-Task Online Learning Algorithm for Big Data Stream

  • Abstract: The multi-task online learning framework adopts a stream computing model that processes data directly, making it a promising tool for big data stream analysis. However, current multi-task online learning algorithms have a low convergence rate of only O(1/√T), where T is the number of iterations. We propose a novel accelerated multi-task online learning algorithm, ADA-MTL (accelerated dual averaging method for multi-task learning), which attains the optimal convergence rate O(1/T²) while retaining the computational efficiency of multi-task online learning. We derive a closed-form expression for the iterative update of the multi-task weight matrix W_t and give a detailed theoretical analysis of the proposed algorithm's convergence. Experiments show that the proposed accelerated multi-task online learning algorithm better guarantees the real-time performance and scalability of big data stream processing and has broad practical value.
    Abstract: Conventional machine learning and data mining techniques with a batch computing mode suffer from many limitations when applied to big data stream analytics tasks. The multi-task online learning framework, with its stream computing mode, is a promising tool for big data stream analysis. However, current multi-task online learning algorithms have a low convergence rate of O(1/√T) up to the T-th iteration, which has become a bottleneck of online algorithm performance. In this paper, we propose a novel multi-task accelerated online learning algorithm, called ADA-MTL (accelerated dual averaging method for multi-task learning), which simultaneously achieves low computational time complexity and the optimal convergence rate O(1/T²). We prove a closed-form solution theorem that efficiently updates the weight matrix W_t at each iteration and conduct a detailed theoretical analysis of the algorithm's convergence rate. Experimental results on real-world datasets demonstrate the merits of the proposed multi-task accelerated online learning algorithm for large-scale dynamic data stream problems. Since this algorithm markedly improves the real-time performance and scalability of big data stream analysis, it is a practical method for big data stream analytics tasks.
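The abstract describes an online, one-pass computing mode in which the weight matrix W_t is updated by dual averaging as examples stream in. The paper's closed-form ADA-MTL update is not reproduced here, so the following is only a minimal sketch of plain (unaccelerated) regularized dual averaging on a per-task squared loss, with an assumed L2 prox term standing in for the paper's shared multi-task regularizer; the function name and parameters are illustrative, not the authors' API.

```python
import numpy as np

def rda_multitask(tasks, d, gamma=10.0, T=500):
    """Illustrative regularized dual averaging (RDA) for K linear tasks
    over a d-dimensional feature space.

    This is a plain RDA sketch with squared loss and an L2 prox term,
    NOT the paper's ADA-MTL update: ADA-MTL adds Nesterov-style
    acceleration and couples the tasks through a shared regularizer,
    which is what yields the O(1/T^2) rate.  `tasks` is a list of
    (X, y) pairs, one per task, streamed one example at a time."""
    K = len(tasks)
    W = np.zeros((K, d))   # weight matrix W_t, one row per task
    G = np.zeros((K, d))   # running sum of (sub)gradients
    for t in range(1, T + 1):
        for k, (X, y) in enumerate(tasks):
            i = (t - 1) % len(y)          # cycle through the stream
            x_i, y_i = X[i], y[i]
            # squared-loss gradient for task k at the current iterate
            G[k] += (W[k] @ x_i - y_i) * x_i
        # dual-averaging step: W minimizes <G, W> + (gamma*sqrt(t)/2)*||W||_F^2
        W = -G / (gamma * np.sqrt(t))
    return W
```

Note the stream-computing property the abstract emphasizes: each example is touched once per pass, and only the running gradient sum G (size K×d) is stored, so memory does not grow with the length of the stream.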
Publication history
  • Published online: 2015-10-31
