Abstract:
Conventional machine learning and data mining techniques that operate in batch mode suffer from many limitations when applied to big data stream analytics tasks. The multi-task online learning framework, which operates in a stream computing mode, is a promising tool for big data stream analysis. However, existing multi-task online learning algorithms converge slowly, e.g., at a rate of O(1/√T) after T iterations, and this slow convergence has become a bottleneck of online algorithm performance. In this paper, we propose a novel multi-task accelerated online learning algorithm, called ADA-MTL (accelerated dual averaging method for multi-task learning), which simultaneously achieves low computational time complexity and the optimal convergence rate O(1/T²). We prove a theorem that gives a closed-form solution for efficiently updating the weight matrix W_t at each iteration, and we provide a detailed theoretical analysis of the algorithm's convergence rate. Experimental results on real-world datasets demonstrate the merits of the proposed multi-task accelerated online learning algorithm for large-scale dynamic data stream problems. Because the algorithm markedly improves the real-time performance and scalability of big data stream analysis, it is a practical method for big data stream analytics tasks.
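To make the idea concrete, the following is a minimal illustrative sketch, not the paper's ADA-MTL, of a Nesterov/Tseng-style accelerated dual-averaging loop for a multi-task weight matrix W. It assumes a smooth least-squares loss with an ℓ2,1 (row-wise group) regularizer, so the dual-averaging step has a closed-form row-wise shrinkage; the function names, the weight schedule a_t, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def l21_dual_average_step(S, kappa_lam, L):
    """Row-wise closed-form minimizer of <S, V> + kappa_lam*||V||_{2,1} + (L/2)*||V||_F^2.
    Rows whose accumulated gradient is small are zeroed (group sparsity across tasks)."""
    norms = np.linalg.norm(S, axis=1, keepdims=True)
    shrink = np.maximum(0.0, 1.0 - kappa_lam / np.maximum(norms, 1e-12))
    return -(1.0 / L) * shrink * S

def accelerated_dual_averaging(grad, lam, L, d, k, T):
    """Illustrative accelerated dual-averaging loop (Nesterov/Tseng-style), not the
    paper's ADA-MTL, for a d x k multi-task weight matrix; for an L-smooth loss this
    scheme is known to converge at an O(L/T^2) rate."""
    W = np.zeros((d, k))   # primal iterate
    V = np.zeros((d, k))   # dual-averaging iterate
    S = np.zeros((d, k))   # weighted sum of gradients
    A = 0.0                # cumulative step weight
    for t in range(T):
        a = 0.5 * (t + 2)              # weight schedule a_t ~ t/2, so A_t grows like t^2/4
        A_next = A + a
        Z = (A * W + a * V) / A_next   # extrapolated query point
        G = grad(Z)                    # gradient of the smooth loss at Z
        S += a * G                     # accumulate weighted gradients
        V = l21_dual_average_step(S, A_next * lam, L)
        W = (A * W + a * V) / A_next   # convex combination of the two sequences
        A = A_next
    return W

# Toy usage: k regression tasks sharing a d-dimensional feature space.
rng = np.random.default_rng(0)
d, k, n = 20, 5, 100
X = rng.standard_normal((n, d))
Y = X @ rng.standard_normal((d, k)) + 0.1 * rng.standard_normal((n, k))
grad = lambda W: X.T @ (X @ W - Y) / n    # gradient of (1/2n)*||XW - Y||_F^2
L = np.linalg.norm(X, 2) ** 2 / n         # smoothness constant of that loss
W_hat = accelerated_dual_averaging(grad, lam=0.1, L=L, d=d, k=k, T=200)
```

The closed-form step mirrors the role of the closed-form update theorem mentioned above: each row of the averaged gradient matrix is either zeroed out or shrunk, so the per-iteration cost stays linear in the size of W.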