Citation: Ma Lanjihong, Zhao Peng, Zhou Zhihua. Robust Heavy-Tailed Linear Bandits Algorithm[J]. Journal of Computer Research and Development, 2023, 60(6): 1385-1395. DOI: 10.7544/issn1000-1239.202220279
The linear bandits model is one of the most fundamental online learning models, in which a linear function parametrizes the mean payoff of each arm. The model covers a wide range of applications with strong theoretical guarantees and practical modeling power. However, existing algorithms suffer from data irregularities that frequently emerge in real-world applications, as the data are usually collected from open and dynamic environments. In this paper, we are particularly concerned with two kinds of data irregularities: the underlying regression parameter may change over time, and the noise may be unbounded or even not sub-Gaussian; these are referred to as model drift and heavy-tailed noise, respectively. To deal with these two hostile factors, we propose a novel algorithm based on the upper confidence bound (UCB) principle: a median-of-means estimator handles the potential heavy-tailed noise, and a restarting mechanism tackles the model drift. Theoretically, we establish a minimax lower bound to characterize the difficulty of the problem and prove that our algorithm enjoys a no-regret upper bound. The attained results subsume previous analyses for scenarios without either model drift or heavy-tailed noise. Empirically, we additionally design several online ensemble techniques to make our algorithm more adaptive to the environment. Extensive experiments on synthetic and real-world datasets validate its effectiveness.
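To make the two ingredients concrete, the following is a minimal Python sketch, not the paper's exact algorithm: a median-of-means estimate, which is robust to heavy-tailed samples, and a periodically restarted ridge-regression UCB loop, which discards stale data when the underlying regression parameter may have drifted. All names and parameter values here (e.g. restart_every, num_blocks, beta) are illustrative assumptions rather than the authors' settings.

```python
import numpy as np


def median_of_means(samples, num_blocks):
    """Split samples into blocks, average each block, return the median of the block means."""
    blocks = np.array_split(np.asarray(samples, dtype=float), num_blocks)
    return float(np.median([block.mean() for block in blocks]))


def restarted_linucb(contexts_fn, reward_fn, horizon, dim,
                     restart_every=200, reg=1.0, beta=1.0):
    """UCB over a ridge-regression estimate, restarted every `restart_every` rounds."""
    gram = reg * np.eye(dim)   # regularized Gram matrix
    moment = np.zeros(dim)     # accumulated reward-weighted contexts
    total_reward = 0.0
    for t in range(horizon):
        if t > 0 and t % restart_every == 0:
            # Restart: forget the past so the estimator can track model drift.
            gram = reg * np.eye(dim)
            moment = np.zeros(dim)
        theta_hat = np.linalg.solve(gram, moment)      # current parameter estimate
        arms = np.asarray(contexts_fn(t))              # candidate features, shape (K, dim)
        gram_inv = np.linalg.inv(gram)
        bonus = np.sqrt(np.einsum("ki,ij,kj->k", arms, gram_inv, arms))
        chosen = arms[int(np.argmax(arms @ theta_hat + beta * bonus))]
        reward = reward_fn(t, chosen)                  # possibly heavy-tailed payoff
        gram += np.outer(chosen, chosen)
        moment += reward * chosen
        total_reward += reward
    return total_reward
```

As a quick sanity check under these assumptions, one could feed synthetic contexts with Student-t reward noise and an abrupt change of the regression parameter, and compare the restarted loop against a never-restarted one; the restart lets the estimator recover after the change, while median_of_means illustrates how block averaging damps heavy-tailed outliers.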