ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development, 2019, Vol. 56, Issue 3: 643-654. doi: 10.7544/issn1000-1239.2019.20180019


An Adaptive Algorithm in Multi-Armed Bandit Problem

Zhang Xiaofang1,2, Zhou Qian1, Liang Bin1, Xu Jin1   

  1. School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006; 2. State Key Laboratory for Novel Software Technology (Nanjing University), Nanjing 210023
  • Online: 2019-03-01

Abstract: Reinforcement learning, an active field of machine learning, has received extensive attention in recent years. The multi-armed bandit (MAB) problem is a canonical instance of the exploration and exploitation dilemma in reinforcement learning, and the stochastic multi-armed bandit (SMAB) problem is the classical variant on which many newer MAB problems build. To address the insufficient use of feedback information and the poor generalization ability of existing MAB methods, this paper presents an adaptive SMAB algorithm that balances exploration and exploitation based on the chosen number of the arm with minimal estimation, abbreviated CNAME. CNAME uses both the number of times an action has been chosen and its estimated value, so that actions are selected according to an exploration probability that is updated adaptively. A parameter w is introduced to control the decay rate of the exploration probability, adjusting how strongly feedback influences the selection process. Furthermore, CNAME does not depend on contextual information and therefore generalizes better. An upper bound on CNAME's regret is proved and analyzed theoretically. Experimental results in different scenarios show that CNAME achieves higher reward and lower regret with high efficiency compared with commonly used methods, and that its generalization ability is strong.
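The abstract's core idea, an exploration probability that adapts to how often the arm with the lowest value estimate has been chosen, can be sketched as follows. This is a minimal illustrative agent, not the paper's actual CNAME algorithm: the exact update rule for the exploration probability is not given in the abstract, so the formula eps = w / (w + n_min) below, the class name, and the Bernoulli test bed are all assumptions made for illustration.

```python
import random


class AdaptiveBandit:
    """Sketch of an adaptive epsilon-greedy SMAB agent, loosely following
    the abstract: the exploration probability decays with the pull count
    of the arm whose value estimate is currently minimal, and the
    parameter w controls the decay rate (assumed form, not the paper's)."""

    def __init__(self, n_arms, w=1.0):
        self.w = w
        self.counts = [0] * n_arms     # times each arm was chosen
        self.values = [0.0] * n_arms   # running mean reward per arm

    def select(self):
        arms = range(len(self.values))
        # Pull count of the arm with the minimal estimated value.
        n_min = self.counts[min(arms, key=lambda a: self.values[a])]
        # Assumed adaptive exploration probability: decays as n_min grows,
        # with w adjusting the influence of feedback on the decay rate.
        eps = self.w / (self.w + n_min)
        if random.random() < eps:
            return random.randrange(len(self.counts))  # explore
        return max(arms, key=lambda a: self.values[a])  # exploit

    def update(self, arm, reward):
        self.counts[arm] += 1
        # Incremental update of the sample-mean estimate.
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


if __name__ == "__main__":
    random.seed(0)
    probs = [0.2, 0.5, 0.8]           # hypothetical Bernoulli arms
    agent = AdaptiveBandit(len(probs), w=2.0)
    for _ in range(2000):
        arm = agent.select()
        agent.update(arm, 1.0 if random.random() < probs[arm] else 0.0)
```

Note that, as described in the abstract, the agent uses only its own pull counts and reward estimates, no contextual features, which is what gives this style of method its generality across SMAB settings.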

Key words: reinforcement learning, multi-armed bandit, exploration and exploitation, adaptation, contextual
