Publication. Preprint. 2017

Robust Contextual Bandit via the Capped-$\ell_{2}$ norm

Zhu, Feiyun; Zhu, Xinliang; Wang, Sheng; Yao, Jiawen; Huang, Junzhou
Open Access. Language: English
Published: 17 Aug 2017
This paper considers the actor-critic contextual bandit for mobile health (mHealth) interventions. State-of-the-art decision-making methods in mHealth generally assume that the noise in the dynamic system follows a Gaussian distribution. These methods estimate the expected reward with least-squares-based algorithms, which are sensitive to outliers. To address this issue, we propose a novel robust actor-critic contextual bandit method for mHealth interventions. In the critic update, the capped-$\ell_{2}$ norm is used to measure the approximation error, which prevents outliers from dominating the objective. A set of weigh...
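
To illustrate why a capped error measure limits the influence of outliers, the following is a minimal sketch rather than the authors' implementation. It assumes one common form of the capped-$\ell_{2}$ penalty, $\min(r_i^2, \epsilon)$ applied to each residual $r_i$; the exact objective used in the paper is not stated in this abstract. The function names, the threshold `cap`, and the residual values are hypothetical and chosen only for illustration.

```python
import numpy as np


def squared_loss(residuals):
    """Ordinary least-squares loss: sum of squared residuals."""
    return float(np.sum(np.square(residuals)))


def capped_l2_loss(residuals, cap):
    """Capped loss (assumed form): each squared residual contributes at most `cap`."""
    return float(np.sum(np.minimum(np.square(residuals), cap)))


# Hypothetical residuals: three small errors and one large outlier.
residuals = np.array([0.1, -0.2, 0.1, 10.0])

plain = squared_loss(residuals)            # dominated by the outlier's 10**2 term
capped = capped_l2_loss(residuals, 1.0)    # the outlier contributes only the cap

print(f"squared loss: {plain:.2f}")
print(f"capped loss:  {capped:.2f}")
```

Under the plain squared loss the single outlier accounts for almost the entire objective, whereas under the capped form its contribution is bounded by `cap`, so the well-behaved samples still shape the fit.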
Free-text keywords: Computer Science - Learning, Statistics - Machine Learning
