Bayesian models and inference for reinforcement learning: the multi-armed bandit case