Decentralized Learning for Multi-player Multi-armed Bandits
We consider the problem of distributed online learning with multiple players in multi-armed bandit (MAB) models. Each player can pick among multiple arms. When a player picks an arm, it receives a reward. We consider both an i.i.d. reward model and a Markovian reward model.
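The setup can be illustrated with a minimal simulation sketch of the i.i.d. case. The abstract does not specify reward distributions, player policies, or what happens when two players pick the same arm, so everything here is an assumption made for illustration: Bernoulli rewards, players choosing arms uniformly at random, and colliding players receiving zero reward. The names `pull`, `play_round`, and `simulate` are hypothetical.

```python
import random


def pull(arm_means, arm, rng):
    """Draw an i.i.d. Bernoulli reward for the chosen arm (assumed reward model)."""
    return 1.0 if rng.random() < arm_means[arm] else 0.0


def play_round(arm_means, choices, rng):
    """Return one reward per player for the given arm choices.

    Assumption (not stated in the abstract): players that pick the same
    arm collide, and every colliding player receives zero reward.
    """
    counts = {}
    for arm in choices:
        counts[arm] = counts.get(arm, 0) + 1
    return [
        pull(arm_means, arm, rng) if counts[arm] == 1 else 0.0
        for arm in choices
    ]


def simulate(arm_means, num_players, horizon, seed=0):
    """Run `horizon` rounds in which each player picks an arm uniformly at random.

    Uniform random play is only a placeholder policy; it is not a learning
    algorithm and is not taken from the paper.
    """
    rng = random.Random(seed)
    totals = [0.0] * num_players
    for _ in range(horizon):
        choices = [rng.randrange(len(arm_means)) for _ in range(num_players)]
        for player, reward in enumerate(play_round(arm_means, choices, rng)):
            totals[player] += reward
    return totals
```

Calling `simulate([0.9, 0.5, 0.2], num_players=2, horizon=1000)` returns each player's cumulative reward. A decentralized learning policy would replace the uniform random choice with a rule that uses only each player's own observations.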
- Year: 2012