Decentralized Learning for Multi-player Multi-armed Bandits

We consider the problem of distributed online learning with multiple players in multi-armed bandit (MAB) models. Each player can pick among multiple arms, and receives a reward whenever it picks one. We consider both an i.i.d. reward model and a Markovian reward model.

Year
2012

Abstract & full text
arxiv.org/abs/1206.3582