Optimal cross-learning for contextual bandits with unknown context distributions

We consider the problem of designing contextual bandit algorithms in the "cross-learning" setting of Balseiro et al., where the learner observes the loss for the action they play in all possible contexts, not just the context of the current round.
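To make the feedback model concrete, here is a minimal sketch contrasting standard contextual-bandit feedback with cross-learning feedback. It illustrates only the observation structure described above, not the algorithm from the paper. All names (`bandit_feedback`, `cross_learning_feedback`, `simulate_round`) are hypothetical, and the uniform draws for contexts and losses are assumptions made purely for illustration.

```python
import random


def bandit_feedback(losses, context, action):
    """Standard contextual-bandit feedback: a single loss value,
    for the played action in the current context only."""
    return losses[context][action]


def cross_learning_feedback(losses, action):
    """Cross-learning feedback: the loss of the played action
    in every context, indexed by context."""
    return [row[action] for row in losses]


def simulate_round(rng, num_contexts, num_actions, policy):
    """Simulate one round. losses[c][a] is the loss of action a in context c.

    Assumption for illustration: losses and the context are drawn
    uniformly at random; the setting itself does not require this.
    """
    losses = [[rng.random() for _ in range(num_actions)]
              for _ in range(num_contexts)]
    context = rng.randrange(num_contexts)
    action = policy(context)
    observed = cross_learning_feedback(losses, action)
    return context, action, losses, observed
```

With `K` actions and `C` contexts, the learner sees `C` numbers per round instead of one, but still only for the single action it played; the losses of the other `K - 1` actions remain hidden.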

Year
2024
Hosting
External source (license unknown)

Attribution

Abstract & full text
arxiv.org/abs/2401.01857v1