Mind2Web GRPO
Fresh
Mind2Web web-action prediction with swappable binary vs shaped GRPO reward
- Type
- RL Env
- Runtime
- multi-turn
- License
- unknown
- Size
- v0.1.1
- Published
- Jun 2026