Mind2Web GRPO

Mind2Web web-action prediction with a swappable GRPO reward: binary or shaped
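
To make the binary-versus-shaped distinction concrete, here is a minimal Python sketch. It is not this environment's actual code: the `Action` record, the field weights (0.5 / 0.25 / 0.25), the `REWARDS` registry, and the `group_advantages` helper are all hypothetical names and values chosen for illustration. The only assumptions taken from the description are that the task is predicting a web action and that the reward function can be swapped between a binary and a shaped variant. The group normalization shown is the usual GRPO-style mean/std normalization over one sampling group.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Action:
    """Hypothetical action record: target element, operation, optional typed value."""
    element_id: str
    op: str                      # e.g. "CLICK", "TYPE", "SELECT"
    value: Optional[str] = None


def binary_reward(pred: Action, gold: Action) -> float:
    """1.0 only when element, operation, and value all match exactly."""
    return 1.0 if pred == gold else 0.0


def shaped_reward(pred: Action, gold: Action) -> float:
    """Partial credit per matching field; the weights are illustrative assumptions."""
    score = 0.0
    if pred.element_id == gold.element_id:
        score += 0.5
    if pred.op == gold.op:
        score += 0.25
    if pred.value == gold.value:
        score += 0.25
    return score


# Swapping the reward is a dictionary lookup (hypothetical registry).
REWARDS = {"binary": binary_reward, "shaped": shaped_reward}


def group_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """GRPO-style advantages: normalize each reward against its own sampling group."""
    mean = sum(rewards) / len(rewards)
    std = (sum((r - mean) ** 2 for r in rewards) / len(rewards)) ** 0.5
    return [(r - mean) / (std + eps) for r in rewards]


# One gold action and a group of three sampled predictions.
gold = Action("btn-42", "CLICK")
group = [
    Action("btn-42", "CLICK"),      # fully correct
    Action("btn-42", "TYPE", "x"),  # right element, wrong operation and value
    Action("btn-7", "CLICK"),       # wrong element, right operation
]

binary_scores = [REWARDS["binary"](p, gold) for p in group]  # [1.0, 0.0, 0.0]
shaped_scores = [REWARDS["shaped"](p, gold) for p in group]  # [1.0, 0.5, 0.5]
shaped_adv = group_advantages(shaped_scores)
```

Under these assumed weights, the binary reward separates only the exact match from everything else, while the shaped reward still gives near-misses a nonzero score, which changes the group-relative advantages the policy is trained on.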

Type: RL Env
Runtime: multi-turn
License: unknown
Size: v0.1.1
Published: Jun 2026


Contributors

1