Lisanbench RL Env (Community)
Single-turn evaluation in which the model must generate, from a given starting word, the longest valid chain of words where each word differs from the previous one by a single-letter edit. The final score is the sum of the longest valid chain lengths across all starting words.
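The scoring rule described above can be illustrated with a minimal sketch. It is not the environment's actual implementation. It assumes that an "edit" means a single-letter insertion, deletion, or substitution (Levenshtein distance 1), that a word is valid only if it is in a reference vocabulary and has not been used before, that the starting word counts toward the chain length, and that the "longest valid chain" is the longest contiguous run of valid links. The names `is_single_edit`, `longest_valid_chain`, and `total_score` are hypothetical.

```python
def is_single_edit(a: str, b: str) -> bool:
    """Return True if b is exactly one letter insertion, deletion,
    or substitution away from a (Levenshtein distance 1)."""
    if a == b or abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # ensure a is the shorter (or equal-length) word
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1  # skip the common prefix
    if len(a) == len(b):
        return a[i + 1:] == b[i + 1:]  # one substitution
    return a[i:] == b[i + 1:]  # one insertion into a


def longest_valid_chain(start: str, words: list[str], vocabulary: set[str]) -> int:
    """Length of the longest contiguous run of valid links (assumed rule)."""
    seen = {start}
    prev = start
    run = best = 1  # assumption: the starting word counts as length 1
    for word in words:
        usable = word in vocabulary and word not in seen
        if usable and prev is not None and is_single_edit(prev, word):
            run += 1  # valid link extends the current run
        elif usable:
            run = 1  # valid word but broken link: start a new run
        else:
            run = 0  # invalid or repeated word breaks the chain
        prev = word if usable else None
        seen.add(word)
        best = max(best, run)
    return best


def total_score(chains: dict[str, list[str]], vocabulary: set[str]) -> int:
    """Sum the longest valid chain length over all starting words."""
    return sum(longest_valid_chain(s, w, vocabulary) for s, w in chains.items())


# Toy vocabulary and model outputs, for illustration only.
vocab = {"cat", "cot", "cog", "dog", "dot", "coat"}
outputs = {"cat": ["cot", "cog", "dog"], "dog": ["dot", "cot", "coat"]}
print(total_score(outputs, vocab))
```

A real run would replace the toy vocabulary with the environment's own word list, and the environment may count chain length or handle invalid words differently than assumed here.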
- Type
- RL Env
- Tags
- Word Game
- Runtime
- single-turn
- License
- unknown
- Size
- v0.1.1
- Published
- Aug 2025
Public scores on this env
12 vf-eval reports across 1 model