Lisanbench RL Env (Community)

Single-turn evaluation in which the model must generate the longest valid chain of words from a given starting word, where each word is a single edit away from the previous one. The final score is the sum of the lengths of the longest valid chains across all starting words.
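
The scoring rule described above can be sketched roughly as follows. This is a minimal illustration, not the environment's actual implementation. It assumes that a "single edit" means one letter inserted, deleted, or substituted, that a chain may not repeat a word, that validity is checked against a supplied word list, and that a chain is counted up to its first invalid step. The names `is_single_edit`, `valid_chain_length`, and `total_score` are hypothetical.

```python
def is_single_edit(a: str, b: str) -> bool:
    """True if a and b differ by exactly one insertion, deletion, or substitution."""
    if a == b or abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make a the shorter (or equal-length) word
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1  # skip the common prefix
    if len(a) == len(b):
        return a[i + 1:] == b[i + 1:]  # one substituted letter
    return a[i:] == b[i + 1:]  # one extra letter in b


def valid_chain_length(start: str, chain: list[str], vocabulary: set[str]) -> int:
    """Count chain words up to the first invalid step (assumed validity rule)."""
    seen = {start}
    previous = start
    length = 0
    for word in chain:
        # Assumed rules: no repeats, word must be in the vocabulary, one edit per step.
        if word in seen or word not in vocabulary or not is_single_edit(previous, word):
            break
        seen.add(word)
        previous = word
        length += 1
    return length


def total_score(responses: dict[str, list[str]], vocabulary: set[str]) -> int:
    """Sum the valid chain lengths over all starting words."""
    return sum(valid_chain_length(s, c, vocabulary) for s, c in responses.items())


# Toy data for illustration only; not taken from the environment.
vocabulary = {"cat", "cot", "cog", "dog", "dot", "coat"}
responses = {
    "cat": ["cot", "cog", "dog"],  # three valid steps
    "dot": ["cot", "cat", "bat"],  # "bat" is not in the toy vocabulary
}
print(total_score(responses, vocabulary))  # 5
```

Counting only up to the first invalid step is one plausible reading of "longest valid chain"; an implementation could instead take the longest valid contiguous segment, which would change the scores for responses that contain an early mistake.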

Type: RL Env
Runtime: single-turn
License: unknown
Size: v0.1.1
Published: Aug 2025

Public scores on this env

1

2 vf-eval reports across 1 model

Contributors

1