Litbench RL Env (Community)
Fresh
Literary evaluation benchmark using pairwise comparison with self-critique prompting
- Type
- RL Env
- Runtime
- single-turn
- License
- unknown
- Size
- v0.1.4
- Published
- Oct 2025
Public scores on this env
78 vf-eval reports across 7 models
1. GLM 4.5 Air (Zai): 80.0%
2. gpt-oss-20b (OpenAI, disputed): 66.7%
3. Nemotron Nano 9B V2 (NVIDIA): 62.5%
4. Llama 3.3 Instruct 70B (Meta Platforms): 50.0%
5. Kimi Dev 72B (Moonshot AI): 50.0%
6. DeepSeek R1t2 Chimera (DeepSeek): 37.5%
7. Deephermes 3 Llama 3 8B Preview: 12.5%