Litbench RL Env (Community)

Literary evaluation benchmark using pair-wise comparison with self-critique prompting
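
As a rough illustration of how a pair-wise comparison benchmark of this kind can be scored, the sketch below parses a judge model's verdict from its critique and compares it with the human-preferred story. The "Preferred: A/B" output format and the function names parse_choice, pairwise_reward, and accuracy are hypothetical and are not taken from the LitBench environment itself, whose actual prompt and parser may differ.

```python
import re
from typing import Optional, Sequence


def parse_choice(completion: str) -> Optional[str]:
    """Extract the final verdict from a critique-then-choose completion.

    Assumption: the judge ends its critique with a line such as
    'Preferred: A' or 'Preferred: B'. This format is hypothetical.
    """
    matches = re.findall(r"Preferred:\s*([AB])\b", completion)
    # Use the last match so a verdict revised during self-critique wins.
    return matches[-1] if matches else None


def pairwise_reward(completion: str, gold: str) -> float:
    """Return 1.0 if the parsed verdict equals the gold label, else 0.0."""
    return 1.0 if parse_choice(completion) == gold else 0.0


def accuracy(completions: Sequence[str], golds: Sequence[str]) -> float:
    """Mean pair-wise reward over a batch of single-turn rollouts."""
    rewards = [pairwise_reward(c, g) for c, g in zip(completions, golds)]
    return sum(rewards) / len(rewards) if rewards else 0.0
```

Under this scheme an unparseable completion scores 0.0 rather than raising, so a model that never commits to a verdict is counted as wrong on that pair.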

Type: RL Env
Tags: Genrm, Pair-Wise, Literary, Self-Critique, Creativity
Runtime: single-turn
License: unknown
Version: v0.1.4
Published: Oct 2025
Canonical: app.primeintellect.ai/dashboard/environments/dmnsh001/LitBench

Attribution

README: api.primeintellect.ai/api/v1/environmentshub/dmnsh001/LitBench/@0.1.4/inspect
Scores: prime-hub

Public scores on this env

8 vf-eval reports across 7 models

1. GLM 4.5 Air (Zai): 80.0%
2. gpt-oss-20b (OpenAI): 66.7%
3. Nemotron Nano 9B V2 (NVIDIA): 62.5%
4. Llama 3.3 Instruct 70B (Meta Platforms): 50.0%
5. Kimi Dev 72B (Moonshot AI): 50.0%
6. DeepSeek R1t2 Chimera (DeepSeek): 37.5%
7. Deephermes 3 Llama 3 8B Preview: 12.5%

Contributors: 1