Thematic Generalization RL Env (Prime Intellect)
This benchmark measures how well LLMs can infer a narrow "theme" (a category or rule) from a small set of examples and anti-examples, and then pick out the one item that truly fits that theme from a set of misleading candidates.
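The task format described above can be sketched roughly as follows. This is a minimal illustration only, not the environment's actual code: the `ThemeItem` class, the prompt wording, the sample data, and the exact-match 0/1 scoring are all assumptions made for the example.

```python
import re
from dataclasses import dataclass


@dataclass
class ThemeItem:
    """Hypothetical container for one single-turn task instance."""
    examples: list[str]        # items that fit the hidden theme
    anti_examples: list[str]   # near misses that do not fit
    candidates: list[str]      # exactly one of these fits the theme
    answer_index: int          # 0-based index of the fitting candidate


def build_prompt(item: ThemeItem) -> str:
    """Render the item as a single-turn prompt (wording is an assumption)."""
    lines = ["Examples that fit the hidden theme:"]
    lines += [f"- {e}" for e in item.examples]
    lines.append("Anti-examples that do NOT fit the theme:")
    lines += [f"- {a}" for a in item.anti_examples]
    lines.append("Candidates:")
    lines += [f"{i}. {c}" for i, c in enumerate(item.candidates, start=1)]
    lines.append("Reply with the number of the single candidate that fits.")
    return "\n".join(lines)


def score_reply(item: ThemeItem, reply: str) -> float:
    """Return 1.0 if the first integer in the reply names the right candidate."""
    match = re.search(r"\d+", reply)
    if match is None:
        return 0.0
    return 1.0 if int(match.group()) - 1 == item.answer_index else 0.0


# Made-up item: the hidden theme is "red fruits".
item = ThemeItem(
    examples=["strawberry", "cherry"],
    anti_examples=["banana", "fire truck"],
    candidates=["lime", "raspberry", "stop sign"],
    answer_index=1,
)
prompt = build_prompt(item)
```

Here the anti-examples rule out the broader themes "fruit" and "red things", which is what makes the distractor candidates misleading. A real harness would send `prompt` to a model and pass the completion to `score_reply`.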
- Type: RL Env
- Publisher: Prime Intellect
- Runtime: single-turn
- License: unknown
- Size: v0.1.0
- Published: Sep 2025
Attribution
- README: api.primeintellect.ai/api/v1/environmentshub/primeintellect/thematic-generalization/@0.1.0/inspect
- Scores: prime-hub
Public scores on this env
12 vf-eval reports across 1 model