Thematic Generalization RL Env (Prime Intellect)

This benchmark measures how well LLMs can infer a narrow "theme" (a category or rule) from a small set of examples and anti-examples, then pick the one item that truly fits that theme from a set of misleading candidates.

Type: RL Env
Publisher: Prime Intellect
Tags: Reasoning, Single Choice
Runtime: single-turn
License: unknown
Version: v0.1.0
Published: Sep 2025
Canonical: app.primeintellect.ai/dashboard/environments/primeintellect/thematic-generalization

Attribution

README: api.primeintellect.ai/api/v1/environmentshub/primeintellect/thematic-generalization/@0.1.0/inspect
Scores: prime-hub

Public scores on this env

2 vf-eval reports across 1 model

Rank 1: gpt-oss-120b (OpenAI), 65.6%
