anusha

anusha is an RL env contributor.

Role: RL env contributor

Attribution

Affiliations & profile: Semantic Scholar

2 tool contribs

Tool contributions

Sophistry Bench RL Env (Community)

RL environment for asymmetric-information debate with a sophistry-decomposed verifier

RL Env, Reasoning, Multi Agent, Scalable Oversight

Sophistry Bench Sprint

Single-agent advocacy variant of sophistry-bench for the Prime Intellect Reward Hacking Sprint. Pre-registered hypothesis: training Llama-3.2-1B on...

RL Env, Reward Hacking, Scalable Oversight, Debate

Affiliations

No known affiliations.