Oab Bench

Saturated

A benchmark that evaluates LLMs on the Brazilian Bar Examination, using a multi-judge system.

Domain: rl-env
License: unknown
Published: Sep 2025

Attribution

Leaderboard scores: prime-hub

Top score: 96.5% by GPT-4.1 Mini (1 model reporting, 1 from frontier labs)

Top models

[Bar chart: Oab Bench scores, 1 model. Highest value: GPT-4.1 Mini at 96.5.]

Related tools

Implementations, trainers, datasets and scaffolds linked to this eval.

FAQ

What is Oab Bench?
Oab Bench is a benchmark that evaluates LLMs on the Brazilian Bar Examination, using a multi-judge system.
What is the current top score on Oab Bench?
The top reported score is 96.5% by GPT-4.1 Mini, across 1 model reporting (1 from frontier labs).
How can a model improve its Oab Bench score?
Tools linked to Oab Bench on Sophon include OAB Bench RL Env (Kunumi). Linked tools cover RL environments, datasets, and scaffolds that target this eval.
What license is Oab Bench under?
The license for Oab Bench is unknown.
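
How might a multi-judge system score an answer?
This page does not describe how Oab Bench combines its judges, so the following is only a minimal sketch under stated assumptions: each judge returns a score between 0 and 1, and the final score is the plain average. The names `Judge`, `multi_judge_score`, `strict_judge`, and `lenient_judge` are hypothetical and are not taken from the benchmark.

```python
from statistics import mean
from typing import Callable, Sequence

# Hypothetical judge type: takes (question, answer), returns a score in [0, 1].
Judge = Callable[[str, str], float]


def multi_judge_score(question: str, answer: str, judges: Sequence[Judge]) -> float:
    """Average the scores of several independent judges.

    Averaging is an assumption made for illustration; the benchmark
    may aggregate its judges differently.
    """
    if not judges:
        raise ValueError("at least one judge is required")
    return mean(judge(question, answer) for judge in judges)


# Toy stand-in judges; a real setup would presumably call LLM graders.
def strict_judge(question: str, answer: str) -> float:
    # Awards credit only if the expected term appears in the answer.
    return 1.0 if "habeas corpus" in answer.lower() else 0.0


def lenient_judge(question: str, answer: str) -> float:
    # Awards credit for any non-empty answer.
    return 1.0 if answer.strip() else 0.0


judges = [strict_judge, lenient_judge]
question = "Which remedy protects freedom of movement?"
full_credit = multi_judge_score(question, "Habeas corpus.", judges)
partial_credit = multi_judge_score(question, "Some other remedy.", judges)
```

Using several judges reduces the influence of any single grader's bias; a median or majority vote would be a reasonable alternative to the mean when judges can disagree sharply.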