Medcasereasoning

MedCaseReasoning medical diagnosis evaluation

Domain: rl-env
License: unknown
Published: Sep 2025

Attribution

Leaderboard scores: prime-hub

Top score: 26.7% by GPT-5 Nano (3 models reporting, 1 from frontier labs)

Score history

[Chart: score over time. Vertical axis 0% to 100%; horizontal axis Sep 24 to Jun 25. Models shown: Qwen2.5-7B, GPT-5 Nano.]

Top models

[Bar chart: 3 models. Highest value: GPT-5 Nano at 26.7.]

Related tools

Implementations, trainers, datasets and scaffolds linked to this eval.

FAQ

What is Medcasereasoning?
MedCaseReasoning medical diagnosis evaluation
What is the current top score on Medcasereasoning?
The top reported score is 26.7% by GPT-5 Nano, across 3 models reporting (1 from frontier labs).
How can a model improve its Medcasereasoning score?
Sophon links tools that target this eval (RL environments, datasets, and scaffolds), including Medcasereasoning RL Env (Medarc).
What license is Medcasereasoning under?
The license for Medcasereasoning is listed as unknown.