Medredqa

Frontier

MedRedQA medical question answer evaluation

Domain
rl-env
License
unknown
Published
Sep 2025

Attribution

Leaderboard scores
prime-hub

Top score 77.9% by GPT-4.1 Mini - 3 models reporting (3 frontier)

Score history

[Chart: score history for 3 models (GPT-4o-mini, GPT-4.1, GPT-4.1 Mini); y-axis 55% to 100%, x-axis Jul 24 to Mar 25]

Top models

[Bar chart: 3 models. Highest value: GPT-4.1 Mini at 77.9.]

Related tools

1 linked tool

Implementations, trainers, datasets and scaffolds linked to this eval.

FAQ

What is Medredqa?
Medredqa is the MedRedQA medical question-answering evaluation.
What is the current top score on Medredqa?
The top reported score is 77.9% by GPT-4.1 Mini, across 3 models reporting (3 from frontier labs).
How can a model improve its Medredqa score?
Tools linked to Medredqa on Sophon include Medredqa RL Env (Community). Linked tools cover RL environments, datasets, and scaffolds that target this eval.
What license is Medredqa under?
The license for Medredqa is listed as unknown.