Metamedqa RL Env (Medarc)

MetaMedQA medical MCQ evaluation

Type: RL Env
Publisher: Medarc
Runtime: single-turn
License: unknown
Size: v0.1.1
Published: Sep 2025

MetaMedQA

Evaluation environment for the MetaMedQA dataset.

Overview

  • Environment ID: metamedqa
  • Short description: Single-turn medical multiple-choice QA drawn from multiple medical exam sources
  • Tags: medical, single-turn, multiple-choice, eval

Datasets

  • Primary dataset(s): MetaMedQA
  • Source links: maximegmd/MetaMedQA
  • Split sizes: Uses the dataset's provided test split

Task

  • Type: single-turn
  • Rubric overview: Binary scoring (1.0 / 0.0) based on correct letter or answer text match
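
The binary rubric above can be sketched as follows. This is a minimal illustration, not the environment's actual implementation: the function names `normalize` and `score_response`, the accepted response formats, and the text-normalization rules are all assumptions made for the example.

```python
import re


def normalize(text: str) -> str:
    """Lowercase, collapse whitespace, and drop surrounding periods (assumed rules)."""
    return re.sub(r"\s+", " ", text).strip().strip(".").lower()


def score_response(response: str, gold_letter: str, gold_text: str) -> float:
    """Hypothetical binary reward: 1.0 on a letter or answer-text match, else 0.0."""
    text = response.strip()
    # Accept short forms such as "B", "(B)", "B.", "Answer: B", or "answer is B".
    match = re.fullmatch(
        r"(?:answer\s*(?:is|:)?\s*)?\(?([A-Za-z])\)?[.:]?",
        text,
        flags=re.IGNORECASE,
    )
    if match:
        return 1.0 if match.group(1).upper() == gold_letter.upper() else 0.0
    # Otherwise fall back to comparing the normalized answer text.
    return 1.0 if normalize(text) == normalize(gold_text) else 0.0
```

For example, `score_response("Answer: c", "C", "Aspirin")` and `score_response("aspirin.", "B", "Aspirin")` both score 1.0 under these assumptions, while `score_response("(A)", "B", "Aspirin")` scores 0.0.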

Quickstart

Run an evaluation with default settings:

prime eval run metamedqa -m "openai/gpt-5-mini" -n 5 -s

Configure model and sampling:

medarc-eval metamedqa -m "openai/gpt-5-mini" -n 20 --shuffle-answers --shuffle-seed 1618

Environment Arguments

| Arg | Type | Default | Description |
| --- | --- | --- | --- |
| split | str | "test" | Dataset split to use |
| shuffle_answers | bool | False | Whether to shuffle answer choices |
| shuffle_seed | int \| None | 1618 | Seed for deterministic answer shuffling |
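
To illustrate what seeded answer shuffling involves, here is a minimal sketch. It assumes the options arrive as a letter-to-text mapping and that a single seed drives the shuffle; the helper `shuffle_options` is hypothetical, and the environment may seed or relabel differently (for example, per question).

```python
import random
from typing import Dict, Optional, Tuple

LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def shuffle_options(
    options: Dict[str, str], gold_letter: str, seed: Optional[int] = 1618
) -> Tuple[Dict[str, str], str]:
    """Hypothetical helper: permute answer texts and return the relabeled gold letter."""
    keys = list(options.keys())
    texts = [options[k] for k in keys]
    gold_index = keys.index(gold_letter)

    # A dedicated Random instance keeps the permutation reproducible for a
    # given seed without touching the global random state.
    rng = random.Random(seed)
    order = list(range(len(texts)))
    rng.shuffle(order)

    # Position i in the new ordering receives the text originally at order[i].
    shuffled = {LETTERS[i]: texts[src] for i, src in enumerate(order)}
    new_gold = LETTERS[order.index(gold_index)]
    return shuffled, new_gold
```

Tracking the gold answer by index rather than by text keeps the relabeling correct even when two options share the same wording.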

Metrics

| Metric | Meaning |
| --- | --- |
| accuracy | (weight 1.0): 1.0 if parsed letter matches the gold letter, else 0.0 |

Authors

This environment was put together by:

Aymane Ouraq - (@aymaneo)