clinical-diagnosis-differential
This environment tests an agent's ability to perform differential diagnosis by iteratively gathering patient information and consulting medical knowledge bases. The agent must accurately identify the most likely condition from a set of possibilities.
Overview
- Domain: medicine
- Base Class: StatefulToolEnv
- Difficulty: medium
- Task: The model must interact with a simulated patient and medical tools to gather information, formulate a differential diagnosis, and ultimately identify the correct primary diagnosis.
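To make the task concrete, here is a minimal sketch of the kind of reasoning loop it describes: start with a uniform belief over candidate conditions, then narrow it as findings come in. This is not the environment's implementation. The condition names, finding names, likelihood values, and the `update` and `diagnose` helpers are all hypothetical and chosen purely for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical table of P(finding present | condition).
# All names and numbers are made up for illustration only.
LIKELIHOODS: Dict[str, Dict[str, float]] = {
    "condition_a": {"finding_x": 0.9, "finding_y": 0.2},
    "condition_b": {"finding_x": 0.3, "finding_y": 0.8},
    "condition_c": {"finding_x": 0.1, "finding_y": 0.1},
}


def update(posterior: Dict[str, float], finding: str, present: bool) -> Dict[str, float]:
    """Apply one Bayesian update for a single observed finding."""
    unnormalized = {}
    for condition, prior in posterior.items():
        likelihood = LIKELIHOODS[condition][finding]
        # Use the complement when the finding is reported absent.
        unnormalized[condition] = prior * (likelihood if present else 1.0 - likelihood)
    total = sum(unnormalized.values())
    return {condition: value / total for condition, value in unnormalized.items()}


def diagnose(observations: List[Tuple[str, bool]]) -> Tuple[str, Dict[str, float]]:
    """Return the most likely condition and the full ranked belief."""
    # Start from a uniform prior over the candidate conditions.
    posterior = {condition: 1.0 / len(LIKELIHOODS) for condition in LIKELIHOODS}
    for finding, present in observations:
        posterior = update(posterior, finding, present)
    primary = max(posterior, key=posterior.get)
    return primary, posterior


primary, belief = diagnose([("finding_x", True), ("finding_y", False)])
print(primary, belief)
```

In the actual environment, the findings would presumably come from tool calls and patient responses rather than a fixed list, and the agent is free to reason in any way it likes.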
Quickstart
Installation
uv run vf-install clinical-diagnosis-differential
Usage
import verifiers as vf
env = vf.load_environment("clinical-diagnosis-differential")
results = env.evaluate_sync(
    client=vf.OpenAI(),
    model="gpt-4.1-mini",
    num_examples=10,
    rollouts_per_example=1,
)
Evaluation
Run an evaluation with default settings:
uv run vf-eval clinical-diagnosis-differential
Configure model and sampling:
uv run vf-eval clinical-diagnosis-differential \
  -m gpt-4.1-mini \
  -n 20 -r 3 -t 1024 -T 0.7
Environment Arguments
| Arg | Type | Default | Description |
|---|---|---|---|
| num_examples | int | 1000 | Number of training examples |
| num_eval_examples | int | 100 | Number of evaluation examples |
| seed | int | 42 | Random seed for reproducibility |
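The arguments in the table can be collected into a plain dictionary before they are handed to the environment. The sketch below merges overrides onto the documented defaults and rejects unknown names or non-integer values. The `build_env_args` helper is hypothetical and not part of the environment or of `verifiers`; only the argument names and defaults come from the table.

```python
import json

# Defaults copied from the Environment Arguments table.
DEFAULTS = {"num_examples": 1000, "num_eval_examples": 100, "seed": 42}


def build_env_args(**overrides: int) -> dict:
    """Merge overrides onto the documented defaults (hypothetical helper)."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown environment arguments: {sorted(unknown)}")
    for name, value in overrides.items():
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError(f"{name} must be an int")
    return {**DEFAULTS, **overrides}


env_args = build_env_args(num_examples=200, seed=7)
env_args_json = json.dumps(env_args)
print(env_args_json)
```

Assuming the environment accepts these as keyword arguments, the dictionary could be unpacked into `vf.load_environment("clinical-diagnosis-differential", **env_args)`. That call is an assumption based on the table, not something this README confirms.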
Metrics
| Metric | Meaning |
|---|---|
| reward | Primary reward signal |
| format_reward | Format adherence reward (if applicable) |
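When several rollouts are run per example, it can help to look at per-example averages alongside the overall mean. The following sketch assumes the rewards arrive as a flat list in which each example's rollouts are stored consecutively. That layout and the `summarize` helper are assumptions for illustration, not the documented output format.

```python
from statistics import mean
from typing import Dict, List, Union


def summarize(rewards: List[float], rollouts_per_example: int) -> Dict[str, Union[float, List[float]]]:
    """Average rewards overall and per example (hypothetical helper).

    Assumes each example's rollouts are stored consecutively in `rewards`.
    """
    if rollouts_per_example <= 0:
        raise ValueError("rollouts_per_example must be positive")
    if not rewards or len(rewards) % rollouts_per_example != 0:
        raise ValueError("rewards must be a non-empty multiple of rollouts_per_example")
    per_example = [
        mean(rewards[start:start + rollouts_per_example])
        for start in range(0, len(rewards), rollouts_per_example)
    ]
    return {"mean_reward": mean(rewards), "per_example": per_example}


summary = summarize([1.0, 0.0, 1.0, 1.0], rollouts_per_example=2)
print(summary)
```

The same helper could be applied to `format_reward` values if the environment reports them.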
About
Generated by synthetic-rl-env-creator.
Tags: medicine