Agentclinic
Multi-agent medical diagnosis environment for evaluating LLMs on clinical diagnosis through interactive conversations.
- Domain: rl-env
- License: unknown
- Published: Oct 2025
Top score: 16.7% by GPT-4o-mini (2 models reporting, 1 frontier)
Top models (2)
Related tools (1)
Implementations, trainers, datasets and scaffolds linked to this eval.
FAQ
- What is Agentclinic?
- Multi-agent medical diagnosis environment for evaluating LLMs on clinical diagnosis through interactive conversations.
- What is the current top score on Agentclinic?
- The top reported score is 16.7% by GPT-4o-mini, across 2 models reporting (1 from frontier labs).
- How can a model improve its Agentclinic score?
- Tools linked to Agentclinic on Sophon include Agentclinic RL Env (Community). Linked tools are RL environments, datasets, and scaffolds that target this eval.
- What license is Agentclinic under?
- No license is specified for Agentclinic; it is listed as unknown.
