Sctpublic RL Env (Medarc)

SCT-Bench Public Environment

Type: RL Env
Publisher: Medarc
Runtime: single-turn
License: unknown
Size: v0.1.0
Published: Feb 2026

sctpublic

Evaluation environment for the SCT-Bench public dataset.

Overview

  • Environment ID: sctpublic
  • Short description: Single-turn SCT dataset environment
  • Tags: medical, clinical, single-turn, eval

Datasets

Task

  • Type: Single-turn clinical reasoning evaluation
  • Rubric overview: Custom sct_rubric that normalizes the answer distribution so that the highest-scoring answer always receives a score of 1
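
The normalization described in the rubric overview can be sketched as follows. This is a minimal illustration, not the environment's actual sct_rubric. It assumes the "answer distribution" is a count of how many reference raters chose each option on a five-point scale (labels "-2" to "+2"), and the function names normalize_distribution and sct_score are hypothetical.

```python
from typing import Dict


def normalize_distribution(counts: Dict[str, int]) -> Dict[str, float]:
    """Scale rater counts so the most frequently chosen option scores 1.0.

    Assumption: `counts` maps each answer option to the number of
    reference raters who selected it.
    """
    peak = max(counts.values())
    if peak <= 0:
        raise ValueError("at least one option must have a positive count")
    return {option: count / peak for option, count in counts.items()}


def sct_score(answer: str, counts: Dict[str, int]) -> float:
    """Return the normalized credit for `answer`; unknown options score 0.0."""
    return normalize_distribution(counts).get(answer, 0.0)


# Hypothetical distribution for one item (invented for illustration).
example_counts = {"-2": 0, "-1": 1, "0": 2, "+1": 4, "+2": 3}
scores = normalize_distribution(example_counts)
```

Under these assumptions the modal option ("+1" in the invented example) receives full credit, and every other option receives credit in proportion to how many raters chose it.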

Environment Arguments

Arg        Type   Default   Description
reason     bool   False     If True, prompts include an explanation requirement
few_shot   bool   False     If True, includes 5 example ratings in the prompt
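
To make the effect of the two arguments concrete, here is a small sketch of how boolean flags like these could shape a prompt. The build_prompt function, its parameters, and all template wording are hypothetical and invented for illustration. The prompts actually used by sctpublic are not shown in this document.

```python
from typing import Iterable, Tuple


def build_prompt(
    question: str,
    reason: bool = False,
    few_shot: bool = False,
    examples: Iterable[Tuple[str, str]] = (),
) -> str:
    """Assemble a single-turn prompt (hypothetical template)."""
    parts = []
    if few_shot:
        # The argument table states that 5 example ratings are included.
        for example_question, example_rating in list(examples)[:5]:
            parts.append(f"Example: {example_question}\nRating: {example_rating}")
    parts.append(f"Question: {question}")
    if reason:
        # Assumed wording for the explanation requirement.
        parts.append("Explain your reasoning, then give your rating.")
    else:
        parts.append("Give your rating.")
    return "\n\n".join(parts)


# Invented placeholder examples, used only to exercise the sketch.
demo_examples = [(f"placeholder question {i}", "0") for i in range(7)]
default_prompt = build_prompt("placeholder question")
full_prompt = build_prompt(
    "placeholder question", reason=True, few_shot=True, examples=demo_examples
)
```

With both flags left at their default of False, the sketch produces only the question and the rating instruction. Enabling few_shot prepends up to five examples, and enabling reason swaps in the explanation instruction.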

Quickstart

Run an evaluation with default settings:

prime eval run sctpublic -m "openai/gpt-5-mini" -n 5 -s

Usage

To run an evaluation using medarc-eval with few-shot prompting and reasoning enabled:

medarc-eval sctpublic -m "openai/gpt-5-mini" -n 5 -s --reason --few-shot

Authors

This environment has been put together by:

Ratna Sagari Grandhi - (@sagarigrandhi)

Credits

Dataset:

@article{mccoy2025assessment,
  title={Assessment of large language models in clinical reasoning: a novel benchmarking study},
  author={McCoy, Liam G and Swamy, Rajiv and Sagar, Nidhish and Wang, Minjia and Bacchi, Stephen and Fong, Jie Ming Nigel and Tan, Nigel CK and Tan, Kevin and Buckley, Thomas A and Brodeur, Peter and others},
  journal={NEJM AI},
  volume={2},
  number={10},
  pages={AIdbp2500120},
  year={2025},
  publisher={Massachusetts Medical Society}
}