GSM8K environment

  • Type: RL Env
  • Publisher: Prime
  • Capabilities: Math
  • Tags: Gsm8k
  • Runtime: single-turn
  • License: unknown
  • Size: v0.2.0
  • Published: May 2026

gsm8k

Overview

  • Environment ID: gsm8k
  • Short description: Single-turn GSM8K math word problems answered with chain-of-thought (CoT) reasoning and a boxed numeric answer.
  • Tags: math, gsm8k, single-turn, think, boxed-answer

Datasets

  • Primary dataset(s): gsm8k train split (used for training) and test split (used for evaluation), loaded via load_example_dataset
  • Source links: Uses the example loader in verifiers.utils.data_utils
  • Split sizes: Full GSM8K train split (training) and full test split (evaluation)
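
For orientation, raw GSM8K solutions end with a line of the form `#### <number>` that carries the reference answer. The sketch below shows one way that target could be pulled out of a raw solution string. The helper name `extract_gsm8k_target` is hypothetical, and whether `load_example_dataset` performs this step internally is an assumption, not something stated here.

```python
def extract_gsm8k_target(solution: str) -> str:
    """Return the final answer that follows the '####' marker in a raw GSM8K solution.

    Hypothetical helper for illustration; not part of the environment's API.
    """
    # Split on the last '####' marker; sep is empty when the marker is missing.
    _, sep, tail = solution.rpartition("####")
    if not sep:
        raise ValueError("no '####' marker found in solution text")
    # Drop surrounding whitespace and thousands separators such as '1,000'.
    return tail.strip().replace(",", "")


example = (
    "Natalia sold 48/2 = 24 clips in May.\n"
    "In total she sold 48+24 = 72 clips.\n"
    "#### 72"
)
print(extract_gsm8k_target(example))
```

Stripping commas keeps the target comparable with a plain numeric string; any further normalization the environment applies is not covered by this sketch.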

Task

  • Type: single-turn
  • Scoring: Exact match on parsed \boxed{} answer
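
The scoring rule above can be sketched as follows. This is a minimal illustration, assuming the last `\boxed{...}` in the completion is the one that counts and that comparison is a plain string match after trimming whitespace. The function names are hypothetical, and the environment's actual parser may normalize answers differently.

```python
from typing import Optional


def extract_boxed(text: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...} in text, or None if absent."""
    start = text.rfind("\\boxed{")
    if start == -1:
        return None
    i = start + len("\\boxed{")
    depth = 1  # we are inside the opening brace of \boxed{
    chars = []
    while i < len(text):
        ch = text[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                # Matching close brace found: return what was collected.
                return "".join(chars).strip()
        chars.append(ch)
        i += 1
    return None  # braces never balanced


def exact_match_reward(completion: str, target: str) -> float:
    """Return 1.0 if the parsed boxed answer equals the target, else 0.0."""
    parsed = extract_boxed(completion)
    return 1.0 if parsed is not None and parsed == target.strip() else 0.0


print(exact_match_reward("48 + 24 = 72, so the answer is \\boxed{72}.", "72"))
print(exact_match_reward("The answer is 72.", "72"))
```

Tracking brace depth lets the parser handle nested braces such as `\boxed{\frac{1}{2}}`. A completion with no boxed answer scores 0.0 even if the correct number appears in the text.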

Quickstart

Run an evaluation with default settings:

prime eval run gsm8k

Configure model and sampling:

prime eval run gsm8k \
  -m gpt-4.1-mini \
  -n 20 -r 3 -t 1024 -T 0.7

Metrics

Metric    Meaning
reward    1.0 if parsed boxed answer equals target, else 0.0
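
Because the reward is binary, its mean over rollouts can be read as accuracy. The sketch below assumes rewards are grouped as one list per example with one entry per rollout. That layout, the function name, and the sample numbers are illustrative assumptions, not the tool's actual output format.

```python
from statistics import mean


def summarize_rewards(rewards_per_example):
    """Aggregate binary rewards into a mean accuracy and a solved-at-least-once count."""
    # Average over rollouts within each example, then over examples.
    per_example = [mean(rewards) for rewards in rewards_per_example]
    return {
        "accuracy": mean(per_example),
        "solved_at_least_once": sum(1 for r in rewards_per_example if max(r) == 1.0),
    }


# Made-up rewards for two examples with two rollouts each.
summary = summarize_rewards([[1.0, 1.0], [0.0, 1.0]])
print(summary)
```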