Nemotron-Science
Description
Nemotron-Science is a single-turn science reasoning environment based on NVIDIA's Nemotron-Science-v1 dataset. It contains two subsets:
- MCQ: 174,155 multiple-choice science questions across STEM domains including genetics, pharmacogenomics, synthetic biology, evolutionary biology, and more.
- RQA: 52,179 reasoning-based chemistry questions requiring mathematical derivations with boxed answers.
Capabilities
- Scientific reasoning across diverse STEM domains
- Multiple-choice question answering with structured option selection
- Open-ended mathematical/scientific problem solving (chemistry focus)
- Multi-step reasoning and derivation
License
Tasks
There are two splits in this environment:
| Split | Type | Tasks | Description |
|---|---|---|---|
| mcq | train | 174,155 | Multiple-choice science questions (GPQA-style) |
| rqa | train | 52,179 | Reasoning-based chemistry questions with boxed answers |
Reward Structure
This is a sparse, verifiable reward environment. Rewards are binary:
- MCQ: Exact letter match against the expected answer. Reward is 1.0 for correct, 0.0 for incorrect.
- RQA: LLM-graded equivalence check (gpt-5-mini) comparing submitted answers against expected values, allowing for equivalent notations and minor rounding differences. Reward is 1.0 for correct, 0.0 for incorrect.
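The MCQ rule above can be sketched in a few lines. This is an illustrative sketch, not the environment's actual grader: the function name `mcq_reward` is hypothetical, and ignoring surrounding whitespace and letter case is an assumption that the description does not confirm.

```python
def mcq_reward(submitted: str, expected: str) -> float:
    """Return 1.0 when the submitted letter matches the expected letter, else 0.0."""
    # Assumption: whitespace and case are normalized before comparing.
    # The real grader may be stricter (for example, case-sensitive).
    normalized_submitted = submitted.strip().upper()
    normalized_expected = expected.strip().upper()
    return 1.0 if normalized_submitted == normalized_expected else 0.0


if __name__ == "__main__":
    print(mcq_reward("C", "C"))  # matching letter
    print(mcq_reward("B", "C"))  # non-matching letter
```

Because the reward is binary, anything other than the bare letter (such as `C.` or `Option C`) scores 0.0 under this sketch.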
Data
Data is sourced from nvidia/Nemotron-Science-v1 on Hugging Face. The dataset consists of synthetic science reasoning data generated using GPT-OSS-120B for training the Nemotron 3 model family.
Each task contains a user question and an assistant response with reasoning traces. The environment extracts the question as the prompt and the answer from the assistant response for grading.
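For the RQA subset, where reference answers are boxed, the extraction step might look like the sketch below. It assumes the answer appears in a LaTeX `\boxed{...}` expression and takes the last one in the response; the helper name `extract_boxed` is hypothetical, and the environment's real parser may differ.

```python
from typing import Optional


def extract_boxed(text: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...} in text, or None if absent."""
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    depth = 1  # we are already inside the opening brace of \boxed{
    collected = []
    for ch in text[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                # Found the brace that closes \boxed{...}
                return "".join(collected)
        collected.append(ch)
    return None  # braces never balanced, so treat as no answer


if __name__ == "__main__":
    response = r"Dividing both sides gives \boxed{\frac{1}{2}}."
    print(extract_boxed(response))
```

Counting brace depth, rather than using a simple regular expression, keeps nested expressions such as `\frac{1}{2}` intact.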
Tools
| Tool | Description |
|---|---|
submit_answer | Submit a final answer. For MCQ, provide a single letter (e.g. C). For RQA, provide the answer value. |
Time Horizon
Single-turn: one prompt, one tool call, episode ends.
Other Environment Requirements
- RQA split requires an OpenAI API key (passed via `secrets["openai_api_key"]`) for LLM-based answer grading.
- MCQ split has no external requirements.
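A caller could check these requirements before starting an episode. The sketch below is hypothetical: `check_requirements` is not part of the environment, and the only names taken from the description are the split names `mcq` and `rqa` and the `openai_api_key` secret.

```python
def check_requirements(split: str, secrets: dict) -> None:
    """Raise ValueError if the chosen split is missing a required secret."""
    # Per the requirements above, only the RQA split needs an API key.
    if split == "rqa" and not secrets.get("openai_api_key"):
        raise ValueError('The rqa split requires secrets["openai_api_key"].')


if __name__ == "__main__":
    check_requirements("mcq", {})  # passes: no external requirements
    check_requirements("rqa", {"openai_api_key": "sk-placeholder"})  # passes
```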
Safety
This environment presents no direct safety risks. Agents answer science questions and receive binary correctness feedback. Agents perform no external actions and have no network or file system access; the only external call is the grader's LLM request for the RQA split.
Citations
@dataset{nvidia2026nemotronscience,
author = {NVIDIA},
title = {Nemotron-Science-v1},
year = {2026},
publisher = {Hugging Face},
url = {https://huggingface.co/datasets/nvidia/Nemotron-Science-v1}
}