ValidMol

ValidMol is an environment that tests molecule generation using programmatic verifiers.

Type: RL Env
Runtime: ORS
License: unknown
Size: 931 tasks
Published: Jan 2026

OpenReward Environment

Description

ValidMol is an environment for evaluating an agent's ability to complete and repair corrupted molecular SMILES strings. Given a partial or corrupted SMILES string, the agent must produce a valid, chemically stable molecule. Tasks apply one of four corruption strategies to real PubChem compounds: truncation, character deletion, ring opening, and bond corruption.

Capabilities

  • Completing and repairing corrupted SMILES molecular representations
  • Understanding chemical structure and bonding rules
  • Reasoning about molecular stability and validity
  • Working with cheminformatics concepts (ring closure, bond types)

Compute Requirements

Agents are given a standard environment with no sandbox or file system access.

Tasks

There are two splits in this environment:

  • train: 789 molecular completion tasks
  • test: 142 molecular completion tasks

Tasks are generated from PubChem compounds (CIDs 2500-3500 for train, CIDs 4500-4700 for test) with four corruption types:

| Corruption Type | Frequency | Description |
| --- | --- | --- |
| Truncation | 40% | Removes 20-50% of the SMILES suffix |
| Character deletion | 25% | Removes 1-3 random characters |
| Ring opening | 20% | Removes a closing ring number |
| Bond corruption | 15% | Removes or changes bond symbols |
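
The four corruption strategies and their sampling frequencies can be sketched roughly as follows. This is an illustrative approximation, not the environment's actual task generator: all function names are hypothetical, truncation is interpreted as dropping the last 20-50% of the string, ring opening only handles single-digit ring closures, and the set of bond symbols is an assumed subset.

```python
import random

BOND_SYMBOLS = "-=#"  # assumed subset; the real generator may cover more symbols


def _outside_brackets(smiles):
    """Yield (index, char) for characters that are not inside a [...] atom."""
    in_bracket = False
    for i, ch in enumerate(smiles):
        if ch == "[":
            in_bracket = True
        elif ch == "]":
            in_bracket = False
        elif not in_bracket:
            yield i, ch


def truncate(smiles, rng):
    """Drop the last 20-50% of the string."""
    cut = max(1, round(len(smiles) * rng.uniform(0.2, 0.5)))
    return smiles[: len(smiles) - cut]


def delete_chars(smiles, rng):
    """Delete 1-3 characters at random positions."""
    n = min(rng.randint(1, 3), len(smiles))
    drop = set(rng.sample(range(len(smiles)), n))
    return "".join(ch for i, ch in enumerate(smiles) if i not in drop)


def open_ring(smiles, rng):
    """Remove one ring-closing digit (single-digit closures only)."""
    open_digits, closers = set(), []
    for i, ch in _outside_brackets(smiles):
        if ch.isdigit():
            if ch in open_digits:  # second occurrence closes the ring
                open_digits.discard(ch)
                closers.append(i)
            else:
                open_digits.add(ch)
    if not closers:
        return smiles
    i = rng.choice(closers)
    return smiles[:i] + smiles[i + 1 :]


def corrupt_bond(smiles, rng):
    """Remove one bond symbol or swap it for a different one."""
    bonds = [i for i, ch in _outside_brackets(smiles) if ch in BOND_SYMBOLS]
    if not bonds:
        return smiles
    i = rng.choice(bonds)
    options = [""] + [b for b in BOND_SYMBOLS if b != smiles[i]]
    return smiles[:i] + rng.choice(options) + smiles[i + 1 :]


# Weights mirror the frequencies in the table above (40/25/20/15).
STRATEGIES = [(truncate, 40), (delete_chars, 25), (open_ring, 20), (corrupt_bond, 15)]


def corrupt(smiles, rng=None):
    """Pick one strategy by weight and return (strategy_name, corrupted_smiles)."""
    rng = rng or random.Random()
    funcs, weights = zip(*STRATEGIES)
    strategy = rng.choices(funcs, weights=weights, k=1)[0]
    return strategy.__name__, strategy(smiles, rng)
```

Calling `corrupt("CC(=O)Oc1ccccc1C(=O)O", random.Random(0))` returns the name of the chosen strategy together with the corrupted string. Passing a seeded `random.Random` keeps the sketch reproducible.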

Reward Structure

This is a sparse, verifiable reward environment. The agent calls submit_completion with a SMILES string and receives a binary reward based on three-tier RDKit validation:

  1. Syntactic validity: RDKit parsing succeeds
  2. Structural validity: RDKit sanitization passes
  3. Chemical stability: No peroxides, hydrazines, or conjugated alkynes

Reward values:

  • 1.0: All three tiers pass
  • 0.0: Any tier fails
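
A minimal sketch of how such a three-tier binary reward could be computed is shown below. It is an assumption-laden illustration rather than the environment's verifier: the function names `binary_reward` and `make_rdkit_tiers` are hypothetical, and the SMARTS patterns for peroxides, hydrazines, and conjugated alkynes are plausible guesses, since the actual patterns are not given here. RDKit is imported lazily, so only `make_rdkit_tiers` requires it to be installed.

```python
from typing import Callable, List, Sequence


def binary_reward(smiles: str, tiers: Sequence[Callable[[str], bool]]) -> float:
    """Return 1.0 only if every tier accepts the SMILES string, else 0.0.

    all() short-circuits, so later tiers are skipped once one tier fails.
    """
    return 1.0 if all(tier(smiles) for tier in tiers) else 0.0


def make_rdkit_tiers() -> List[Callable[[str], bool]]:
    """Build the three tiers with RDKit (requires the `rdkit` package)."""
    from rdkit import Chem

    # Assumed SMARTS patterns: peroxide, hydrazine, conjugated alkyne.
    unstable = [
        Chem.MolFromSmarts(s) for s in ("[OX2][OX2]", "[NX3][NX3]", "C#CC#C")
    ]

    def parses(smiles: str) -> bool:
        # Tier 1: syntactic validity (parse without sanitization).
        return Chem.MolFromSmiles(smiles, sanitize=False) is not None

    def sanitizes(smiles: str) -> bool:
        # Tier 2: structural validity (valence, aromaticity, and so on).
        mol = Chem.MolFromSmiles(smiles, sanitize=False)
        if mol is None:
            return False
        try:
            Chem.SanitizeMol(mol)
        except Exception:
            return False
        return True

    def is_stable(smiles: str) -> bool:
        # Tier 3: reject molecules containing any flagged substructure.
        mol = Chem.MolFromSmiles(smiles)  # sanitizes by default
        return mol is not None and not any(
            mol.HasSubstructMatch(p) for p in unstable
        )

    return [parses, sanitizes, is_stable]
```

With RDKit available, `binary_reward("CCO", make_rdkit_tiers())` would run all three tiers in order. Keeping the tiers as plain callables makes it easy to swap in different stability rules.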

No LLM graders are used.

Data

Task data is generated from PubChem compounds via the PUG-REST API. Quality filters include SMILES length 10-80, heavy atoms 5-50, and a maximum of 5 rings. Data is stored on the OpenReward platform.
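
The quality filters described above could be expressed along these lines. This is a sketch under stated assumptions: the function names are hypothetical, the bounds are treated as inclusive, and ring count is taken from RDKit's ring information, none of which is confirmed by the environment's own code.

```python
def within_limits(smiles_len: int, heavy_atoms: int, ring_count: int) -> bool:
    """Apply the stated limits, assuming inclusive bounds."""
    return (
        10 <= smiles_len <= 80
        and 5 <= heavy_atoms <= 50
        and ring_count <= 5
    )


def passes_quality_filters(smiles: str) -> bool:
    """Measure a SMILES string with RDKit and apply the limits.

    RDKit is imported lazily, so only this function requires it.
    """
    from rdkit import Chem

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:  # unparseable compounds are dropped
        return False
    return within_limits(
        len(smiles),
        mol.GetNumHeavyAtoms(),
        mol.GetRingInfo().NumRings(),  # assumed definition of "rings"
    )
```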

Tools

| Tool | Description |
| --- | --- |
| submit_completion | Submit a completed/fixed SMILES string for three-tier chemical validation |

Time Horizon

Single-turn. Agents receive a corrupted SMILES string and submit one completion.

Environment Difficulty

[Put environment difficulty statistics here]

Other Environment Requirements

There are no further environment requirements; ValidMol works out of the box with the OpenReward endpoint.

Safety

Agents in ValidMol complete molecular structures from corrupted inputs. There is a dual-use concern in that improved molecular generation capabilities could be applied to both beneficial and harmful purposes.

Citation

@dataset{GRValidMol,
  author    = {General Reasoning Inc. Team},
  title     = {ValidMol},
  year      = {2026},
  publisher = {OpenReward},
  url       = {https://www.openreward.ai/GeneralReasoning/ValidMol}
}