What capabilities does Atari 57 test?

Atari 57 evaluates planning and image understanding.

How can a model improve its Atari 57 score?

Sophon links resources that target this eval: RL environments, datasets, and scaffolds. One of them is OpenEnv Atari (ALE).

Atari 57 is available under GPL-2.0.

Status: Active

57 Atari 2600 games played from raw pixels - the foundational reinforcement-learning benchmark from DeepMind's DQN era.

Publisher: Google DeepMind
Capabilities: Planning, Image Understanding
Domain: agentic
Format: OpenEnv
Size: 57 tasks
License: GPL-2.0
Published: Jul 2012
Notable for: Benchmark for evaluating planning and image understanding in the agentic domain.
Canonical: github.com/google-deepmind/atari-57
Also on: gymlibrary.dev/environments/atari
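
This page does not say how an Atari 57 score is aggregated across the 57 games. A convention often used in the deep-RL literature is the human-normalized score, 100 * (agent - random) / (human - random), summarized by its median over games. The sketch below assumes that convention. The function names are hypothetical, and the game names and numbers are made-up placeholders, not real baselines or results.

```python
from statistics import median


def human_normalized(agent: float, random: float, human: float) -> float:
    """Return the human-normalized score in percent.

    0 means random-level play, 100 means the human reference level.
    """
    if human == random:
        raise ValueError("human and random baselines must differ")
    return 100.0 * (agent - random) / (human - random)


def aggregate(agent_scores: dict, baselines: dict):
    """Normalize each game's raw score, then take the median across games.

    baselines maps game name -> (random_score, human_score).
    """
    per_game = {
        game: human_normalized(score, *baselines[game])
        for game, score in agent_scores.items()
    }
    return per_game, median(per_game.values())


# Placeholder data only (hypothetical games, invented numbers).
agent_scores = {"game_a": 150.0, "game_b": 30.0, "game_c": 0.0}
baselines = {
    "game_a": (50.0, 250.0),
    "game_b": (10.0, 20.0),
    "game_c": (0.0, 100.0),
}

per_game, median_score = aggregate(agent_scores, baselines)
print(per_game)
print(median_score)
```

The median is used here rather than the mean so that one game with a very large normalized score does not dominate the summary. Whether Sophon reports scores this way is an assumption.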


Implementations, trainers, datasets and scaffolds linked to this eval.

Hugging Face

OpenEnv wrapper around the Arcade Learning Environment - the classic 57-game Atari 2600 benchmark suite that defined deep RL.

JAIR · 2013

Bellemare et al.'s Atari 2600 emulator framework that became the standard RL evaluation platform for the deep-RL era.

