Browser Nav Challenge

RL environment for post-training small models on Brett Adcock's 30-step browser navigation challenge using DOM accessibility trees and Playwright

Domain
rl-env
License
unknown
Published
Feb 2026

Attribution

Leaderboard scores
prime-hub

Top score 16.7% by Claude Opus 4.5 - 4 models reporting (2 frontier)

Score history

[Chart: score (0% to 100%) over time, Apr 25 to Dec 25. Series: Qwen3 30B A3B, Claude Opus 4.5.]

Top models

[Bar chart: 4 models. Highest value: Claude Opus 4.5 at 16.7%.]

Related tools

Implementations, trainers, datasets and scaffolds linked to this eval.

FAQ

What is Browser Nav Challenge?
Browser Nav Challenge is an RL environment for post-training small models on Brett Adcock's 30-step browser navigation challenge, using DOM accessibility trees and Playwright.
What is the current top score on Browser Nav Challenge?
The top reported score is 16.7% by Claude Opus 4.5, across 4 models reporting (2 from frontier labs).
How can a model improve its Browser Nav Challenge score?
Tools linked to Browser Nav Challenge on Sophon include NAV Challenge RL Env (Chakra Labs). Linked tools cover RL environments, datasets, and scaffolds that target this eval.
What license is Browser Nav Challenge under?
The license for Browser Nav Challenge is unknown.