Deepdive

DeepDive QA RL environment with a Serper-powered search tool

Domain: rl-env
License: unknown
Published: Oct 2025

Attribution

Leaderboard scores: prime-hub

Top score: 66.7% by GPT-5 Mini, with 3 models reporting (2 from frontier labs).

Score history

[Chart: score history from Aug 25 to Dec 25; y-axis from 40% to 100%; series shown: GPT-5 Mini]

Top models

[Bar chart: 3 models; highest value is GPT-5 Mini at 66.7.]

Related tools

Implementations, trainers, datasets and scaffolds linked to this eval.

FAQ

What is Deepdive?
Deepdive is a question-answering (QA) reinforcement learning (RL) environment with a Serper-powered search tool.
What is the current top score on Deepdive?
The top reported score is 66.7% by GPT-5 Mini, across 3 models reporting (2 from frontier labs).
How can a model improve its Deepdive score?
Tools linked to Deepdive on Sophon include Deepdive RL Env (Prime Intellect). Linked tools are the RL environments, datasets, and scaffolds that target this eval.
What license is Deepdive under?
The license for Deepdive is unknown.