Needle in a Haystack
Classic long-context stress test that hides a short fact ("the needle") inside a long document and asks the model to retrieve it.
- Publisher: Independent
- Capabilities: Long Context, Retrieval
- Format: Custom
- Size: Unknown number of tasks
- License: MIT
- Published: Nov 2023
- Notable for: Benchmark for evaluating long context and retrieval.
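The setup described at the top of this page (hide a short fact in a long document, then ask for it back) can be illustrated with a minimal sketch. This is not the benchmark's reference implementation. The filler sentences, the passcode needle, the substring scoring rule, and `truncating_model` are all hypothetical stand-ins chosen for illustration; a real run would call an actual model and typically measure length in tokens rather than sentences.

```python
import random
import re

# Hypothetical filler text and needle, chosen only for illustration.
FILLER = [
    "The committee met on Tuesday to review the budget.",
    "Rainfall was slightly above average for the season.",
    "The library extended its opening hours over the summer.",
]
NEEDLE = "The secret passcode is 7421."
QUESTION = "What is the secret passcode?"
EXPECTED = "7421"


def build_haystack(needle, depth, n_sentences, seed=0):
    """Build n_sentences of filler and insert the needle at a relative depth.

    depth=0.0 puts the needle at the start, depth=1.0 at the end.
    """
    rng = random.Random(seed)
    sentences = [rng.choice(FILLER) for _ in range(n_sentences)]
    index = round(depth * n_sentences)
    sentences.insert(index, needle)
    return " ".join(sentences)


def build_prompt(haystack, question):
    """Append the retrieval question after the long document."""
    return f"{haystack}\n\nQuestion: {question}\nAnswer:"


def is_correct(answer, expected):
    """Assumed scoring rule: the expected string appears in the answer."""
    return expected in answer


def run_sweep(model, lengths, depths):
    """Evaluate the model over a grid of document lengths and needle depths."""
    results = {}
    for n_sentences in lengths:
        for depth in depths:
            haystack = build_haystack(NEEDLE, depth, n_sentences)
            answer = model(build_prompt(haystack, QUESTION))
            results[(n_sentences, depth)] = is_correct(answer, EXPECTED)
    return results


def truncating_model(prompt, window=2000):
    """Hypothetical stand-in for a real model.

    It only "sees" the last `window` characters of the prompt, which mimics
    a model that loses information placed early in a long context.
    """
    visible = prompt[-window:]
    match = re.search(r"passcode is (\d+)", visible)
    return match.group(1) if match else "I don't know."


if __name__ == "__main__":
    grid = run_sweep(truncating_model, lengths=[10, 200], depths=[0.0, 0.5, 1.0])
    for (n_sentences, depth), ok in sorted(grid.items()):
        print(f"sentences={n_sentences:4d} depth={depth:.1f} retrieved={ok}")
```

Sweeping both document length and needle depth is what makes the test informative: a single placement would not show whether retrieval fails only for certain positions in a long context.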
Related tools (2)
Implementations, trainers, datasets and scaffolds linked to this eval.
Papers (1)
Contributors (1)
FAQ
- What is Needle in a Haystack?
- Classic long-context stress test that hides a short fact ("the needle") inside a long document and asks the model to retrieve it.
- What capabilities does Needle in a Haystack test?
- Needle in a Haystack evaluates long-context handling and retrieval.
- How can a model improve its Needle in a Haystack score?
- Tools linked to Needle in a Haystack on Sophon include Haystack RLM RL Env (Prime Intellect) and Context Needle RL Env (Community). Linked tools are RL environments, datasets, and scaffolds that target this eval.
- What license is Needle in a Haystack under?
- Needle in a Haystack is available under the MIT license.