MuSR: Testing the Limits of Chain-of-thought with Multistep Soft Reasoning

Active

A benchmark that evaluates models on multistep soft reasoning tasks posed as free-text narratives.

Domain
Reasoning
License
MIT
Published
May 2026
Notable for
Benchmark for evaluating reasoning.

FAQ

What is MuSR: Testing the Limits of Chain-of-thought with Multistep Soft Reasoning?
MuSR is a benchmark that evaluates models on multistep soft reasoning tasks posed as free-text narratives.
What license is MuSR: Testing the Limits of Chain-of-thought with Multistep Soft Reasoning under?
MuSR: Testing the Limits of Chain-of-thought with Multistep Soft Reasoning is available under the MIT license.