MuSR: Testing the Limits of Chain-of-thought with Multistep Soft Reasoning
Active
Evaluates models on multistep soft reasoning tasks presented as free-text narratives.
- Publisher: University of Texas at Austin
- Domain: Reasoning
- License: MIT
- Published: May 2026
- Notable for: Benchmark for evaluating reasoning.
FAQ
- What is MuSR: Testing the Limits of Chain-of-thought with Multistep Soft Reasoning?
- A benchmark that evaluates models on multistep soft reasoning tasks presented as free-text narratives.
- What license is MuSR: Testing the Limits of Chain-of-thought with Multistep Soft Reasoning under?
- MuSR: Testing the Limits of Chain-of-thought with Multistep Soft Reasoning is available under the MIT License.