NoveltyBench: Evaluating Language Models for Humanlike Diversity
Active
Evaluates how well language models generate diverse, humanlike responses across multiple reasoning and generation tasks. This evaluation assesses whether LLMs can produce varied outputs rather than repetitive or uniform answers.
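To make "varied outputs rather than repetitive or uniform answers" concrete, here is a minimal sketch of one crude way to score the diversity of several responses sampled for the same prompt. It is an illustration only, not NoveltyBench's actual metric: the function names `normalize` and `distinct_fraction` are hypothetical, and it assumes that two responses count as "the same" when they match exactly after lowercasing and whitespace cleanup.

```python
def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different strings match."""
    return " ".join(text.lower().split())


def distinct_fraction(responses: list[str]) -> float:
    """Fraction of responses that are unique after normalization.

    Assumption: exact match after normalization stands in for
    "the same answer". Returns 0.0 for an empty list.
    """
    if not responses:
        return 0.0
    unique = {normalize(r) for r in responses}
    return len(unique) / len(responses)


# Hypothetical samples for one prompt: two of the four are duplicates.
samples = ["Paris", "paris ", "Rome", "Oslo"]
score = distinct_fraction(samples)
```

A score near 1.0 means almost every sample differs, while a score near `1 / len(responses)` means the model kept repeating one answer. Exact matching misses paraphrases, so a real evaluation would need a stronger notion of equivalence.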
- Publisher: Carnegie Mellon University
- Domain: Reasoning
- License: MIT
- Published: Dec 2025
- Notable for: Benchmark for evaluating reasoning.
FAQ
- What is NoveltyBench: Evaluating Language Models for Humanlike Diversity?
- Evaluates how well language models generate diverse, humanlike responses across multiple reasoning and generation tasks. This evaluation assesses whether LLMs can produce varied outputs rather than repetitive or uniform answers.
- What license is NoveltyBench: Evaluating Language Models for Humanlike Diversity under?
- NoveltyBench: Evaluating Language Models for Humanlike Diversity is available under the MIT License.