Large Language Models (LLMs) often struggle with creative generation, and multi-agent frameworks that improve reasoning through interaction can paradoxically hinder creativity by inducing content homogenization. We introduce LLM Review, a peer-review-inspired framework implementing Blind Peer Review: agents exchange targeted feedback while revising independently, preserving divergent creative trajectories. To enable rigorous evaluation, we propose SciFi-100, a science fiction writing dataset with a unified framework combining LLM-as-a-judge scoring, human annotation, and rule-based novelty metrics. Experiments demonstrate that LLM Review consistently outperforms multi-agent baselines, and smaller models with our framework can surpass larger single-agent models, suggesting interaction structure may substitute for model scale.
LLM Review: Enhancing Creative Writing via Blind Peer Review Feedback
LLM Review, a peer-review-inspired framework with a blind peer review mechanism, enhances creative generation while maintaining diverse creative trajectories, outperforming traditional multi-agent approaches and enabling smaller models to exceed larger single-agent models.
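The abstract describes Blind Peer Review only at a high level: agents exchange targeted feedback but revise independently. The Python sketch below is one possible reading of that loop, not the authors' implementation. The `Agent` class, its `write`/`review`/`revise` callables, `blind_peer_review`, and the stub agents are all hypothetical names introduced for illustration. The sketch also assumes that "blind" means each author sees only the feedback on its own draft and never a peer's revision.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Agent:
    """Hypothetical agent interface; real agents would wrap LLM calls."""
    name: str
    write: Callable[[str], str]               # prompt -> first draft
    review: Callable[[str], str]              # peer draft -> targeted feedback
    revise: Callable[[str, List[str]], str]   # own draft + feedback -> revision


def blind_peer_review(prompt: str, agents: List[Agent]) -> Dict[str, str]:
    """Run one round: draft, cross-review, then independent revision."""
    # 1. Every agent drafts independently from the same prompt.
    drafts = {a.name: a.write(prompt) for a in agents}

    # 2. Every agent reviews each peer's draft (never its own).
    feedback: Dict[str, List[str]] = {a.name: [] for a in agents}
    for reviewer in agents:
        for author in agents:
            if reviewer.name != author.name:
                feedback[author.name].append(reviewer.review(drafts[author.name]))

    # 3. Assumption: each author revises using only its own draft and the
    #    feedback addressed to it, so peer text is never merged in.
    return {a.name: a.revise(drafts[a.name], feedback[a.name]) for a in agents}


def make_stub_agent(name: str) -> Agent:
    """Deterministic stand-in for an LLM-backed agent (illustrative only)."""
    return Agent(
        name=name,
        write=lambda prompt: f"[{name} draft] {prompt}",
        review=lambda draft: f"{name}: consider a less familiar premise",
        revise=lambda draft, notes: f"{draft} (revised with {len(notes)} notes)",
    )


agents = [make_stub_agent(n) for n in ("A", "B", "C")]
final = blind_peer_review("A city that remembers its citizens", agents)
for name, story in final.items():
    print(name, "->", story)
```

Because step 3 passes only feedback strings to `revise`, no agent's output can copy another agent's story text. This is the property the abstract credits with avoiding content homogenization.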
- Year: 2026
- Venue: arXiv 2026
- Authors: 9
Attribution
- Abstract & full text: arxiv.org/abs/2601.08003
- TL;DR: Semantic Scholar