Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference
The methodology paper for Chatbot Arena, a platform that collects crowdsourced pairwise preference votes on anonymized side-by-side LLM responses and aggregates them with a Bradley-Terry model into Elo-style rankings.
- Publisher: LMArena
- Year: 2024
- Venue: ICML
- Authors: 11
- Hosting: External source, license unknown
TL;DR (Semantic Scholar)
This paper describes the Chatbot Arena platform, analyzes the data collected so far, and explains the tried-and-true statistical methods used for efficient and accurate evaluation and ranking of models, to establish a robust foundation for the credibility of Chatbot Arena.
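To make the aggregation step concrete, here is a minimal sketch of turning pairwise votes into Bradley-Terry strengths and then into Elo-style ratings. It is an illustration under stated assumptions, not the paper's implementation: the fitting routine is the classic minorization-maximization (MM) update for Bradley-Terry, the function names `fit_bradley_terry` and `to_elo_scale` are hypothetical, the vote data is invented toy data, ties are ignored, and the 400-point scale with a 1000-point offset is an assumed display convention.

```python
import math
from collections import defaultdict


def fit_bradley_terry(votes, iters=500, tol=1e-10):
    """Fit Bradley-Terry strengths from (winner, loser) pairs with MM updates.

    Assumption: every model has at least one win, otherwise the maximum
    likelihood estimate does not exist and a ValueError is raised.
    """
    wins = defaultdict(float)    # total wins per model
    games = defaultdict(float)   # games played per unordered pair
    models = set()
    for winner, loser in votes:
        wins[winner] += 1.0
        games[frozenset((winner, loser))] += 1.0
        models.update((winner, loser))
    models = sorted(models)
    if any(wins[m] == 0.0 for m in models):
        raise ValueError("each model needs at least one win")

    p = {m: 1.0 for m in models}
    for _ in range(iters):
        new_p = {}
        for i in models:
            # MM update: wins_i / sum_j n_ij / (p_i + p_j)
            denom = sum(
                games[frozenset((i, j))] / (p[i] + p[j])
                for j in models
                if j != i and frozenset((i, j)) in games
            )
            new_p[i] = wins[i] / denom
        # Strengths are only identified up to scale: fix the geometric mean at 1.
        log_mean = sum(math.log(v) for v in new_p.values()) / len(new_p)
        scale = math.exp(log_mean)
        new_p = {m: v / scale for m, v in new_p.items()}
        delta = max(abs(new_p[m] - p[m]) for m in models)
        p = new_p
        if delta < tol:
            break
    return p


def to_elo_scale(strengths, scale=400.0, offset=1000.0):
    """Map strengths to Elo-style ratings (scale and offset are assumed conventions)."""
    return {m: offset + scale * math.log10(s) for m, s in strengths.items()}


# Toy votes (invented for illustration): each tuple is (winner, loser).
votes = (
    [("model_a", "model_b")] * 3 + [("model_b", "model_a")] * 1
    + [("model_b", "model_c")] * 3 + [("model_c", "model_b")] * 1
    + [("model_a", "model_c")] * 2 + [("model_c", "model_a")] * 1
)
strengths = fit_bradley_terry(votes)
ratings = to_elo_scale(strengths)
for name, rating in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {rating:.1f}")
```

Under this mapping, the predicted probability that one model beats another is the usual Elo logistic in the rating difference, which equals the Bradley-Terry ratio of their strengths. A production system would also need to handle ties, models with no wins, and uncertainty estimates, none of which this sketch attempts.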