LMArena
Crowdsourced human-preference LLM evaluation platform; the dominant public arena leaderboard, spun out of UC Berkeley LMSYS.
- Type: eval co
- HQ: Berkeley, CA, USA
- Founded: 2023 (as Chatbot Arena under LMSYS); incorporated as LMArena Inc. in 2025
- Funding: Private; raised $100M seed at $600M valuation in 2025, led by a16z and UC Investments
- Website: lmarena.ai
- GitHub: github.com/lm-sys
- Twitter: twitter.com/lmarena_ai
- Evals: 2
- Tools: 1
- Models: 0
- Papers: 3
- Boards: 18
- People: 4
Leaderboards (18)
- Arena Hard Prompts: LMArena subcategory ranking models on a filtered slice of Arena prompts auto-classified as hard along multiple difficulty axes.
- Arena Math: LMArena subcategory ranking models on user pairwise votes restricted to math-related prompts.
- Arena Overall: The headline LMArena (formerly Chatbot Arena) ranking aggregating crowd pairwise preference votes across all prompt categories into a Bradley-Terry Elo-style rating.
- Arena - Document: Crowdsourced document model ratings from LMArena. Elo-style scores computed from pairwise human preference votes.
- Arena - Document Style Control: Crowdsourced document style control model ratings from LMArena. Elo-style scores computed from pairwise human preference votes.
- Arena - Image Edit: Crowdsourced image edit model ratings from LMArena. Elo-style scores computed from pairwise human preference votes.
- Arena - Image to Video: Crowdsourced image to video model ratings from LMArena. Elo-style scores computed from pairwise human preference votes.
- Arena - Search: Crowdsourced search model ratings from LMArena. Elo-style scores computed from pairwise human preference votes.
- Arena - Search Style Control: Crowdsourced search style control model ratings from LMArena. Elo-style scores computed from pairwise human preference votes.
- Arena - Text: Crowdsourced text model ratings from LMArena. Elo-style scores computed from pairwise human preference votes.
- Arena - Text Style Control: Crowdsourced text style control model ratings from LMArena. Elo-style scores computed from pairwise human preference votes.
- Arena - Text to Image: Crowdsourced text to image model ratings from LMArena. Elo-style scores computed from pairwise human preference votes.
- Arena - Text to Video: Crowdsourced text to video model ratings from LMArena. Elo-style scores computed from pairwise human preference votes.
- Arena - Video Edit: Crowdsourced video edit model ratings from LMArena. Elo-style scores computed from pairwise human preference votes.
- Arena - Vision Style Control: Crowdsourced vision style control model ratings from LMArena. Elo-style scores computed from pairwise human preference votes.
- Arena - Webdev: Crowdsourced webdev model ratings from LMArena. Elo-style scores computed from pairwise human preference votes.
- Arena Coding: LMArena subcategory ranking models on user pairwise votes restricted to coding prompts.
- Arena Vision: LMArena subcategory for vision-language models (VLMs), ranked by user pairwise votes on prompts that include an image.
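The leaderboard descriptions above refer to Elo-style scores derived from pairwise votes via a Bradley-Terry model. As a rough illustration only, the following Python sketch fits Bradley-Terry strengths from a list of (winner, loser) votes and maps them onto an Elo-like scale. It is not LMArena's actual pipeline: the function name `fit_bradley_terry`, the iterative update scheme, the `scale=400` and `base=1000` constants, and the omission of ties, style control, and confidence intervals are all assumptions made for the example.

```python
import math
from collections import defaultdict


def fit_bradley_terry(votes, iters=200, scale=400.0, base=1000.0):
    """Fit Bradley-Terry strengths from (winner, loser) votes.

    Hypothetical helper for illustration. Returns Elo-style ratings,
    where `scale` and `base` are conventional but arbitrary choices.
    Ties are not modeled in this sketch.
    """
    wins = defaultdict(float)         # total wins per model
    pair_counts = defaultdict(float)  # comparisons per unordered pair
    models = set()
    for winner, loser in votes:
        if winner == loser:
            continue  # ignore degenerate self-comparisons
        wins[winner] += 1.0
        pair_counts[frozenset((winner, loser))] += 1.0
        models.update((winner, loser))

    strength = {m: 1.0 for m in models}
    for _ in range(iters):
        updated = {}
        for m in models:
            # Minorization-maximization update for Bradley-Terry:
            # p_m <- wins_m / sum_j n_mj / (p_m + p_j)
            denom = 0.0
            for pair, n in pair_counts.items():
                if m in pair:
                    (other,) = pair - {m}
                    denom += n / (strength[m] + strength[other])
            # Floor at a tiny value so models with zero wins stay finite.
            updated[m] = max(wins[m] / denom, 1e-9) if denom > 0 else strength[m]
        # Normalize so the geometric mean strength is 1 (fixes the scale).
        log_mean = sum(math.log(s) for s in updated.values()) / len(updated)
        strength = {m: s / math.exp(log_mean) for m, s in updated.items()}

    # Map strengths to an Elo-like scale centered on `base`.
    return {m: base + scale * math.log10(s) for m, s in strength.items()}


if __name__ == "__main__":
    # Toy data: model "a" beats model "b" in 3 of 4 votes.
    example_votes = [("a", "b")] * 3 + [("b", "a")]
    for model, rating in sorted(fit_bradley_terry(example_votes).items()):
        print(model, round(rating, 1))
```

With the toy data, the fitted strength ratio between "a" and "b" is 3 to 1, which corresponds to a gap of roughly 191 points on this scale. A category leaderboard such as Arena Math or Arena Coding would, under the same assumptions, apply the same fit to only the votes whose prompts fall in that category.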