Attribution
ZeroSumEval: Scaling LLM Evaluation with Inter-Model Competition
arXiv 2025
When Benchmarks are Targets: Revealing the Sensitivity of Large Language Model Leaderboards
arXiv 2024
from 2 papers
Haidar Khan
M Saiful Bari
Areeb Alowisheq
Bülent Yener
Faisal Mirza
Hisham A. Alyahya
Hisham Abdullah Alyahya
Nora AlTwairesh
Norah Alzahrani
Nouf Alotaibi