Attribution
BLEUBERI: BLEU is a surprisingly effective reward for instruction following
arXiv 2025
One ruler to measure them all: Benchmarking multilingual long-context language models
FABLES: Evaluating faithfulness and content selection in book-length summarization
arXiv 2024
from 3 papers
Mohit Iyyer
Marzena Karpinska
Yapei Chang
Amir Zadeh
Aparna Garimella
Chris Tanner
Chuan Li
Jenna Russell
Kyle Lo
Michael Krumdick