Attribution
Pretraining with hierarchical memories: separating long-tail and common knowledge
arXiv 2025
The AdEMAMix Optimizer: Better, Faster, Older
arXiv 2024
BLEU might be Guilty but References are not Innocent
EMNLP 2020
from 3 papers
C Thomas
Hadi Pouransari
Isaac Caswell
Markus Freitag
Matteo Pagliardini
Michael Kirchhof
Oncel Tuzel
Pierre Ablin