Attribution
Recovered in Translation: Efficient Pipeline for Automated Translation of Benchmarks and Datasets
arXiv 2026
RedPajama: an Open Dataset for Training Large Language Models
arXiv 2024
BgGPT 1.0: Extending English-centric LLMs to other languages
from 3 papers
Ce Zhang
Martin Vechev
Ben Athiwaratkun
Christopher Ré
Daniel Fu
Dimitar I. Dimitrov
Hanna Yukhymenko
Huu Nguyen
Irina Rish
Kezhen Chen