Georgios Smyrnis
PhD student at UT Austin and contributor to DataComp-LM (DCLM) and open-source LM pretraining benchmarks.
- Role: grad-student
- Currently at: University of Texas at Austin
- GitHub: github.com/gsmyrnis
- Scholar: scholar.google.com/citations
- Papers: 4
Authored papers
Open Thoughts: Curating Reasoning Datasets for Open-Source R1 Replications
blog
OpenThoughts: Data Recipes for Reasoning Models
arXiv 2025
Language models scale reliably with over-training and on downstream tasks
arXiv 2024
DataComp: In search of the next generation of multimodal datasets
NeurIPS 2023
Affiliations
Frequent co-authors
10 from 4 papers
Jean Mercat
researcher
Jenia Jitsev
Ludwig Schmidt
professor
Marianna Nezhurina
researcher
Ryan Marten
engineer
Sedrick Keh
researcher
Vaishaal Shankar
Achal Dave
Alex Fang
Alexandros G. Dimakis