Christopher Akiki
5 papers
Authored papers
The German Commons - 154 Billion Tokens of Openly Licensed Text for German Language Models
arXiv 2025
SantaCoder: don't reach for the stars!
arXiv 2023
GAIA Search: Hugging Face and Pyserini Interoperability for NLP Training Data Exploration
arXiv 2023
The ROOTS Search Tool: Data Transparency for LLMs
arXiv 2023
Spacerini: Plug-and-play Search Engines with Pyserini and Hugging Face
arXiv 2023
Affiliations
No known affiliations.
Frequent co-authors
10 from 5 papers