Benjamin Minixhofer
7 papers
Authored papers
Cross-Tokenizer Distillation via Approximate Likelihood Matching
arXiv 2025
Bolmo: Byteifying the Next Generation of Language Models
arXiv 2025
Retrofitting Large Language Models with Dynamic Tokenization
arXiv 2024
Segment Any Text: A Universal Approach for Robust, Efficient and Adaptable Sentence Segmentation
arXiv 2024
Zero-Shot Tokenizer Transfer
arXiv 2024
CompoundPiece: Evaluating and Improving Decompounding Performance of Language Models
arXiv 2023
WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models
NAACL 2022
Affiliations
No known affiliations.
Frequent co-authors
10 from 7 papers