Attribution
Judging Quality Across Languages: A Multilingual Approach to Pretraining Data Filtering with Language Models
arXiv 2025
Rethinking Chunk Size For Long-Document Retrieval: A Multi-Dataset Analysis
from 2 papers
Abbas Goher Khan
Alex Jude
Alexander Arno Weber
David Kaczér
Elias Wendt
Florian Mai
Jannis Spiekermann
Joachim Köhler
Kristian Kersting
Lucie Flek