David Samuel
- Papers: 12
Authored papers
NorEval: A Norwegian Language Understanding and Generation Evaluation Benchmark
arXiv 2025
An Expanded Massive Multilingual Dataset for High-Performance Language Technologies
arXiv 2025
Multi-label Scandinavian Language Identification (SLIDE)
arXiv 2025
Small Languages, Big Models: A Study of Continual Training on Languages of Norway
arXiv 2024
GPT or BERT: why not both?
arXiv 2024
BERTs are Generative In-Context Learners
arXiv 2024
Tokenization with Factorized Subword Encoding
arXiv 2023
NorBench -- A Benchmark for Norwegian Language Models
arXiv 2023
Mean BERTs make erratic language teachers: the effectiveness of latent bootstrapping in low-resource settings
arXiv 2023
Trained on 100 million words and still in shape: BERT meets British National Corpus
arXiv 2023
Direct parsing to sentiment graphs
ACL 2022 5
ÚFAL at MultiLexNorm 2021: Improving Multilingual Lexical Normalization by Fine-tuning ByT5
WNUT (ACL) 2021 11
Frequent co-authors
10 from 12 papers
Lilja Øvrelid
Erik Velldal
Andrey Kutuzov
Stephan Oepen
Vladislav Mikhailov
Lucas Georges Gabriel Charpentier
Mariia Fedorova
Petter Mæhlum
Amanda Myntti