Gerard de Melo
- Papers: 11
Authored papers (11)
AweDist: Attention-aware Embedding Distillation for New Input Token Embeddings
arXiv 2025
I Don't Know: Explicit Modeling of Uncertainty with an [IDK] Token
arXiv 2024
NextLevelBERT: Masked Language Modeling with Higher-Level Representations for Long Documents
arXiv 2024
CliMedBench: A Large-Scale Chinese Benchmark for Evaluating Medical Large Language Models in Clinical Scenarios
arXiv 2024
Language Adaptation on a Tight Academic Compute Budget: Tokenizer Swapping Works and Pure bfloat16 Is Enough
arXiv 2024
Mitigate the Gap: Investigating Approaches for Improving Cross-Modal Alignment in CLIP
arXiv 2024
FOCUS: Effective Embedding Initialization for Monolingual Specialization of Multilingual Models
arXiv 2023
Efficient Parallelization Layouts for Large-Scale Distributed Model Training
arXiv 2023
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
TMLR
Art Creation with Multi-Conditional StyleGANs
arXiv 2022
NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation
arXiv 2021
Frequent co-authors
10 from 11 papers
Konstantin Dobler
Chandan Singh
Damien Sileo
Denis Kleyko
Genta Indra Winata
Gloria Wang
Jascha Sohl-Dickstein
Kaustubh D. Dhole
Marie Tolkiehn
Maximilian Schall