Omer Levy
- Papers
- 23
Authored papers
Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length
arXiv 2024
ZeroSCROLLS: A Zero-Shot Benchmark for Long Text Understanding
arXiv 2023
Pick-a-Pic: An Open Dataset of User Preferences for Text-to-Image Generation
NeurIPS 2023 11
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
TMLR
Transformer Language Models without Positional Encodings Still Learn Positional Information
arXiv 2022
LMentry: A Language Model Benchmark of Elementary Language Tasks
arXiv 2022
Unnatural Instructions: Tuning Language Models with (Almost) No Human Labor
arXiv 2022
SCROLLS: Standardized CompaRison Over Long Language Sequences
arXiv 2022
Instruction Induction: From Few Examples to Natural Language Task Descriptions
arXiv 2022
Cryptonite: A Cryptic Crossword Benchmark for Extreme Ambiguity in Language
EMNLP 2021 11
How to Train BERT with an Academic Budget
EMNLP 2021 11
Few-Shot Question Answering by Pretraining Span Selection
ACL 2021 5
Learning to Retrieve Passages without Supervision
NAACL 2022 7
Coreference Resolution without Span Representations
ACL 2021 5
How Optimal is Greedy Decoding for Extractive Question Answering?
arXiv 2021
ParaShoot: A Hebrew Question Answering Dataset
EMNLP (MRQA) 2021 11
Neural Machine Translation without Embeddings
NAACL 2021 4
Transformer Feed-Forward Layers Are Key-Value Memories
EMNLP 2021 11
Are Sixteen Heads Really Better than One?
Blockwise Self-Attention for Long Document Understanding
Findings of the Association for Computational Linguistics 2020
SpanBERT: Improving Pre-training by Representing and Predicting Spans
What Does BERT Look At? An Analysis of BERT's Attention
code2seq: Generating Sequences from Structured Representations of Code
Affiliations
Frequent co-authors
10 from 23 papers