Omer Levy

SCROLLS: Standardized CompaRison Over Long Language Sequences

arXiv 2022

Transformer Language Models without Positional Encodings Still Learn Positional Information

arXiv 2022

LMentry: A Language Model Benchmark of Elementary Language Tasks

arXiv 2022

Unnatural Instructions: Tuning Language Models with (Almost) No Human Labor

arXiv 2022

Instruction Induction: From Few Examples to Natural Language Task Descriptions

arXiv 2022

Coreference Resolution without Span Representations

ACL 2021

Cryptonite: A Cryptic Crossword Benchmark for Extreme Ambiguity in Language

EMNLP 2021

How Optimal is Greedy Decoding for Extractive Question Answering?

arXiv 2021

Few-Shot Question Answering by Pretraining Span Selection

ACL 2021

Learning to Retrieve Passages without Supervision

NAACL 2022

ParaShoot: A Hebrew Question Answering Dataset

EMNLP (MRQA) 2021

How to Train BERT with an Academic Budget

EMNLP 2021

Transformer Feed-Forward Layers Are Key-Value Memories

EMNLP 2021

2020

Neural Machine Translation without Embeddings

NAACL 2021

2020

SpanBERT: Improving Pre-training by Representing and Predicting Spans

What Does BERT Look At? An Analysis of BERT's Attention

Are Sixteen Heads Really Better than One?

Blockwise Self-Attention for Long Document Understanding

Findings of the Association for Computational Linguistics 2020