Alexander M. Rush
Authored papers (31)
M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models
arXiv 2025
Multi-Turn Code Generation Through Single-Step Rewards
arXiv 2025
Challenges in Trustworthy Human Evaluation of Chatbots
arXiv 2024
ShadowLLM: Predictor-based Contextual Sparsity for Large Language Models
arXiv 2024
The Mamba in the Llama: Distilling and Accelerating Hybrid Models
arXiv 2024
Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling
arXiv 2023
Language Model Inversion
arXiv 2023
Scaling Data-Constrained Language Models
OBELICS: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents
Tree Prompting: Efficient Task Adaptation without Fine-Tuning
arXiv 2023
PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts
ACL 2022 (May)
Evaluate & Evaluation on the Hub: Better Best Practices for Data and Model Measurements
arXiv 2022
Pretraining Without Attention
arXiv 2022
Markup-to-Image Diffusion Models with Scheduled Sampling
arXiv 2022
Model Criticism for Long-Form Text Generation
arXiv 2022
Datasets: A Community Library for Natural Language Processing
EMNLP (ACL) 2021 (November)
Block Pruning For Faster Transformers
EMNLP 2021 (November)
Rationales for Sequential Predictions
EMNLP 2021 (November)
Pre-trained Summarization Distillation
arXiv 2020
Movement Pruning: Adaptive Sparsity by Fine-Tuning
NeurIPS 2020 (December)
Cascaded Text Generation with Markov Transformers
NeurIPS 2020 (December)
Parameter-Efficient Transfer Learning with Diff Pruning
Neural Linguistic Steganography
GLTR: Statistical Detection and Visualization of Generated Text
OpenNMT: Neural Machine Translation Toolkit
Latent Alignment and Variational Attention
Bottom-Up Abstractive Summarization
Challenges in Data-to-Document Generation
Image-to-Markup Generation with Coarse-to-Fine Attention
Sequence-Level Knowledge Distillation
Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks
arXiv 2015
Affiliations
Frequent co-authors
10 from 31 papers
Yuntian Deng (professor)
Victor Sanh
Thomas Wolf (chief science officer)
Yoon Kim
Junxiong Wang
Wenting Zhao (grad student)
Abhishek Thakur
Aleksandra Piktus
Canwen Xu
Colin Raffel