Elias Frantar

Papers: 13

Attribution

Affiliations & profile: Semantic Scholar

Authored papers

MARLIN: Mixed-Precision Auto-Regressive Parallel Inference on Large Language Models

arXiv 2024

SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot

arXiv 2023

SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression

arXiv 2023

QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models

arXiv 2023

QUIK: Towards End-to-End 4-Bit Inference on Generative Large Language Models

arXiv 2023

Sparse Fine-tuning for Inference Acceleration of Large Language Models

arXiv 2023

Error Feedback Can Accurately Compress Preconditioners

arXiv 2023

ZipLM: Inference-Aware Structured Pruning of Language Models

2023

GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers

arXiv 2022

The Optimal BERT Surgeon: Scalable and Accurate Second-Order Pruning for Large Language Models

arXiv 2022

Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning

arXiv 2022

L-GreCo: Layerwise-Adaptive Gradient Compression for Efficient and Accurate Deep Learning

arXiv 2022

M-FAC: Efficient Matrix-Free Approximations of Second-Order Information

NeurIPS 2021

Affiliations

No known affiliations.

Frequent co-authors

from 13 papers

Dan Alistarh

Eldar Kurtic

Torsten Hoefler

Saleh Ashkboos

Denis Kuznedelev

Ilia Markov

Michael Goin

Aleksei Kalinov

Alexander Borzunov

Benjamin Fineran