Amir Gholami

Papers: 22

Attribution

Affiliations & profile: Semantic Scholar

Authored papers

Residual Context Diffusion Language Models

arXiv 2026

2026

CDLM: Consistency Diffusion Language Models For Faster Sampling

arXiv 2025

2025

Arbitrage: Efficient Reasoning via Advantage-Aware Speculation

arXiv 2025

2025

ETS: Efficient Tree Search for Inference-Time Scaling

arXiv 2025

2025

TinyAgent: Function Calling at the Edge

arXiv 2024

2024

KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization

arXiv 2024

2024

Efficient and Scalable Estimation of Tool Representations in Vector Space

arXiv 2024

2024

LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement

arXiv 2024

2024

Squeezed Attention: Accelerating Long Context Length LLM Inference

arXiv 2024

2024

An LLM Compiler for Parallel Function Calling

arXiv 2023

2023

SqueezeLLM: Dense-and-Sparse Quantization

arXiv 2023

2023

Speculative Decoding with Big Little Decoder

2023

Squeezeformer: An Efficient Transformer for Automatic Speech Recognition

arXiv 2022

2022

I-BERT: Integer-only BERT Quantization

arXiv 2021

2021

Learned Token Pruning for Transformers

arXiv 2021

2021

Hessian-Aware Pruning and Optimal Neural Implant

arXiv 2021

2021

HAWQV3: Dyadic Neural Network Quantization

arXiv 2020

2020

ZeroQ: A Novel Zero Shot Quantization Framework

2020

ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning

arXiv 2020

2020

PowerNorm: Rethinking Batch Normalization in Transformers

ICML 2020

2020

HAWQ: Hessian AWare Quantization of Neural Networks with Mixed-Precision

2019

HAWQ-V2: Hessian Aware trace-Weighted Quantization of Neural Networks

NeurIPS 2020

2019

Affiliations

No known affiliations.

Frequent co-authors

from 22 papers

Kurt Keutzer

Michael W. Mahoney

Sehoon Kim

Coleman Hooper

Zhewei Yao

Nicholas Lee

Zhen Dong

Sheng Shen

Suhong Moon

Monishwaran Maheswaran

4 shared papers