Baris Kasikci
7 papers
Authored papers
FlashInfer: Efficient and Customizable Attention Engine for LLM Inference Serving
arXiv 2025
LiteASR: Efficient Automatic Speech Recognition with Low-Rank Approximation
arXiv 2025
ConsumerBench: Benchmarking Generative AI Applications on End-User Devices
arXiv 2025
Fiddler: CPU-GPU Orchestration for Fast Inference of Mixture-of-Experts Models
arXiv 2024
NanoFlow: Towards Optimal Large Language Model Serving Throughput
arXiv 2024
Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference
arXiv 2024
Atom: Low-bit Quantization for Efficient and Accurate LLM Serving
arXiv 2023
Affiliations
No known affiliations.
Frequent co-authors
10 from 7 papers