Souvik Kundu
- Papers: 11
Authored papers (11)
LANTERN++: Enhancing Relaxed Speculative Decoding with Static Tree Drafting for Visual Auto-regressive Models
arXiv 2025
SEAL: Steerable Reasoning Calibration of Large Language Models for Free
arXiv 2025
GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM
arXiv 2024
Demystifying AI Platform Design for Distributed Inference of Next-Generation LLM models
arXiv 2024
LaMDA: Large Model Fine-Tuning via Spectrally Decomposed Low-Dimensional Adaptation
arXiv 2024
Understanding the Performance and Estimating the Cost of LLM Fine-Tuning
arXiv 2024
ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization
arXiv 2024
Etalon: Holistic Performance Evaluation Framework for LLM Inference Systems
arXiv 2024
NeRFool: Uncovering the Vulnerability of Generalizable Neural Radiance Fields against Adversarial Perturbations
arXiv 2023
Fusing Models with Complementary Expertise
arXiv 2023
Hybrid Ranking Network for Text-to-SQL
arXiv 2020
Affiliations
Frequent co-authors
10 from 11 papers