Jayashree Mohan

Attribution

3 papers

Authored papers

vAttention: Dynamic Memory Management for Serving LLMs without PagedAttention

arXiv 2024

Taming Throughput-Latency Tradeoff in LLM Inference with Sarathi-Serve

arXiv 2024

Etalon: Holistic Performance Evaluation Framework for LLM Inference Systems

arXiv 2024

No known affiliations.

Co-authors (from 3 papers)

Ramachandran Ramjee

Alexey Tumanov

Amey Agrawal

Ashish Panwar

Nipun Kwatra

Nitin Kedia

Ajay Nayak

Anmol Agarwal

Bhargav S. Gulavani

Ramya Prabhu