Ying Sheng
Stanford CS PhD; co-founder and lead author of SGLang, FlexGen, and S-LoRA; previously co-led xAI's inference team.
- Role: researcher
- Currently at: LMSYS Org
- Twitter: twitter.com/ying11231
- GitHub: github.com/Ying1123
- Scholar: scholar.google.com/citations
- Papers: 11
Authored papers (11)
Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference
ICML
Post-Training Sparse Attention with Double Sparsity
arXiv 2024
SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal Behaviors
arXiv 2024
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
NeurIPS
Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90% ChatGPT Quality
blog
H$_2$O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models
arXiv 2023
FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU
arXiv 2023
SGLang: Efficient Execution of Structured Language Model Programs
arXiv 2023
Efficient Memory Management for Large Language Model Serving with PagedAttention
arXiv 2023
S-LoRA: Serving Thousands of Concurrent LoRA Adapters
arXiv 2023
On Optimal Caching and Model Multiplexing for Large Model Inference
arXiv 2023
Frequent co-authors
10 from 11 papers
Lianmin Zheng
grad-student
Ion Stoica
professor / co-founder
Clark Barrett
Dacheng Li
grad-student
Hao Zhang
professor
Zhuohan Li
researcher
Banghua Zhu
professor
Joseph E. Gonzalez
professor
Siyuan Zhuang
researcher