Ion Stoica
UC Berkeley CS professor; co-founder of Databricks, Anyscale, and LMSYS / LMArena; advisor on the academic side of the Arena infrastructure.
- Role: professor / co-founder
- Currently at: University of California, Berkeley
- Twitter: twitter.com/istoica05
- GitHub: github.com/istoica
- Scholar: scholar.google.com/citations
- Papers: 45
Authored papers
K-Search: LLM Kernel Generation via Co-Evolving Intrinsic World Model
arXiv 2026
Flash-KMeans: Fast and Memory-Efficient Exact K-Means
arXiv 2026
ClawEnvKit: Automatic Environment Generation for Claw-Like Agents
arXiv 2026
AstraFlow: Dataflow-Oriented Reinforcement Learning for Agentic LLMs
arXiv 2026
VisGym: Diverse, Customizable, Scalable Environments for Multimodal Agents
arXiv 2026
DeepScholar-Bench: A Live Benchmark and Automated Evaluation for Generative Research Synthesis
arXiv 2025
Fast Video Generation with Sliding Tile Attention
arXiv 2025
lmgame-Bench: How Good are LLMs at Playing Games?
arXiv 2025
Sparse VideoGen2: Accelerate Video Generation with Sparse Attention via Semantic-Aware Permutation
arXiv 2025
Why Do Multi-Agent LLM Systems Fail?
arXiv 2025
S*: Test Time Scaling for Code Generation
arXiv 2025
Prompt-to-Leaderboard
arXiv 2025
GSO: Challenging Software Optimization Tasks for Evaluating SWE-Agents
arXiv 2025
Sleep-time Compute: Beyond Inference Scaling at Test-time
arXiv 2025
TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times
arXiv 2025
FrontierCS: Evolving Challenges for Evolving Intelligence
arXiv 2025
GEPA: Reflective Prompt Evolution Can Outperform Reinforcement Learning
arXiv 2025
SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse-Linear Attention
arXiv 2025
Radial Attention: $O(n\log n)$ Sparse Attention with Energy Decay for Long Video Generation
arXiv 2025
The Danger of Overthinking: Examining the Reasoning-Action Dilemma in Agentic Tasks
arXiv 2025
Efficient Long-context Language Model Training by Core Attention Disaggregation
arXiv 2025
Optimizing Model Selection for Compound AI Systems
arXiv 2025
From Crowdsourced Data to High-Quality Benchmarks: Arena-Hard and BenchBuilder Pipeline
preprint
Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference
ICML
LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code
NeurIPS
RouteLLM: Learning to Route LLMs with Preference Data
arXiv 2024
Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
arXiv 2024
Post-Training Sparse Attention with Double Sparsity
arXiv 2024
GoEX: Perspectives and Designs Towards a Runtime for Autonomous LLM Applications
arXiv 2024
JudgeBench: A Benchmark for Evaluating LLM-based Judges
arXiv 2024
Efficient LLM Scheduling by Learning to Rank
arXiv 2024
How to Evaluate Reward Models for RLHF
arXiv 2024
Mélange: Cost Efficient Large Language Model Serving by Exploiting GPU Heterogeneity
arXiv 2024
OR-Bench: An Over-Refusal Benchmark for Large Language Models
arXiv 2024
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
NeurIPS
Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90% ChatGPT Quality
blog
FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU
arXiv 2023
SGLang: Efficient Execution of Structured Language Model Programs
arXiv 2023
Efficient Memory Management for Large Language Model Serving with PagedAttention
arXiv 2023
S-LoRA: Serving Thousands of Concurrent LoRA Adapters
arXiv 2023
Rethinking Benchmark and Contamination for Language Models with Rephrased Samples
arXiv 2023
DISTFLASHATTN: Distributed Memory-efficient Attention for Long-context LLMs Training
arXiv 2023
Online Speculative Decoding
arXiv 2023
CLUTR: Curriculum Learning via Unsupervised Task Representation Learning
arXiv 2022
Ray: A Distributed Framework for Emerging AI Applications
arXiv 2017
Eval contributions
1
Affiliations
Previously
Frequent co-authors
10 from 45 papers
Joseph E. Gonzalez
Hao Zhang (professor)
Wei-Lin Chiang (co-founder / President)
Lianmin Zheng (grad-student)
Ying Sheng (researcher)
Shuo Yang
Dacheng Li (grad-student)
Kurt Keutzer
Siyuan Zhuang (researcher)
Banghua Zhu (professor)