Ruihang Lai
5 papers
Authored papers
Gecko: An Efficient Neural Architecture Inherently Processing Sequences with Arbitrary Lengths
arXiv 2026
FlashInfer: Efficient and Customizable Attention Engine for LLM Inference Serving
arXiv 2025
WebLLM: A High-Performance In-Browser LLM Inference Engine
arXiv 2024
XGrammar: Flexible and Efficient Structured Generation Engine for Large Language Models
arXiv 2024
MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context Generation with Speculative Decoding
arXiv 2024
Affiliations
No known affiliations.
Frequent co-authors
10 from 5 papers