Attribution
Mélange: Cost Efficient Large Language Model Serving by Exploiting GPU Heterogeneity
arXiv 2024
Online Speculative Decoding
arXiv 2023
GACT: Activation Compressed Training for Generic Network Architectures
arXiv 2022
from 3 papers
Alvin Cheung
Ion Stoica
professor / co-founder
Dequan Wang
Doyoung Kim
Hao Zhang
professor
Jianfei Chen
Jiaxiang Yu
Jie Tang
engineer
Joey Gonzalez
Lanxiang Hu