Attribution
PRIMA.CPP: Speeding Up 70B-Scale LLM Inference on Low-Resource Everyday Home Clusters
arXiv 2025
TPI-LLM: Serving 70B-scale LLMs Efficiently on Low-resource Edge Devices
arXiv 2024
from 2 papers
Mohsen Guizani
Wenjiao Feng
Zonghang Li
Tao Li