Attribution
MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs
arXiv 2024
LEMON: Lossless model expansion
arXiv 2023
from 2 papers
Haibin Lin
Ding Zhou
Hanlin Lu
Haohan Xu
Haoran Wei
Hongmin Chen
Hongxia Yang
Jiahao Su
Jianbo Yuan
Jianxi Ye