Attribution
Decouple Searching from Training: Scaling Data Mixing via Model Merging for Large Language Model Pre-training
arXiv 2026
Fangcheng Shi
Fei Zhao
Haifeng Liu
Jieying Ye
Shaosheng Cao
Shengrui Li
Yao Hu
Zheyong Xie