Attribution
Rewriting Pre-Training Data Boosts LLM Performance in Math and Code
arXiv 2025
Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks
from 2 papers
Rio Yokota
Taishi Nakamura
Takumi Okamoto
Daisuke Nohara
Hinari Shimada
Hiroya Takamura
Jun Suzuki
Kakeru Hattori
Kazuki Fujii
Koshiro Saito