Attribution
Superpose Singular Features for Model Merging
arXiv 2025
Why Low-Precision Transformer Training Fails: An Analysis on Flash Attention
Understanding Expressivity of GNN in Rule Learning
arXiv 2023
from 3 papers
Quanming Yao
Yong Li
Yongqi Zhang
You Wu