Attribution
MUDDFormer: Breaking Residual Bottlenecks in Transformers via Multiway Dynamic Dense Connections
arXiv 2025
Improving Transformers with Dynamically Composable Multi-Head Attention
arXiv 2024
from 2 papers
Da Xiao
Qingye Meng
Xingyuan Yuan