Róbert Csordás

Papers: 10

Attribution

Affiliations & profile: Semantic Scholar

Authored papers

Do Language Models Use Their Depth Efficiently?

arXiv 2025

Mixture of Sparse Attention: Content-Based Learnable Sparse Attention via Expert-Choice Routing

arXiv 2025

Chain-of-Experts: Unlocking the Communication Power of Mixture-of-Experts Models

arXiv 2025

MoEUT: Mixture-of-Experts Universal Transformers

arXiv 2024

Recurrent Neural Networks Learn to Store and Generate Sequences using Non-Linear Representations

arXiv 2024

MrT5: Dynamic Token Merging for Efficient Byte-level Language Models

arXiv 2024

SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention

arXiv 2023

Randomized Positional Encodings Boost Length Generalization of Transformers

arXiv 2023

Approximating Two-Layer Feedforward Networks for Efficient Transformers

arXiv 2023

A Modern Self-Referential Weight Matrix That Learns to Modify Itself

arXiv 2022

Affiliations

No known affiliations.

Frequent co-authors

from 10 papers

Jürgen Schmidhuber (5 shared papers)

Christopher D. Manning

Christopher Potts

Kazuki Irie

Piotr Piękos

Anian Ruoss

Atticus Geiger

Grégoire Delétang

Imanol Schlag

Jiajun Wu