A mathematical perspective on Transformers

A mathematical framework analyzing Transformers as interacting particle systems reveals the emergence of clusters over time.

Year: 2023
Venue: arXiv 2023
Authors: 4


Abstract & full text: arxiv.org/abs/2312.10794v4

Abstract

Transformers play a central role in the inner workings of large language models. We develop a mathematical framework for analyzing Transformers based on their interpretation as interacting particle systems, which reveals that clusters emerge in long time. Our study explores the underlying theory and offers new perspectives for mathematicians as well as computer scientists.
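
To make the particle-system picture concrete, here is a minimal numerical sketch. It is not taken from the abstract. It assumes a toy model in which each token is a point on the unit sphere that drifts toward a softmax-weighted average of all tokens. The update rule, the inverse temperature `beta`, the step size `dt`, and every function name below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def random_sphere_points(n, d, seed=0):
    """Draw n points uniformly at random on the unit sphere in R^d."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n, d))
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def attention_step(x, beta=1.0, dt=0.1):
    """One explicit Euler step of the assumed toy attention dynamics."""
    # Pairwise inner products <x_i, x_j>, scaled by beta: shape (n, n).
    logits = beta * (x @ x.T)
    # Row-wise softmax; subtracting the row maximum avoids overflow.
    logits -= logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)
    # Each particle is pulled toward its attention-weighted average.
    drift = weights @ x
    # Keep only the component tangent to the sphere at each x_i.
    drift -= np.sum(drift * x, axis=1, keepdims=True) * x
    x_new = x + dt * drift
    # Renormalize so the particles stay on the unit sphere.
    return x_new / np.linalg.norm(x_new, axis=1, keepdims=True)


def simulate(x0, steps=500, beta=1.0, dt=0.1):
    """Iterate the toy dynamics from the initial configuration x0."""
    x = x0.copy()
    for _ in range(steps):
        x = attention_step(x, beta=beta, dt=dt)
    return x


def min_pairwise_cosine(x):
    """Smallest cosine similarity between any two particles (1.0 means one cluster)."""
    return float((x @ x.T).min())
```

Comparing `min_pairwise_cosine` before and after `simulate` gives a rough indication of whether the particles have moved closer together. How quickly that happens, and whether a single cluster forms, depends on `beta`, the dimension, and the initial configuration.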
