We develop a theoretical background for Lagrangian optimization on the unit hypersphere and its tangent bundle. Our methods are inexact in two respects: they are projection-based, and we perturb the functional optimization with epsilon-type quantities. We draw connections to the attention mechanism and the Transformer, since attention can be viewed as a flow map in the tangent fiber of each token on the high-dimensional unit sphere. Our motivation is twofold: first, we study the attention mechanism through its flow map and its relation to the classical calculus of variations and Lagrangian optimization; second, we study a range of variational problems on the unit hypersphere that are of broader mathematical interest in approximate, variational contexts.
Inexact calculus of variations on the hyperspherical tangent bundle with connections to the attention mechanism
- Year: 2025
- Abstract & full text: arxiv.org/abs/2507.15431