Attribution
From Directions to Regions: Decomposing Activations in Language Models via Local Geometry
arXiv 2026
Decomposing MLP Activations into Interpretable Features via Semi-Nonnegative Matrix Factorization
arXiv 2025
from 2 papers
Atticus Geiger
Mor Geva
Omri Fahn
Shaked Ronen
Shauli Ravfogel