Attribution
SAEBench: A Comprehensive Benchmark for Sparse Autoencoders in Language Model Interpretability
arXiv 2025
A is for Absorption: Studying Feature Splitting and Absorption in Sparse Autoencoders
arXiv 2024
Identifying Linear Relational Concepts in Large Language Models
arXiv 2023
from 3 papers
Joseph Bloom
Adam Karvonen
Anthony Hunter
Arthur Conmy
Callum McDougall
Can Rager
Curt Tigges
Eoin Farrell
Hardik Bhatnagar
James Wilken-Smith