Adam Karvonen
5 papers
Authored papers
SAEBench: A Comprehensive Benchmark for Sparse Autoencoders in Language Model Interpretability
arXiv 2025
Activation Oracles: Training and Evaluating LLMs as General-Purpose Activation Explainers
arXiv 2025
Steering Out-of-Distribution Generalization with Concept Ablation Fine-Tuning
arXiv 2025
Learning Multi-Level Features with Matryoshka Sparse Autoencoders
arXiv 2025
Measuring Progress in Dictionary Learning for Language Model Interpretability with Board Game Models
arXiv 2024
Affiliations
No known affiliations.
Frequent co-authors
10 from 5 papers