Can Rager
6 papers
Authored papers
SAEBench: A Comprehensive Benchmark for Sparse Autoencoders in Language Model Interpretability
arXiv 2025
NNsight and NDIF: Democratizing Access to Open-Weight Foundation Model Internals
arXiv 2024
Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models
arXiv 2024
Measuring Progress in Dictionary Learning for Language Model Interpretability with Board Game Models
arXiv 2024
Structured World Representations in Maze-Solving Transformers
arXiv 2023
A Configurable Library for Generating and Manipulating Maze Datasets
arXiv 2023
Affiliations
No known affiliations.
Frequent co-authors
10 from 6 papers