Emil Ryd

2 papers

Authored papers

Eliciting Secret Knowledge from Language Models

arXiv 2025

Towards eliciting latent knowledge from LLMs with mechanistic interpretability

arXiv 2025

No known affiliations.

from 2 papers

Bartosz Cywiński

Neel Nanda

researcher

Senthooran Rajamanoharan

Arthur Conmy

Rowan Wang

Samuel Marks