Thomas Icard

3 papers

Authored papers

Internal Causal Mechanisms Robustly Predict Language Model Out-of-Distribution Behaviors

arXiv 2025

Belief in the Machine: Investigating Epistemological Blind Spots of Language Models

arXiv 2024

A Reply to Makelov et al. (2023)'s "Interpretability Illusion" Arguments

arXiv 2024

No known affiliations.

Co-authors from 3 papers

Christopher Potts

Jing Huang

Aryaman Arora

Atticus Geiger

Dan Jurafsky

Daniel E. Ho

Diyi Yang

Federico Bianchi

James Zou

Junyi Tao