Tomek Korbak
4 papers
Authored papers
Reasoning Models Struggle to Control their Chains of Thought
arXiv 2026
Deep Ignorance: Filtering Pretraining Data Builds Tamper-Resistant Safeguards into Open-Weight LLMs
arXiv 2025
Async Control: Stress-testing Asynchronous Control Measures for LLM Agents
arXiv 2025
Looking Inward: Language Models Can Learn About Themselves by Introspection
arXiv 2024
Affiliations
No known affiliations.
Frequent co-authors
10 from 4 papers