Attribution
Deep Ignorance: Filtering Pretraining Data Builds Tamper-Resistant Safeguards into Open-Weight LLMs
arXiv 2025
from 1 paper
Geoffrey Irving
Kyle O'Brien
Quentin Anthony
Robert Kirk
Stella Biderman
founder
Stephen Casper
Tomek Korbak
Xander Davies
Yarin Gal