Robert Kirk
4 papers
Authored papers
Deep Ignorance: Filtering Pretraining Data Builds Tamper-Resistant Safeguards into Open-Weight LLMs
arXiv 2025
Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models
arXiv 2024
Understanding the Effects of RLHF on LLM Generalisation and Diversity
arXiv 2023
Reward Model Ensembles Help Mitigate Overoptimization
arXiv 2023
Affiliations
No known affiliations.
Frequent co-authors
10 from 4 papers