Peter Hase
- Papers: 11
Authored papers
The Truthfulness Spectrum Hypothesis
arXiv 2026
Unlearning Sensitive Information in Multimodal LLMs: Benchmark and Attack-Defense Evaluation
arXiv 2025
LACIE: Listener-Aware Finetuning for Confidence Calibration in Large Language Models
arXiv 2024
Teaching Models to Balance Resisting and Accepting Persuasion
arXiv 2024
The Unreasonable Effectiveness of Easy Training Data for Hard Tasks
arXiv 2024
Can Language Models Teach Weaker Agents? Teacher Explanations Improve Students via Personalization
arXiv 2023
Does Localization Inform Editing? Surprising Differences in Causality-Based Localization vs. Knowledge Editing in Language Models
Can Sensitive Information Be Deleted From LLMs? Objectives for Defending Against Extraction Attacks
arXiv 2023
Are Hard Examples also Harder to Explain? A Study with Human and Model-Generated Explanations
arXiv 2022
When Can Models Learn From Explanations? A Formal Framework for Understanding the Roles of Explanation Data
LNLS (ACL) 2022
Evaluating Explainable AI: Which Algorithmic Explanations Help Users Predict Model Behavior?
Frequent co-authors
10 from 11 papers