Attribution
Representation Noising: A Defence Mechanism Against Harmful Finetuning
arXiv 2024
Long-form evaluation of model editing
from 2 papers
Domenic Rosati
Frank Rudzicz
Hassan Sajjad
Carsten Maple
David Atanasov
Jan Wehner
Jinkun Chen
Kai Williams
Łukasz Bartoszcze
Melis Erkan