Vivek Hebbar

Attribution

1 paper

Authored papers

Latent Adversarial Training Improves Robustness to Persistent Harmful Behaviors in LLMs

arXiv 2024

No known affiliations.

from 1 paper

Abhay Sheshadri

Aengus Lynch

Aidan Ewart

Asa Cooper Stickland

researcher

Cindy Wu

Dylan Hadfield-Menell

Ethan Perez

Henry Sleight

Phillip Guo

Stephen Casper