Attribution
One-Shot Safety Alignment for Large Language Models via Optimal Dualization (arXiv 2024)
Demystifying Disagreement-on-the-Line in High Dimensions (arXiv 2023)
from 2 papers
Edgar Dobriban
Hamed Hassani
Behrad Moniri
Donghwan Lee
Dongsheng Ding
Osbert Bastani
Shuo Li