Attribution
Rethinking Reward Models for Multi-Domain Test-Time Scaling
arXiv 2025
SafeRoute: Adaptive Model Selection for Efficient and Accurate Safety Guardrails in Large Language Models
HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models
arXiv 2024
from 3 papers
Dong Bok Lee
Minki Kang
Seanie Lee
Sung Ju Hwang
Haebin Seong
Juho Lee
Tobias Bocklet
DongKi Kim
Heejun Lee
Jiang Bia