Attribution
SafeGRPO: Self-Rewarded Multimodal Safety Alignment via Rule-Governed Policy Optimization
arXiv 2025
from 1 paper
Bo Du
Daiguo Zhou
Mang Ye
Wenke Huang
Xuankun Rong