Jiancan Wu
7 papers
Authored papers
On the Direction of RLVR Updates for LLM Reasoning: Identification and Exploitation
arXiv 2026
Quantile Advantage Estimation for Entropy-Safe Reasoning
arXiv 2025
Robust Preference Optimization via Dynamic Target Margins
arXiv 2025
RePO: ReLU-based Preference Optimization
arXiv 2025
$β$-DPO: Direct Preference Optimization with Dynamic $β$
arXiv 2024
Towards Robust Alignment of Language Models: Distributionally Robustifying Direct Preference Optimization
arXiv 2024
MuggleMath: Assessing the Impact of Query and Response Augmentation on Math Reasoning
arXiv 2023
Affiliations
No known affiliations.
Frequent co-authors
10 from 7 papers