Ruqi Zhang
- Papers: 12
Authored papers (12)
Addressing Performance Saturation for LLM RL via Precise Entropy Curve Control
arXiv 2026
Learning Self-Correction in Vision-Language Models via Rollout Augmentation
arXiv 2026
DRIFT: Learning from Abundant User Dissatisfaction in Real-World Preference Learning
arXiv 2025
CoT-UQ: Improving Response-wise Uncertainty Quantification in LLMs with Chain-of-Thought
arXiv 2025
Making Reliable and Flexible Decisions in Long-tailed Classification
arXiv 2025
Sherlock: Self-Correcting Reasoning in Vision-Language Models
arXiv 2025
Energy-Based Reward Models for Robust Language Model Alignment
arXiv 2025
Training Bayesian Neural Networks with Sparse Subspace Variational Inference
arXiv 2024
Cascade Reward Sampling for Efficient Decoding-Time Alignment
arXiv 2024
ETA: Evaluating Then Aligning Safety of Vision Language Models at Inference Time
arXiv 2024
DP-Fast MH: Private, Fast, and Accurate Metropolis-Hastings for Large-Scale Bayesian Inference
arXiv 2023
Entropy-MCMC: Sampling from Flat Basins with Ease
arXiv 2023
Frequent co-authors
10 from 12 papers