Songyang Gao
- Papers: 13
Authored papers (13)
Text-Only Data Synthesis for Vision Language Model Training
arXiv 2025
CompassVerifier: A Unified and Robust Verifier for LLMs Evaluation and Outcome Reward
arXiv 2025
Pre-Trained Policy Discriminators are General Reward Models
arXiv 2025
Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning
arXiv 2025
EasyJailbreak: A Unified Framework for Jailbreaking Large Language Models
arXiv 2024
Are Your LLMs Capable of Stable Reasoning?
arXiv 2024
Secrets of RLHF in Large Language Models Part II: Reward Modeling
arXiv 2024
AgentGym: Evolving Large Language Model-based Agents across Diverse Environments
arXiv 2024
ToolEyes: Fine-Grained Evaluation for Tool Learning Capabilities of Large Language Models in Real-world Scenarios
arXiv 2024
Linear Alignment: A Closed-form Solution for Aligning Human Preferences without Tuning and Feedback
arXiv 2024
LoRAMoE: Alleviate World Knowledge Forgetting in Large Language Models via MoE-Style Plugin
arXiv 2023
TRACE: A Comprehensive Benchmark for Continual Learning in Large Language Models
arXiv 2023
Self-Polish: Enhance Reasoning in Large Language Models via Problem Refinement
arXiv 2023
Affiliations
Frequent co-authors
10 from 13 papers