Rui Zheng
- Papers: 26
Authored papers (26)
ABC-Bench: Benchmarking Agentic Backend Coding in Real-World Development
arXiv 2026
Better Process Supervision with Bi-directional Rewarding Signals
arXiv 2025
Nex-N1: Agentic Models Trained via a Unified Ecosystem for Large-Scale Environment Construction
arXiv 2025
AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning
arXiv 2025
BAPO: Stabilizing Off-Policy Reinforcement Learning for LLMs via Balanced Policy Optimization with Adaptive Clipping
arXiv 2025
Critique-RL: Training Language Models for Critiquing through Two-Stage Reinforcement Learning
arXiv 2025
EasyJailbreak: A Unified Framework for Jailbreaking Large Language Models
arXiv 2024
StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback
arXiv 2024
Multi-Programming Language Sandbox for LLMs
arXiv 2024
Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning
arXiv 2024
MouSi: Poly-Visual-Expert Vision-Language Models
arXiv 2024
RMB: Comprehensively Benchmarking Reward Models in LLM Alignment
arXiv 2024
SPA-VL: A Comprehensive Safety Preference Alignment Dataset for Vision Language Model
arXiv 2024
Secrets of RLHF in Large Language Models Part II: Reward Modeling
arXiv 2024
AgentGym: Evolving Large Language Model-based Agents across Diverse Environments
arXiv 2024
Linear Alignment: A Closed-form Solution for Aligning Human Preferences without Tuning and Feedback
arXiv 2024
Aligning Large Language Models from Self-Reference AI Feedback with one General Principle
arXiv 2024
SafeAligner: Safety Alignment against Jailbreak Attacks via Response Disparity Guidance
arXiv 2024
LoRAMoE: Alleviate World Knowledge Forgetting in Large Language Models via MoE-Style Plugin
arXiv 2023
HACK: Learning a Parametric Head and Neck Model for High-fidelity Animation
arXiv 2023
TRACE: A Comprehensive Benchmark for Continual Learning in Large Language Models
arXiv 2023
Self-Polish: Enhance Reasoning in Large Language Models via Problem Refinement
arXiv 2023
Rescue: Ranking LLM Responses with Partial Ordering to Improve Response Generation
arXiv 2023
Orthogonal Subspace Learning for Language Model Continual Learning
arXiv 2023
The Rise and Potential of Large Language Model Based Agents: A Survey
arXiv 2023
InstructUIE: Multi-task Instruction Tuning for Unified Information Extraction
arXiv 2023
Frequent co-authors
10 from 26 papers