Rui Zheng

Papers
26

Authored papers

ABC-Bench: Benchmarking Agentic Backend Coding in Real-World Development

arXiv 2026

Better Process Supervision with Bi-directional Rewarding Signals

arXiv 2025

Nex-N1: Agentic Models Trained via a Unified Ecosystem for Large-Scale Environment Construction

arXiv 2025

AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning

arXiv 2025

BAPO: Stabilizing Off-Policy Reinforcement Learning for LLMs via Balanced Policy Optimization with Adaptive Clipping

arXiv 2025

Critique-RL: Training Language Models for Critiquing through Two-Stage Reinforcement Learning

arXiv 2025

EasyJailbreak: A Unified Framework for Jailbreaking Large Language Models

arXiv 2024

StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback

arXiv 2024

Multi-Programming Language Sandbox for LLMs

arXiv 2024

Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning

arXiv 2024

MouSi: Poly-Visual-Expert Vision-Language Models

arXiv 2024

RMB: Comprehensively Benchmarking Reward Models in LLM Alignment

arXiv 2024

SPA-VL: A Comprehensive Safety Preference Alignment Dataset for Vision Language Model

arXiv 2024

Secrets of RLHF in Large Language Models Part II: Reward Modeling

arXiv 2024

AgentGym: Evolving Large Language Model-based Agents across Diverse Environments

arXiv 2024

Linear Alignment: A Closed-form Solution for Aligning Human Preferences without Tuning and Feedback

arXiv 2024

Aligning Large Language Models from Self-Reference AI Feedback with one General Principle

arXiv 2024

SafeAligner: Safety Alignment against Jailbreak Attacks via Response Disparity Guidance

arXiv 2024

LoRAMoE: Alleviate World Knowledge Forgetting in Large Language Models via MoE-Style Plugin

arXiv 2023

HACK: Learning a Parametric Head and Neck Model for High-fidelity Animation

arXiv 2023

TRACE: A Comprehensive Benchmark for Continual Learning in Large Language Models

arXiv 2023

Self-Polish: Enhance Reasoning in Large Language Models via Problem Refinement

arXiv 2023

Rescue: Ranking LLM Responses with Partial Ordering to Improve Response Generation

arXiv 2023

Orthogonal Subspace Learning for Language Model Continual Learning

arXiv 2023

The Rise and Potential of Large Language Model Based Agents: A Survey

arXiv 2023

InstructUIE: Multi-task Instruction Tuning for Unified Information Extraction

arXiv 2023

Affiliations

No known affiliations.

Frequent co-authors

10 (from 26 papers)