Zhiheng Xi

Papers
36

Authored papers

36

AI Can Learn Scientific Taste

arXiv 2026

2026

AgentDoG: A Diagnostic Guardrail Framework for AI Agent Safety and Security

arXiv 2026

2026

Locate, Steer, and Improve: A Practical Survey of Actionable Mechanistic Interpretability in Large Language Models

arXiv 2026

2026

LLMEval-Logic: A Solver-Verified Chinese Benchmark for Logical Reasoning of LLMs with Adversarial Hardening

arXiv 2026

2026

OpenNovelty: An LLM-powered Agentic System for Verifiable Scholarly Novelty Assessment

arXiv 2026

2026

ABC-Bench: Benchmarking Agentic Backend Coding in Real-World Development

arXiv 2026

2026

Which Reasoning Trajectories Teach Students to Reason Better? A Simple Metric of Informative Alignment

arXiv 2026

2026

SciAgentGym: Benchmarking Multi-Step Scientific Tool-use in LLM Agents

arXiv 2026

2026

FRoM-W1: Towards General Humanoid Whole-Body Control with Language Instructions

arXiv 2026

2026

Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training

arXiv 2025

2025

CritiQ: Mining Data Quality Criteria from Human Preferences

arXiv 2025

2025

Better Process Supervision with Bi-directional Rewarding Signals

arXiv 2025

2025

Nex-N1: Agentic Models Trained via a Unified Ecosystem for Large-Scale Environment Construction

arXiv 2025

2025

AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning

arXiv 2025

2025

Pre-Trained Policy Discriminators are General Reward Models

arXiv 2025

2025

BAPO: Stabilizing Off-Policy Reinforcement Learning for LLMs via Balanced Policy Optimization with Adaptive Clipping

arXiv 2025

2025

Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination

arXiv 2025

2025

BMMR: A Large-Scale Bilingual Multimodal Multi-Discipline Reasoning Dataset

arXiv 2025

2025

Feedback-Driven Tool-Use Improvements in Large Language Models via Automated Build Environments

arXiv 2025

2025

PFDial: A Structured Dialogue Instruction Fine-tuning Method Based on UML Flowcharts

arXiv 2025

2025

Critique-RL: Training Language Models for Critiquing through Two-Stage Reinforcement Learning

arXiv 2025

2025

EasyJailbreak: A Unified Framework for Jailbreaking Large Language Models

arXiv 2024

2024

Distill Visual Chart Reasoning Ability from LLMs to MLLMs

arXiv 2024

2024

StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback

arXiv 2024

2024

Secrets of RLHF in Large Language Models Part II: Reward Modeling

arXiv 2024

2024

AgentGym: Evolving Large Language Model-based Agents across Diverse Environments

arXiv 2024

2024

RMB: Comprehensively Benchmarking Reward Models in LLM Alignment

arXiv 2024

2024

Mitigating Tail Narrowing in LLM Self-Improvement via Socratic-Guided Sampling

arXiv 2024

2024

Multi-Programming Language Sandbox for LLMs

arXiv 2024

2024

Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning

arXiv 2024

2024

MouSi: Poly-Visual-Expert Vision-Language Models

arXiv 2024

2024

Self-Demos: Eliciting Out-of-Demonstration Generalizability in Large Language Models

arXiv 2024

2024

LoRAMoE: Alleviate World Knowledge Forgetting in Large Language Models via MoE-Style Plugin

arXiv 2023

2023

The Rise and Potential of Large Language Model Based Agents: A Survey

arXiv 2023

2023

Self-Polish: Enhance Reasoning in Large Language Models via Problem Refinement

arXiv 2023

2023

TRACE: A Comprehensive Benchmark for Continual Learning in Large Language Models

arXiv 2023

2023

Affiliations

No known affiliations.

Frequent co-authors

10, from 36 papers