Chi Zhang
PKU-Alignment researcher; co-author of the PKU-SafeRLHF and BeaverTails safety datasets and of papers on aligned LLMs.
- Role: researcher
- Currently at: Peking University
- GitHub: Unknown
- Scholar: scholar.google.com/scholar
- Papers: 54
Authored papers (54)
InternAgent-1.5: A Unified Agentic Framework for Long-Horizon Autonomous Scientific Discovery
arXiv 2026
Being-H0.5: Scaling Human-Centric Robot Learning for Cross-Embodiment Generalization
arXiv 2026
Reinforcing Few-step Generators via Reward-Tilted Distribution Matching
arXiv 2026
LongCat-Next: Lexicalizing Modalities as Discrete Tokens
arXiv 2026
DIVE: Scaling Diversity in Agentic Task Synthesis for Generalizable Tool Use
arXiv 2026
FAPO: Flawed-Aware Policy Optimization for Efficient and Reliable Reasoning
arXiv 2025
A Survey of Scientific Large Language Models: From Data Foundations to Agent Frontiers
arXiv 2025
RoboRefer: Towards Spatial Referring with Reasoning in Vision-Language Models for Robotics
arXiv 2025
Distill Any Depth: Distillation Creates a Stronger Monocular Depth Estimator
arXiv 2025
ViStoryBench: Comprehensive Benchmark Suite for Story Visualization
arXiv 2025
Meta-rater: A Multi-dimensional Data Selection Method for Pre-training Language Models
arXiv 2025
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention
arXiv 2025
SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond
arXiv 2025
MotionAgent: Fine-grained Controllable Video Generation via Motion Field Agent
ICCV 2025
Seed-Prover: Deep and Broad Reasoning for Automated Theorem Proving
arXiv 2025
Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer
arXiv 2025
Fast3Dcache: Training-free 3D Geometry Synthesis Acceleration
arXiv 2025
WebExplorer: Explore and Evolve for Training Long-Horizon Web Agents
arXiv 2025
Fin-PRM: A Domain-Specialized Process Reward Model for Financial Reasoning in Large Language Models
arXiv 2025
LangGPT: Rethinking Structured Reusable Prompt Design Framework for LLMs from the Programming Language
arXiv 2024
Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond
arXiv 2024
MeshAnything V2: Artist-Created Mesh Generation With Adjacent Mesh Tokenization
ICCV 2025
MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers
arXiv 2024
UniScene: Unified Occupancy-centric Driving Scene Generation
CVPR 2025
StyleStudio: Text-Driven Style Transfer with Selective Control of Style Elements
CVPR 2025
Ray Denoising: Depth-aware Hard Negative Sampling for Multi-view 3D Object Detection
arXiv 2024
MVPaint: Synchronized Multi-View Diffusion for Painting Anything 3D
CVPR 2025
Metric3Dv2: A Versatile Monocular Geometric Foundation Model for Zero-shot Metric Depth and Surface Normal Estimation
arXiv 2024
EMMA: Your Text-to-Image Diffusion Model Can Secretly Accept Multi-Modal Prompts
arXiv 2024
Multi-Agent Collaborative Data Selection for Efficient LLM Pretraining
arXiv 2024
FipTR: A Simple yet Effective Transformer Framework for Future Instance Prediction in Autonomous Driving
arXiv 2024
HybridFlow: A Flexible and Efficient RLHF Framework
arXiv 2024
Reflections from the 2024 Large Language Model (LLM) Hackathon for Applications in Materials Science and Chemistry
arXiv 2024
MeshXL: Neural Coordinate Field for Generative 3D Foundation Models
arXiv 2024
MovieLLM: Enhancing Long Video Understanding with AI-Generated Movies
arXiv 2024
BeaverTails: Towards Improved Safety Alignment of LLM via a Human-Preference Dataset
NeurIPS 2023
AppAgent: Multimodal Agents as Smartphone Users
arXiv 2023
GaussianEditor: Swift and Controllable 3D Editing with Gaussian Splatting
CVPR 2024
LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning
arXiv 2023
ESSAformer: Efficient Transformer for Hyperspectral Image Super-resolution
ICCV 2023
Creative Agents: Empowering Agents with Imagination for Creative Tasks
arXiv 2023
Two-Stage Constrained Actor-Critic for Short Video Recommendation
arXiv 2023
MEWL: Few-shot multimodal word learning with referential uncertainty
arXiv 2023
Deep Learning Based Joint Beamforming Design in IRS-Assisted Secure Communications
arXiv 2023
DreamGaussian4D: Generative 4D Gaussian Splatting
arXiv 2023
FaceStudio: Put Your Face Everywhere in Seconds
arXiv 2023
StableLLaVA: Enhanced Visual Instruction Tuning with Synthesized Image-Dialogue Data
arXiv 2023
IT3D: Improved Text-to-3D Generation with Explicit View Synthesis
arXiv 2023
PivotNet: Vectorized Pivot Learning for End-to-end HD Map Construction
ICCV 2023
M3DBench: Let's Instruct Large Models with Multi-modal 3D Prompts
arXiv 2023
X-VoE: Measuring eXplanatory Violation of Expectation in Physical Events
ICCV 2023
StyleAvatar3D: Leveraging Image-Text Diffusion Models for High-Fidelity 3D Avatar Generation
arXiv 2023
Evaluating and Inducing Personality in Pre-trained Language Models
arXiv 2022
End-to-End Human Object Interaction Detection with HOI Transformer
arXiv 2021
Frequent co-authors
10 from 54 papers