Chao Du
- Papers: 44
Authored papers (44)
Orient Anything V2: Unifying Orientation and Rotation Understanding
arXiv 2026
Rethinking the Trust Region in LLM Reinforcement Learning
arXiv 2026
VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use
arXiv 2025
UFO2: The Desktop AgentOS
arXiv 2025
Understanding R1-Zero-Like Training: A Critical Perspective
arXiv 2025
NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation
arXiv 2025
Optimizing Anytime Reasoning via Budget Relative Policy Optimization
arXiv 2025
FlowReasoner: Reinforcing Query-Level Meta-Agents
arXiv 2025
Adversarial Attacks against Closed-Source MLLMs via Feature Optimal Alignment
arXiv 2025
QuickVideo: Real-Time Long Video Understanding with System Algorithm Co-Design
arXiv 2025
Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs
arXiv 2025
Diffusion Language Models are Super Data Learners
arXiv 2025
Variational Reasoning for Language Models
arXiv 2025
Defeating the Training-Inference Mismatch via FP16
arXiv 2025
Language Models Can Learn from Verbal Feedback Without Scalar Rewards
arXiv 2025
Reinforcing General Reasoning without Verifiers
arXiv 2025
Fostering Video Reasoning via Next-Event Prediction
arXiv 2025
Lifelong Safety Alignment for Language Models
arXiv 2025
When Attention Sink Emerges in Language Models: An Empirical View
arXiv 2024
Sample-Efficient Alignment for LLMs
arXiv 2024
Scaling up Masked Diffusion Models on Text
arXiv 2024
Improved Techniques for Optimization-Based Jailbreaking on Large Language Models
arXiv 2024
Weak-to-Strong Jailbreaking on Large Language Models
arXiv 2024
Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs
arXiv 2024
Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast
arXiv 2024
Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates
arXiv 2024
Improving Long-Text Alignment for Text-to-Image Diffusion Models
arXiv 2024
SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction
arXiv 2024
Orient Anything: Learning Robust Object Orientation Estimation from Rendering 3D Models
arXiv 2024
When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training
arXiv 2024
Bootstrapping Language Models with DPO Implicit Rewards
arXiv 2024
TaskWeaver: A Code-First Agent Framework
arXiv 2023
LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition
arXiv 2023
On Evaluating Adversarial Robustness of Large Vision-Language Models
NeurIPS 2023
A Recipe for Watermarking Diffusion Models
arXiv 2023
Better Diffusion Models Further Improve Adversarial Training
arXiv 2023
Efficient Diffusion Policies for Offline Reinforcement Learning
Exploring Model Dynamics for Accumulative Poisoning Discovery
arXiv 2023
Finetuning Text-to-Image Diffusion Models for Fairness
arXiv 2023
Bag of Tricks for Training Data Extraction from Language Models
arXiv 2023
On Calibrating Diffusion Probabilistic Models
Intriguing Properties of Data Attribution on Diffusion Models
arXiv 2023
BAFFLE: A Baseline of Backpropagation-Free Federated Learning
arXiv 2023
Nonparametric Generative Modeling with Conditional Sliced-Wasserstein Flows
arXiv 2023
Frequent co-authors
10 from 44 papers