Chao Du

Papers
44

Authored papers

44

Orient Anything V2: Unifying Orientation and Rotation Understanding

arXiv 2026

2026

Rethinking the Trust Region in LLM Reinforcement Learning

arXiv 2026

2026

VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use

arXiv 2025

2025

UFO²: The Desktop AgentOS

arXiv 2025

2025

Understanding R1-Zero-Like Training: A Critical Perspective

arXiv 2025

2025

NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation

arXiv 2025

2025

Optimizing Anytime Reasoning via Budget Relative Policy Optimization

arXiv 2025

2025

FlowReasoner: Reinforcing Query-Level Meta-Agents

arXiv 2025

2025

Adversarial Attacks against Closed-Source MLLMs via Feature Optimal Alignment

arXiv 2025

2025

QuickVideo: Real-Time Long Video Understanding with System Algorithm Co-Design

arXiv 2025

2025

Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs

arXiv 2025

2025

Diffusion Language Models are Super Data Learners

arXiv 2025

2025

Variational Reasoning for Language Models

arXiv 2025

2025

Defeating the Training-Inference Mismatch via FP16

arXiv 2025

2025

Language Models Can Learn from Verbal Feedback Without Scalar Rewards

arXiv 2025

2025

Reinforcing General Reasoning without Verifiers

arXiv 2025

2025

Fostering Video Reasoning via Next-Event Prediction

arXiv 2025

2025

Lifelong Safety Alignment for Language Models

arXiv 2025

2025

When Attention Sink Emerges in Language Models: An Empirical View

arXiv 2024

2024

Sample-Efficient Alignment for LLMs

arXiv 2024

2024

Scaling up Masked Diffusion Models on Text

arXiv 2024

2024

Improved Techniques for Optimization-Based Jailbreaking on Large Language Models

arXiv 2024

2024

Weak-to-Strong Jailbreaking on Large Language Models

arXiv 2024

2024

Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs

arXiv 2024

2024

Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast

arXiv 2024

2024

Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates

arXiv 2024

2024

Improving Long-Text Alignment for Text-to-Image Diffusion Models

arXiv 2024

2024

SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction

arXiv 2024

2024

Orient Anything: Learning Robust Object Orientation Estimation from Rendering 3D Models

arXiv 2024

2024

When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training

arXiv 2024

2024

Bootstrapping Language Models with DPO Implicit Rewards

arXiv 2024

2024

TaskWeaver: A Code-First Agent Framework

arXiv 2023

2023

LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition

arXiv 2023

2023

On Evaluating Adversarial Robustness of Large Vision-Language Models

NeurIPS 2023 11

2023

A Recipe for Watermarking Diffusion Models

arXiv 2023

2023

Better Diffusion Models Further Improve Adversarial Training

arXiv 2023

2023

Efficient Diffusion Policies for Offline Reinforcement Learning

2023

Exploring Model Dynamics for Accumulative Poisoning Discovery

arXiv 2023

2023

Finetuning Text-to-Image Diffusion Models for Fairness

arXiv 2023

2023

Bag of Tricks for Training Data Extraction from Language Models

arXiv 2023

2023

On Calibrating Diffusion Probabilistic Models

2023

Intriguing Properties of Data Attribution on Diffusion Models

arXiv 2023

2023

BAFFLE: A Baseline of Backpropagation-Free Federated Learning

arXiv 2023

2023

Nonparametric Generative Modeling with Conditional Sliced-Wasserstein Flows

arXiv 2023

2023

Affiliations

No known affiliations.

Frequent co-authors

10

from 44 papers