Tianyu Pang
- Papers: 52
Authored papers
Orient Anything V2: Unifying Orientation and Rotation Understanding
arXiv 2026
Rethinking the Trust Region in LLM Reinforcement Learning
arXiv 2026
Your Agent, Their Asset: A Real-World Safety Analysis of OpenClaw
arXiv 2026
Reinforcing Few-step Generators via Reward-Tilted Distribution Matching
arXiv 2026
OpenSearch-VL: An Open Recipe for Frontier Multimodal Search Agents
arXiv 2026
VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use
arXiv 2025
Understanding R1-Zero-Like Training: A Critical Perspective
arXiv 2025
SkyLadder: Better and Faster Pretraining via Context Window Scheduling
arXiv 2025
NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation
arXiv 2025
Optimizing Anytime Reasoning via Budget Relative Policy Optimization
arXiv 2025
FlowReasoner: Reinforcing Query-Level Meta-Agents
arXiv 2025
Adversarial Attacks against Closed-Source MLLMs via Feature Optimal Alignment
arXiv 2025
QuickVideo: Real-Time Long Video Understanding with System Algorithm Co-Design
arXiv 2025
Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs
arXiv 2025
Diffusion Language Models are Super Data Learners
arXiv 2025
Variational Reasoning for Language Models
arXiv 2025
Defeating the Training-Inference Mismatch via FP16
arXiv 2025
LIFT the Veil for the Truth: Principal Weights Emerge after Rank Reduction for Reasoning-Focused Supervised Fine-Tuning
arXiv 2025
Language Models Can Learn from Verbal Feedback Without Scalar Rewards
arXiv 2025
Reinforcing General Reasoning without Verifiers
arXiv 2025
Safety at Scale: A Comprehensive Survey of Large Model Safety
arXiv 2025
Fostering Video Reasoning via Next-Event Prediction
arXiv 2025
Efficient Process Reward Model Training via Active Learning
arXiv 2025
Lifelong Safety Alignment for Language Models
arXiv 2025
When Attention Sink Emerges in Language Models: An Empirical View
arXiv 2024
Scaling up Masked Diffusion Models on Text
arXiv 2024
Improved Techniques for Optimization-Based Jailbreaking on Large Language Models
arXiv 2024
Orient Anything: Learning Robust Object Orientation Estimation from Rendering 3D Models
arXiv 2024
When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training
arXiv 2024
Model Balancing Helps Low-data Training and Fine-tuning
arXiv 2024
RegMix: Data Mixture as Regression for Language Model Pre-training
arXiv 2024
Weak-to-Strong Jailbreaking on Large Language Models
arXiv 2024
Self-Distillation Bridges Distribution Gap in Language Model Fine-Tuning
arXiv 2024
Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs
arXiv 2024
Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast
arXiv 2024
Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates
arXiv 2024
Improving Long-Text Alignment for Text-to-Image Diffusion Models
arXiv 2024
SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction
arXiv 2024
Bootstrapping Language Models with DPO Implicit Rewards
arXiv 2024
LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition
arXiv 2023
On Evaluating Adversarial Robustness of Large Vision-Language Models
NeurIPS 2023
Nonparametric Generative Modeling with Conditional Sliced-Wasserstein Flows
arXiv 2023
A Recipe for Watermarking Diffusion Models
arXiv 2023
Better Diffusion Models Further Improve Adversarial Training
arXiv 2023
Efficient Diffusion Policies for Offline Reinforcement Learning
Finetuning Text-to-Image Diffusion Models for Fairness
arXiv 2023
Bag of Tricks for Training Data Extraction from Language Models
arXiv 2023
On Calibrating Diffusion Probabilistic Models
Intriguing Properties of Data Attribution on Diffusion Models
arXiv 2023
BAFFLE: A Baseline of Backpropagation-Free Federated Learning
arXiv 2023
Robustness and Accuracy Could Be Reconcilable by (Proper) Definition
arXiv 2022
Adversarial Attacks and Defences Competition
arXiv 2018
Frequent co-authors
10 from 52 papers