Bo Zheng
- Papers: 50
Authored papers (50)
FashionChameleon: Towards Real-Time and Interactive Human-Garment Video Customization
arXiv 2026
The Flexibility Trap: Why Arbitrary Order Limits Reasoning Potential in Diffusion Language Models
arXiv 2026
What Matters for Diffusion-Friendly Latent Manifold? Prior-Aligned Autoencoders for Latent Diffusion
arXiv 2026
Continuous-Time Distribution Matching for Few-Step Diffusion Distillation
arXiv 2026
Generative World Renderer
arXiv 2026
A Semantically Consistent Dataset for Data-Efficient Query-Based Universal Sound Separation
arXiv 2026
Perception or Prejudice: Can MLLMs Go Beyond First Impressions of Personality?
arXiv 2026
Complementary Reinforcement Learning
arXiv 2026
One Sample to Rule Them All: Extreme Data Efficiency in RL Scaling
arXiv 2026
WildWorld: A Large-Scale Dataset for Dynamic World Modeling with Actions and Explicit State toward Generative ARPG
arXiv 2026
PackForcing: Short Video Training Suffices for Long Video Sampling and Long Context Inference
arXiv 2026
Qwen3-Omni Technical Report
arXiv 2025
Qwen3 Technical Report
preprint
Qwen3-VL Technical Report
arXiv 2025
Reinforcement Learning Optimization for Large-Scale Learning: An Efficient and User-Friendly Scaling Library
arXiv 2025
HiddenDetect: Detecting Jailbreak Attacks against Large Vision-Language Models via Monitoring Hidden States
arXiv 2025
A Comprehensive Survey on Long Context Language Modeling
arXiv 2025
PAID: A Framework of Product-Centric Advertising Image Design
arXiv 2025
FloodDiffusion: Tailored Diffusion Forcing for Streaming Motion Generation
arXiv 2025
Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem
arXiv 2025
VStyle: A Benchmark for Voice Style Adaptation with Spoken Instructions
arXiv 2025
ChineseEcomQA: A Scalable E-commerce Concept Evaluation Benchmark for Large Language Models
arXiv 2025
DeepPHY: Benchmarking Agentic VLMs on Physical Reasoning
arXiv 2025
Graph2Eval: Automatic Multimodal Task Generation for Agents via Knowledge Graphs
arXiv 2025
Deconstructing Long Chain-of-Thought: A Structured Reasoning Optimization Framework for Long CoT Distillation
arXiv 2025
DMM: Building a Versatile Image Generation Model via Distillation-Based Model Merging
arXiv 2025
Compression with Global Guidance: Towards Training-free High-Resolution MLLMs Acceleration
arXiv 2025
UQABench: Evaluating User Embedding for Prompting LLMs in Personalized Question Answering
arXiv 2025
"See the World, Discover Knowledge": A Chinese Factuality Evaluation for Large Vision Language Models
arXiv 2025
Think-J: Learning to Think for Generative LLM-as-a-Judge
arXiv 2025
AIR: Complex Instruction Generation via Automatic Iterative Refinement
arXiv 2025
ECKGBench: Benchmarking Large Language Models in E-commerce Leveraging Knowledge Graph
arXiv 2025
Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning?
arXiv 2025
USB: A Comprehensive and Unified Safety Evaluation Benchmark for Multimodal Large Language Models
arXiv 2025
ProgCo: Program Helps Self-Correction of Large Language Models
arXiv 2025
Differentiable Solver Search for Fast Diffusion Sampling
arXiv 2025
Qwen2.5 Technical Report
arXiv 2024
Qwen2 Technical Report
arXiv 2024
RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness
CVPR 2025 1
Align Anything: Training All-Modality Models to Follow Instructions with Language Feedback
arXiv 2024
LLaVA-UHD v2: an MLLM Integrating High-Resolution Feature Pyramid via Hierarchical Window Transformer
arXiv 2024
ConvLLaVA: Hierarchical Backbones as Visual Encoder for Large Multimodal Models
arXiv 2024
MTU-Bench: A Multi-granularity Tool-Use Benchmark for Large Language Models
arXiv 2024
Accelerating Image Generation with Sub-path Linear Approximation Model
arXiv 2024
Making Pre-trained Language Models Great on Tabular Prediction
arXiv 2024
LongDocURL: a Comprehensive Multimodal Long Document Benchmark Integrating Understanding, Reasoning, and Locating
arXiv 2024
EMAGE: Towards Unified Holistic Co-Speech Gesture Generation via Expressive Masked Audio Gesture Modeling
CVPR 2024 1
StableMoE: Stable Routing Strategy for Mixture of Experts
ACL 2022 5
Allocating Large Vocabulary Capacity for Cross-lingual Language Model Pre-training
EMNLP 2021 11
Improving Pretrained Cross-Lingual Language Models via Self-Labeled Word Alignment
ACL 2021 5
Frequent co-authors
10 from 50 papers