Bo Zheng

Papers
50

Authored papers

50

FashionChameleon: Towards Real-Time and Interactive Human-Garment Video Customization

arXiv 2026

2026

The Flexibility Trap: Why Arbitrary Order Limits Reasoning Potential in Diffusion Language Models

arXiv 2026

2026

What Matters for Diffusion-Friendly Latent Manifold? Prior-Aligned Autoencoders for Latent Diffusion

arXiv 2026

2026

Continuous-Time Distribution Matching for Few-Step Diffusion Distillation

arXiv 2026

2026

Generative World Renderer

arXiv 2026

2026

A Semantically Consistent Dataset for Data-Efficient Query-Based Universal Sound Separation

arXiv 2026

2026

Perception or Prejudice: Can MLLMs Go Beyond First Impressions of Personality?

arXiv 2026

2026

Complementary Reinforcement Learning

arXiv 2026

2026

One Sample to Rule Them All: Extreme Data Efficiency in RL Scaling

arXiv 2026

2026

WildWorld: A Large-Scale Dataset for Dynamic World Modeling with Actions and Explicit State toward Generative ARPG

arXiv 2026

2026

PackForcing: Short Video Training Suffices for Long Video Sampling and Long Context Inference

arXiv 2026

2026

Qwen3-Omni Technical Report

arXiv 2025

2025

Qwen3 Technical Report

preprint

2025

Qwen3-VL Technical Report

arXiv 2025

2025

Reinforcement Learning Optimization for Large-Scale Learning: An Efficient and User-Friendly Scaling Library

arXiv 2025

2025

HiddenDetect: Detecting Jailbreak Attacks against Large Vision-Language Models via Monitoring Hidden States

arXiv 2025

2025

A Comprehensive Survey on Long Context Language Modeling

arXiv 2025

2025

PAID: A Framework of Product-Centric Advertising Image Design

arXiv 2025

2025

FloodDiffusion: Tailored Diffusion Forcing for Streaming Motion Generation

arXiv 2025

2025

Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem

arXiv 2025

2025

VStyle: A Benchmark for Voice Style Adaptation with Spoken Instructions

arXiv 2025

2025

ChineseEcomQA: A Scalable E-commerce Concept Evaluation Benchmark for Large Language Models

arXiv 2025

2025

DeepPHY: Benchmarking Agentic VLMs on Physical Reasoning

arXiv 2025

2025

Graph2Eval: Automatic Multimodal Task Generation for Agents via Knowledge Graphs

arXiv 2025

2025

Deconstructing Long Chain-of-Thought: A Structured Reasoning Optimization Framework for Long CoT Distillation

arXiv 2025

2025

DMM: Building a Versatile Image Generation Model via Distillation-Based Model Merging

arXiv 2025

2025

Compression with Global Guidance: Towards Training-free High-Resolution MLLMs Acceleration

arXiv 2025

2025

UQABench: Evaluating User Embedding for Prompting LLMs in Personalized Question Answering

arXiv 2025

2025

"See the World, Discover Knowledge": A Chinese Factuality Evaluation for Large Vision Language Models

arXiv 2025

2025

Think-J: Learning to Think for Generative LLM-as-a-Judge

arXiv 2025

2025

AIR: Complex Instruction Generation via Automatic Iterative Refinement

arXiv 2025

2025

ECKGBench: Benchmarking Large Language Models in E-commerce Leveraging Knowledge Graph

arXiv 2025

2025

Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning?

arXiv 2025

2025

USB: A Comprehensive and Unified Safety Evaluation Benchmark for Multimodal Large Language Models

arXiv 2025

2025

ProgCo: Program Helps Self-Correction of Large Language Models

arXiv 2025

2025

Differentiable Solver Search for Fast Diffusion Sampling

arXiv 2025

2025

Qwen2.5 Technical Report

arXiv 2024

2024

Qwen2 Technical Report

arXiv 2024

2024

RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness

CVPR 2025 1

2024

Align Anything: Training All-Modality Models to Follow Instructions with Language Feedback

arXiv 2024

2024

LLaVA-UHD v2: an MLLM Integrating High-Resolution Feature Pyramid via Hierarchical Window Transformer

arXiv 2024

2024

ConvLLaVA: Hierarchical Backbones as Visual Encoder for Large Multimodal Models

arXiv 2024

2024

MTU-Bench: A Multi-granularity Tool-Use Benchmark for Large Language Models

arXiv 2024

2024

Accelerating Image Generation with Sub-path Linear Approximation Model

arXiv 2024

2024

Making Pre-trained Language Models Great on Tabular Prediction

arXiv 2024

2024

LongDocURL: a Comprehensive Multimodal Long Document Benchmark Integrating Understanding, Reasoning, and Locating

arXiv 2024

2024

EMAGE: Towards Unified Holistic Co-Speech Gesture Generation via Expressive Masked Audio Gesture Modeling

CVPR 2024 1

2023

StableMoE: Stable Routing Strategy for Mixture of Experts

ACL 2022 5

2022

Allocating Large Vocabulary Capacity for Cross-lingual Language Model Pre-training

EMNLP 2021 11

2021

Improving Pretrained Cross-Lingual Language Models via Self-Labeled Word Alignment

ACL 2021 5

2021

Affiliations

No known affiliations.

Frequent co-authors

10

from 50 papers