Bohan Zeng
Authored papers (27)
DataFlex: A Unified Framework for Data-Centric Dynamic Training of Large Language Models
arXiv 2026
Artifact-Bench: Evaluating MLLMs on Detecting and Assessing the Artifacts of AI-Generated Videos
arXiv 2026
LatentOmni: Rethinking Omni-Modal Understanding via Unified Audio-Visual Latent Reasoning
arXiv 2026
LongAV-Compass: Towards Unified Evaluation of Minute-Scale Audio-Visual Generation Across T2AV, I2AV, and V2AV
arXiv 2026
OpenWorldLib: A Unified Codebase and Definition of Advanced World Models
arXiv 2026
CoF-T2I: Video Models as Pure Visual Reasoners for Text-to-Image Generation
arXiv 2026
Semantic Routing: Exploring Multi-Layer LLM Feature Weighting for Diffusion Transformers
arXiv 2026
VABench: A Comprehensive Benchmark for Audio-Video Generation
arXiv 2025
Any2AnyTryon: Leveraging Adaptive Position Embeddings for Versatile Virtual Clothing Tasks
ICCV 2025
DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI
arXiv 2025
WideRange4D: Enabling High-Quality 4D Reconstruction with Wide-Range Movements and Scenes
arXiv 2025
Multi-Step Visual Reasoning with Visual Tokens Scaling and Verification
arXiv 2025
MME-VideoOCR: Evaluating OCR-Based Capabilities of Multimodal LLMs in Video Scenarios
arXiv 2025
Mavors: Multi-granularity Video Representation for Multimodal Large Language Model
arXiv 2025
SVG-T2I: Scaling Up Text-to-Image Latent Diffusion Model Without Variational Autoencoder
arXiv 2025
Scone: Bridging Composition and Distinction in Subject-Driven Image Generation via Unified Understanding-Generation Modeling
arXiv 2025
Are We Ready for RL in Text-to-3D Generation? A Progressive Investigation
arXiv 2025
MorphoBench: A Benchmark with Difficulty Adaptive to Model Reasoning
arXiv 2025
Multimodal Reasoning for Science: Technical Report and 1st Place Solution to the ICML 2025 SeePhys Challenge
arXiv 2025
RealUnify: Do Unified Models Truly Benefit from Unification? A Comprehensive Benchmark
arXiv 2025
Let's Verify Math Questions Step by Step
arXiv 2025
EditWorld: Simulating World Dynamics for Instruction-Following Image Editing
arXiv 2024
Trans4D: Realistic Geometry-Aware Transition for Compositional Text-to-4D Synthesis
arXiv 2024
Semantic Score Distillation Sampling for Compositional Text-to-3D Generation
arXiv 2024
Controllable Mind Visual Diffusion Model
arXiv 2023
ZONE: Zero-Shot Instruction-Guided Local Editing
CVPR 2024
FNeVR: Neural Volume Rendering for Face Animation
arXiv 2022
Frequent co-authors
10 from 27 papers