Chengzhuo Tong
- Papers: 9
Authored papers (9)
Artifact-Bench: Evaluating MLLMs on Detecting and Assessing the Artifacts of AI-Generated Videos
arXiv 2026
LatentOmni: Rethinking Omni-Modal Understanding via Unified Audio-Visual Latent Reasoning
arXiv 2026
OpenWorldLib: A Unified Codebase and Definition of Advanced World Models
arXiv 2026
CoF-T2I: Video Models as Pure Visual Reasoners for Text-to-Image Generation
arXiv 2026
Scone: Bridging Composition and Distinction in Subject-Driven Image Generation via Unified Understanding-Generation Modeling
arXiv 2025
RealUnify: Do Unified Models Truly Benefit from Unification? A Comprehensive Benchmark
arXiv 2025
Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step
arXiv 2025
SAM2Point: Segment Any 3D as Videos in Zero-shot and Promptable Manners
arXiv 2024
MAVIS: Mathematical Visual Instruction Tuning with an Automatic Data Engine
arXiv 2024
Affiliations
Frequent co-authors
10 from 9 papers