Zhi-Qi Cheng
- Papers: 18
Authored papers
Agentic World Modeling: Foundations, Capabilities, Laws, and Beyond
arXiv 2026
Overcoming Joint Intractability with Lossless Hierarchical Speculative Decoding
arXiv 2026
HCMA: Hierarchical Cross-model Alignment for Grounded Text-to-Image Generation
arXiv 2025
Why We Feel: Breaking Boundaries in Emotional Reasoning with Multimodal Large Language Models
arXiv 2025
MaxSup: Overcoming Representation Collapse in Label Smoothing
arXiv 2025
A Video-grounded Dialogue Dataset and Metric for Event-driven Activities
arXiv 2025
StableAnimator: High-Quality Identity-Preserving Human Image Animation
CVPR 2025
Emotion-LLaMA: Multimodal Emotion Recognition and Reasoning with Instruction Tuning
arXiv 2024
UMETTS: A Unified Framework for Emotional Text-to-Speech Synthesis with Multimodal Prompts
arXiv 2024
FlexEdit: Marrying Free-Shape Masks to VLLM for Flexible Image Editing
arXiv 2024
MIPS at SemEval-2024 Task 3: Multimodal Emotion-Cause Pair Extraction in Conversations with Multimodal Language Models
arXiv 2024
ProMQA: Question Answering Dataset for Multimodal Procedural Activity Understanding
arXiv 2024
Towards Calibrated Robust Fine-Tuning of Vision-Language Models
arXiv 2023
PoSynDA: Multi-Hypothesis Pose Synthesis Domain Adaptation for Robust 3D Human Pose Estimation
arXiv 2023
MotionEditor: Editing Video Motion via Content-Aware Diffusion
CVPR 2024
Implicit Temporal Modeling with Learnable Alignment for Video Recognition
ICCV 2023
ChartReader: A Unified Framework for Chart Derendering and Comprehension without Heuristic Rules
ICCV 2023
DAMO-StreamNet: Optimizing Streaming Perception in Autonomous Driving
arXiv 2023
Frequent co-authors
10 from 18 papers