Yuan Zhang
- Papers: 18
Authored papers (18)
Xiaomi-Robotics-0: An Open-Sourced Vision-Language-Action Model with Real-Time Execution
arXiv 2026
General365: Benchmarking General Reasoning in Large Language Models Across Diverse and Challenging Tasks
arXiv 2026
Awaking Spatial Intelligence in Unified Multimodal Understanding and Generation
arXiv 2026
MoVE-KD: Knowledge Distillation for VLMs with Mixture of Visual Encoders
CVPR 2025
TimeSearch-R: Adaptive Temporal Search for Long-Form Video Understanding via Self-Verification Reinforcement Learning
arXiv 2025
Bridging Your Imagination with Audio-Video Generation via a Unified Director
arXiv 2025
LivePortrait: Efficient Portrait Animation with Stitching and Retargeting Control
arXiv 2024
SparseVLM: Visual Token Sparsification for Efficient Vision-Language Model Inference
arXiv 2024
Benchmarking and Improving Detail Image Caption
arXiv 2024
Unveiling the Tapestry of Consistency in Large Vision-Language Models
arXiv 2024
PlacidDreamer: Advancing Harmony in Text-to-3D Generation
arXiv 2024
Focus, Distinguish, and Prompt: Unleashing CLIP for Efficient and Flexible Scene Text Retrieval
arXiv 2024
Dynamic Snake Convolution based on Topological Geometric Constraints for Tubular Structure Segmentation
ICCV 2023
DVIS: Decoupled Video Instance Segmentation Framework
ICCV 2023
DVIS++: Improved Decoupled Framework for Universal Video Segmentation
arXiv 2023
Collaboration and Transition: Distilling Item Transitions into Multi-Query Self-Attention for Sequential Recommendation
arXiv 2023
All You Need is a Few Shifts: Designing Efficient Convolutional Neural Networks for Image Classification
PAWS-X: A Cross-lingual Adversarial Dataset for Paraphrase Identification