Yanghao Li
- Papers: 14
Authored papers (14)
Imagination Helps Visual Reasoning, But Not Yet in Latent Space
arXiv 2026
Cheers: Decoupling Patch Details from Semantic Representations Enables Unified Multimodal Comprehension and Generation
arXiv 2026
MiniCPM4: Ultra-Efficient LLMs on End Devices
arXiv 2025
Improve Vision Language Model Chain-of-thought Reasoning
arXiv 2024
Idempotence and Perceptual Image Compression
arXiv 2024
Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles
arXiv 2023
R-MAE: Regions Meet Masked Autoencoders
arXiv 2023
Efficient Semantic Segmentation by Altering Resolutions for Compressed Videos
CVPR 2023
Exploring Plain Vision Transformer Backbones for Object Detection
arXiv 2022
Scaling Language-Image Pre-training via Masking
CVPR 2023
Masked Autoencoders As Spatiotemporal Learners
arXiv 2022
MViTv2: Improved Multiscale Vision Transformers for Classification and Detection
CVPR 2022
Masked Autoencoders Are Scalable Vision Learners
CVPR 2022
Ego4D: Around the World in 3,000 Hours of Egocentric Video
CVPR 2022
Frequent co-authors
10 from 14 papers