Shusheng Yang
8 papers
Authored papers
VideoNSA: Native Sparse Attention Scales Video Understanding
arXiv 2025
Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs
arXiv 2024
Thinking in Space: How Multimodal Large Language Models See, Remember, and Recall Spaces
CVPR 2025
Qwen Technical Report
arXiv 2023
Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond
arXiv 2023
ViTMatte: Boosting Image Matting with Pretrained Plain Vision Transformers
arXiv 2023
TouchStone: Evaluating Vision-Language Models by Language Models
arXiv 2023
Unleashing Vanilla Vision Transformer with Masked Image Modeling for Object Detection
ICCV 2023
Affiliations
No known affiliations.
Frequent co-authors
10 from 8 papers