Shijie Wang
- Papers: 15
Authored papers (15)
Mixture-of-Depths Attention
arXiv 2026
Qwen2.5-VL Technical Report
arXiv 2025
A Survey of Reinforcement Learning for Large Reasoning Models
arXiv 2025
HAIBU-ReMUD: Reasoning Multimodal Ultrasound Dataset and Model Bridging to General Specific Domains
arXiv 2025
How Can Objects Help Video-Language Understanding?
ICCV 2025
Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution
arXiv 2024
Qwen2 Technical Report
arXiv 2024
Do Pre-trained Vision-Language Models Encode Object States?
arXiv 2024
Qwen Technical Report
arXiv 2023
Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond
arXiv 2023
ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities
arXiv 2023
AntGPT: Can Large Language Models Help Long-term Action Anticipation from Videos?
arXiv 2023
Vamos: Versatile Action Models for Video Understanding
arXiv 2023
Unleashing Vanilla Vision Transformer with Masked Image Modeling for Object Detection
ICCV 2023
Pose Recognition with Cascade Transformers
CVPR 2021
Frequent co-authors
10 from 15 papers