Jiannan Wu
7 papers
Authored papers
InSight-o3: Empowering Multimodal Foundation Models with Generalized Visual Search
arXiv 2025
The Synergy Dilemma of Long-CoT SFT and RL: Investigating Post-Training Techniques for Reasoning VLMs
arXiv 2025
VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks
arXiv 2024
Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models
arXiv 2024
VisionLLM: Large Language Model is also an Open-Ended Decoder for Vision-Centric Tasks
NeurIPS 2023 11
UniRef++: Segment Every Reference Object in Spatial and Temporal Spaces
arXiv 2023
Language as Queries for Referring Video Object Segmentation
CVPR 2022 1
Affiliations
No known affiliations.
Frequent co-authors
10 from 7 papers