Shilin Xu
- Papers: 9
Authored papers
Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos
arXiv 2025
An Empirical Study of GPT-4o Image Generation Capabilities
arXiv 2025
On Path to Multimodal Generalist: General-Level and General-Bench
arXiv 2025
Mixed-R1: Unified Reward Perspective For Reasoning Capability in Multimodal Large Language Models
arXiv 2025
DenseWorld-1M: Towards Detailed Dense Grounded Caption in the Real World
arXiv 2025
Are They the Same? Exploring Visual Correspondence Shortcomings of Multimodal LLMs
ICCV 2025
An Open and Comprehensive Pipeline for Unified Object Grounding and Detection
arXiv 2024
RMP-SAM: Towards Real-Time Multi-Purpose Segment Anything
arXiv 2024
Fashionformer: A Simple, Effective and Unified Baseline for Human Fashion Segmentation and Recognition
arXiv 2022
Frequent co-authors
10 from 9 papers