Jiacong Wang
- Papers: 10
Authored papers (10)
SAMTok: Representing Any Mask with Two Words
arXiv 2026
The Scalability of Simplicity: Empirical Analysis of Vision-Language Learning with a Single Transformer
ICCV 2025
Traceable Evidence Enhanced Visual Grounded Reasoning: Evaluation and Methodology
arXiv 2025
UVE: Are MLLMs Unified Evaluators for AI-Generated Videos?
arXiv 2025
SAIL-VL2 Technical Report
arXiv 2025
Grasp Any Region: Towards Precise, Contextual Pixel Understanding for Multimodal LLMs
arXiv 2025
Benchmarking and Improving Detail Image Caption
arXiv 2024
Seeing the Image: Prioritizing Visual Correlation by Contrastive Alignment
arXiv 2024
Unveiling the Tapestry of Consistency in Large Vision-Language Models
arXiv 2024
World to Code: Multi-modal Data Generation via Self-Instructed Compositional Captioning and Filtering
arXiv 2024