Sheng Jin
- Papers: 12
Authored papers (12)
Harmonizing Visual Representations for Unified Multimodal Understanding and Generation
ICCV 2025
JavisGPT: A Unified Multi-modal LLM for Sounding-Video Comprehension and Generation
arXiv 2025
COIG-Writer: A High-Quality Dataset for Chinese Creative Writing with Thought Processes
arXiv 2025
Beyond Correctness: Evaluating Subjective Writing Preferences Across Cultures
arXiv 2025
F-LMM: Grounding Frozen Large Multimodal Models
CVPR 2025 1
FrozenSeg: Harmonizing Frozen Foundation Models for Open-Vocabulary Segmentation
arXiv 2024
AutoMMLab: Automatically Generating Deployable Models from Language Instructions for Computer Vision Tasks
arXiv 2024
Vision-Language Models for Vision Tasks: A Survey
arXiv 2023
CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction
arXiv 2023
Uncertainty-aware Unsupervised Multi-Object Tracking
ICCV 2023 1
CLIM: Contrastive Language-Image Mosaic for Region Representation
arXiv 2023
You Only Learn One Query: Learning Unified Human Query for Single-Stage Multi-Person Multi-Task Human-Centric Perception
arXiv 2023
Frequent co-authors
10 from 12 papers