Jiasen Lu
- Papers: 8
Authored papers
GIE-Bench: Towards Grounded Evaluation for Text-Guided Image Editing
arXiv 2025
Pico-Banana-400K: A Large-Scale Dataset for Text-Guided Image Editing
arXiv 2025
Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Vision-Language Models
CVPR 2025
One Diffusion to Generate Them All
CVPR 2025
The Semantic Hub Hypothesis: Language Models Share Semantic Representations Across Languages and Modalities
arXiv 2024
Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision, Language, Audio, and Action
arXiv 2023
12-in-1: Multi-Task Vision and Language Representation Learning
Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning
Affiliations
Frequent co-authors
10 from 8 papers