Shufan Li
- Papers: 13
Authored papers (13)
LaViDa: A Large Diffusion Language Model for Multimodal Understanding
arXiv 2025
Reflect-DiT: Inference-Time Scaling for Text-to-Image Diffusion Transformers via In-Context Reflection
MobileWorldBench: Towards Semantic World Modeling For Mobile Agents
arXiv 2025
From Masks to Worlds: A Hitchhiker's Guide to World Models
arXiv 2025
OmniFlow: Any-to-Any Generation with Multi-Modal Rectified Flows
CVPR 2025
Aligning Diffusion Models by Optimizing Human Utility
arXiv 2024
xT: Nested Tokenization for Larger Context in Large Images
arXiv 2024
Mamba-ND: Selective State Space Modeling for Multi-Dimensional Data
arXiv 2024
MedMax: Mixed-Modal Instruction Tuning for Training Biomedical Assistants
arXiv 2024
Hierarchical Open-vocabulary Universal Image Segmentation
InstructAny2Pix: Flexible Visual Editing via Multimodal Instruction Following
arXiv 2023
Scale-MAE: A Scale-Aware Masked Autoencoder for Multiscale Geospatial Representation Learning
ICCV 2023
Refine and Represent: Region-to-Object Representation Learning
arXiv 2022
Frequent co-authors
10 from 13 papers
Aditya Grover
Kazuki Kozuka
Konstantinos Kallidromitis
Yusuke Kato
Akash Gokul
Trevor Darrell
professor
Colorado J. Reed
Harkanwar Singh
Hritik Bansal
grad-student
Ritwik Gupta