Bin Fu
Papers: 26
Authored papers
Project Imaging-X: A Survey of 1000+ Open-Access Medical Imaging Datasets for Foundation Model Development
arXiv 2026
InternVL-U: Democratizing Unified Multimodal Models for Understanding, Reasoning, Generation and Editing
arXiv 2026
Lumina-DiMOO: An Omni Diffusion Large Language Model for Multi-Modal Generation and Understanding
arXiv 2025
A Survey of Scientific Large Language Models: From Data Foundations to Agent Frontiers
arXiv 2025
Lumina-Image 2.0: A Unified and Efficient Image Generative Framework
ICCV 2025
UniPercept: Towards Unified Perceptual-Level Image Understanding across Aesthetics, Quality, Structure, and Texture
arXiv 2025
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency
arXiv 2025
UniMedVL: Unifying Medical Multimodal Understanding and Generation through Observation-Knowledge-Analysis
arXiv 2025
LeX-Art: Rethinking Text Generation via Scalable High-Quality Data Synthesis
arXiv 2025
PICABench: How Far Are We from Physically Realistic Image Editing?
arXiv 2025
GMAI-VL-R1: Harnessing Reinforcement Learning for Multimodal Medical Reasoning
arXiv 2025
MVPaint: Synchronized Multi-View Diffusion for Painting Anything 3D
CVPR 2025
GMAI-VL & GMAI-VL-5.5M: A Large Vision-Language Model and A Comprehensive Multimodal Dataset Towards General Medical AI
arXiv 2024
GMAI-MMBench: A Comprehensive Multimodal Evaluation Benchmark Towards General Medical AI
arXiv 2024
EMMA: Your Text-to-Image Diffusion Model Can Secretly Accept Multi-Modal Prompts
arXiv 2024
ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment
arXiv 2024
Parameter-Efficient Fine-Tuning for Pre-Trained Vision Models: A Survey and Benchmark
arXiv 2024
MeshXL: Neural Coordinate Field for Generative 3D Foundation Models
arXiv 2024
Interactive Medical Image Segmentation: A Benchmark Dataset and Baseline
CVPR 2025
AppAgent: Multimodal Agents as Smartphone Users
arXiv 2023
Paint3D: Paint Anything 3D with Lighting-Less Texture Diffusion Models
CVPR 2024
Michelangelo: Conditional 3D Shape Generation based on Shape-Image-Text Aligned Latent Representation
NeurIPS 2023
SAM-Med3D: Towards General-purpose Segmentation Models for Volumetric Medical Images
arXiv 2023
FaceStudio: Put Your Face Everywhere in Seconds
arXiv 2023
StableLLaVA: Enhanced Visual Instruction Tuning with Synthesized Image-Dialogue Data
arXiv 2023
StyleAvatar3D: Leveraging Image-Text Diffusion Models for High-Fidelity 3D Avatar Generation
arXiv 2023
Frequent co-authors
10 from 26 papers