Xiaoshuai Sun
- Papers: 15
Authored papers (15)
RePrompt: Reasoning-Augmented Reprompting for Text-to-Image Generation via Reinforcement Learning
arXiv 2025
Zooming In on Fakes: A Novel Dataset for Localized AI-Generated Image Detection with Forgery Amplification Approach
arXiv 2025
Grounded Chain-of-Thought for Multimodal Large Language Models
arXiv 2025
ComfyGPT: A Self-Optimizing Multi-Agent System for Comprehensive ComfyUI Workflow Generation
arXiv 2025
Accelerating Multimodal Large Language Models via Dynamic Visual-Token Exit and the Empirical Findings
arXiv 2024
Feast Your Eyes: Mixture-of-Resolution Adaptation for Multimodal Large Language Models
arXiv 2024
ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models
arXiv 2024
FlashSloth: Lightning Multimodal Large Language Models via Embedded Visual Compression
arXiv 2024
TraDiffusion: Trajectory-Based Training-Free Image Generation
arXiv 2024
INF-LLaVA: Dual-perspective Perception for High-Resolution Multimodal Large Language Model
arXiv 2024
Multi-branch Collaborative Learning Network for 3D Visual Grounding
arXiv 2024
Towards Efficient Diffusion-Based Image Editing with Instant Attention Masks
arXiv 2024
JM3D & JM3D-LLM: Elevating 3D Understanding with Joint Multi-modal Cues
arXiv 2023
X-Dreamer: Creating High-quality 3D Content by Bridging the Domain Gap Between Text-to-2D and Text-to-3D Generation
arXiv 2023
X-Mesh: Towards Fast and Accurate Text-driven 3D Stylization via Dynamic Textual Guidance
ICCV 2023
Frequent co-authors
10 from 15 papers