Ran Xu
- Papers: 25
Authored papers: 25
AutoResearch AI: Towards AI-Powered Research Automation for Scientific Discovery
arXiv 2026
Future Optical Flow Prediction Improves Robot Control & Video Generation
arXiv 2026
VLAA-GUI: Knowing When to Stop, Recover, and Search, A Modular Framework for GUI Automation
arXiv 2026
BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset
arXiv 2025
MedAgentGym: Training LLM Agents for Code-Based Medical Reasoning at Scale
arXiv 2025
VLM2Vec-V2: Advancing Multimodal Embedding for Videos, Images, and Visual Documents
arXiv 2025
Collab-RAG: Boosting Retrieval-Augmented Generation for Complex Question Answering via White-Box and Black-Box LLM Collaboration
arXiv 2025
Scaling Agentic Reinforcement Learning for Tool-Integrated Reasoning in VLMs
arXiv 2025
GTA1: GUI Test-time Scaling Agent
arXiv 2025
UNIDOC-BENCH: A Unified Benchmark for Document-Centric Multimodal RAG
arXiv 2025
AceSearcher: Bootstrapping Reasoning and Search for LLMs via Reinforced Self-Play
arXiv 2025
CoDA: Coding LM via Diffusion Adaptation
arXiv 2025
TrustLLM: Trustworthiness in Large Language Models
arXiv 2024
ProVision: Programmatically Scaling Vision-centric Instruction Data for Multimodal Language Models
arXiv 2024
FOFO: A Benchmark to Evaluate LLMs' Format-Following Capability
arXiv 2024
BMRetriever: Tuning Large Language Models as Better Biomedical Text Retrievers
arXiv 2024
MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens
arXiv 2024
SQ-LLaVA: Self-Questioning for Large Vision-Language Assistant
arXiv 2024
ULIP-2: Towards Scalable Multimodal Pre-training for 3D Understanding
CVPR 2024
UniControl: A Unified Diffusion Model for Controllable Visual Generation In the Wild
Knowledge-Infused Prompting: Assessing and Advancing Clinical Text Data Generation with Large Language Models
arXiv 2023
Retroformer: Retrospective Large Language Agents with Policy Gradient Optimization
arXiv 2023
BOLAA: Benchmarking and Orchestrating LLM-augmented Autonomous Agents
arXiv 2023
LayoutDETR: Detection Transformer Is a Good Multimodal Layout Designer
arXiv 2022
ApproxDet: Content and Contention-Aware Approximate Object Detection for Mobiles
arXiv 2020
Frequent co-authors
10 from 25 papers