Haoqin Tu
- Papers: 19
Authored papers (19)
AutoResearchClaw: Self-Reinforcing Autonomous Research with Human-AI Collaboration
arXiv 2026
ClinSeekAgent: Automating Multimodal Evidence Seeking for Agentic Clinical Reasoning
arXiv 2026
MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild
arXiv 2026
From Seeing to Thinking: Decoupling Perception and Reasoning Improves Post-Training of Vision-Language Models
arXiv 2026
Target-Oriented Pretraining Data Selection via Neuron-Activated Graph
arXiv 2026
VLAA-GUI: Knowing When to Stop, Recover, and Search, A Modular Framework for GUI Automation
arXiv 2026
Chasing the Public Score: User Pressure and Evaluation Exploitation in Coding Agent Workflows
arXiv 2026
Your Agent, Their Asset: A Real-World Safety Analysis of OpenClaw
arXiv 2026
OpenVision: A Fully-Open, Cost-Effective Family of Advanced Vision Encoders for Multimodal Learning
ICCV 2025
AHELM: A Holistic Evaluation of Audio-Language Models
arXiv 2025
SpatialThinker: Reinforcing 3D Reasoning in Multimodal LLMs via Spatial Rewards
arXiv 2025
Language Models Can See Better: Visual Contrastive Decoding For LLM Multimodal Reasoning
arXiv 2025
MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?
arXiv 2024
How Far Are We From AGI: Are LLMs All We Need?
arXiv 2024
What If We Recaption Billions of Web Images with LLaMA-3?
arXiv 2024
Autoregressive Pretraining with Mamba in Vision
arXiv 2024
ReSee: Responding through Seeing Fine-grained Visual Knowledge in Open-domain Dialogue
arXiv 2023
How Many Unicorns Are in This Image? A Safety Evaluation Benchmark for Vision LLMs
arXiv 2023
AdaVAE: Exploring Adaptive GPT-2s in Variational Auto-Encoders for Language Modeling
arXiv 2022
Frequent co-authors
10 from 19 papers