Tiancheng Zhao
- Papers: 14
Authored papers (14)
MM-CondChain: A Programmatically Verified Benchmark for Visually Grounded Deep Compositional Reasoning
arXiv 2026
VLM-R1: A Stable and Generalizable R1-style Large Vision-Language Model
arXiv 2025
OmAgent: A Multi-modal Agent Framework for Complex Video Understanding with Task Divide-and-Conquer
arXiv 2024
Real-time Transformer-based Open-Vocabulary Detection with Efficient Fusion Head
arXiv 2024
ImageRAG: Enhancing Ultra High Resolution Remote Sensing Imagery Analysis with ImageRAG
arXiv 2024
ZoomEye: Enhancing Multimodal LLMs with Human-Like Zooming Capabilities through Tree-Based Image Exploration
arXiv 2024
QDA-SQL: Questions Enhanced Dialogue Augmentation for Multi-Turn Text-to-SQL
arXiv 2024
Evaluating and Enhancing LLMs for Multi-turn Text-to-SQL with Multiple Question Types
arXiv 2024
GUI Testing Arena: A Unified Benchmark for Advancing Autonomous GUI Testing Agent
arXiv 2024
How to Evaluate the Generalization of Detection? A Benchmark for Comprehensive Open-Vocabulary Detection
arXiv 2023
RS5M and GeoRSCLIP: A Large Scale Vision-Language Dataset and A Large Vision-Language Model for Remote Sensing
arXiv 2023
GroundVLP: Harnessing Zero-shot Visual Grounding from Vision-Language Pre-training and Open-Vocabulary Object Detection
arXiv 2023
Benchmarking Sequential Visual Input Reasoning and Prediction in Multimodal Large Language Models
arXiv 2023
VL-CheckList: Evaluating Pre-trained Vision-Language Models with Objects, Attributes and Relations
arXiv 2022
Affiliations
Frequent co-authors
10 from 14 papers