Yan Shu

Papers: 10

Attribution

Affiliations & profile: Semantic Scholar

Authored papers

Qwen-Image-VAE-2.0 Technical Report

arXiv 2026

2026

TerraScope: Pixel-Grounded Visual Reasoning for Earth Observation

arXiv 2026

2026

Video-XL-Pro: Reconstructive Token Compression for Extremely Long Video Understanding

arXiv 2025

2025

EarthMind: Towards Multi-Granular and Multi-Sensor Earth Observation with Large Multimodal Models

arXiv 2025

2025

Visual Text Processing: A Comprehensive Review and Unified Evaluation

arXiv 2025

2025

When Semantics Mislead Vision: Mitigating Large Multimodal Models Hallucinations in Scene Text Spotting and Understanding

arXiv 2025

2025

VidText: Towards Comprehensive Evaluation for Video Text Understanding

arXiv 2025

2025

Video-BrowseComp: Benchmarking Agentic Video Research on Open Web

arXiv 2025

2025

MLVU: Benchmarking Multi-task Long Video Understanding

CVPR 2025 1

2024

TextCtrl: Diffusion-based Scene Text Editing with Prior Guidance Control

arXiv 2024

2024

Affiliations

No known affiliations.

Frequent co-authors

from 10 papers

Nicu Sebe

Paolo Rota

Yu Zhou

Zheng Liu

Begum Demir

Bin Ren

Bo Zhao

Gangyan Zeng

Minghao Qin

Weichao Zeng