Hang Zhang
- Papers: 25
Authored papers
Sat3DGen: Comprehensive Street-Level 3D Scene Generation from Single Satellite Image
arXiv 2026
Qwen-Image Technical Report
arXiv 2025
Qwen2.5-VL Technical Report
arXiv 2025
Qwen3-VL Technical Report
arXiv 2025
VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding
arXiv 2025
The Hitchhiker's Guide to Program Analysis, Part II: Deep Thoughts by LLMs
arXiv 2025
SVGenius: Benchmarking LLMs in SVG Understanding, Editing and Generation
arXiv 2025
MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources
arXiv 2025
Embodied-Reasoner: Synergizing Visual Search, Reasoning, and Action for Embodied Interactive Tasks
arXiv 2025
Think Twice, Click Once: Enhancing GUI Grounding via Fast and Slow Systems
arXiv 2025
VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs
arXiv 2024
Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss
arXiv 2024
Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective
arXiv 2024
VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM
CVPR 2025
SVIPTR: Fast and Efficient Scene Text Recognition with Vision Permutable Extractor
arXiv 2024
The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio
arXiv 2024
Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
arXiv 2023
SeaLLMs -- Large Language Models for Southeast Asia
arXiv 2023
AnnoLLM: Making Large Language Models to Be Better Crowdsourced Annotators
arXiv 2023
Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP
CVPR 2023
APOLLO: An Optimized Training Approach for Long-form Numerical Reasoning
arXiv 2022
Adversarial Retriever-Ranker for dense text retrieval
ResNeSt: Split-Attention Networks
arXiv 2020
AutoGluon-Tabular: Robust and Accurate AutoML for Structured Data
arXiv 2020
Bag of Freebies for Training Object Detection Neural Networks
arXiv 2019
Frequent co-authors
10 from 25 papers