Can Huang
- Papers: 15
Authored papers (15)
GLM-5: from Vibe Coding to Agentic Engineering
arXiv 2026
TextPecker: Rewarding Structural Anomaly Quantification for Enhancing Visual Text Rendering
arXiv 2026
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models
arXiv 2025
Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting
arXiv 2025
Seed1.5-VL Technical Report
arXiv 2025
Vision as LoRA
arXiv 2025
Metasql: A Generate-then-Rank Framework for Natural Language to SQL Translation
arXiv 2024
Elysium: Exploring Object-level Perception in Videos via MLLM
arXiv 2024
MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering
arXiv 2024
TabPedia: Towards Comprehensive Visual Table Understanding with Concept Synergy
arXiv 2024
Dynamic-VLM: Simple Dynamic Visual Token Compression for VideoLLM
arXiv 2024
ParGo: Bridging Vision-Language with Partial and Global Views
arXiv 2024
OCRBench v2: An Improved Benchmark for Evaluating Large Multimodal Models on Visual Text Localization and Reasoning
arXiv 2024
Harmonizing Visual Text Comprehension and Generation
arXiv 2024
ESTextSpotter: Towards Better Scene Text Spotting with Explicit Synergy in Transformer
ICCV 2023 1
Affiliations
Frequent co-authors
10 from 15 papers