Shih-Fu Chang
- Papers: 9
Authored papers (9)
From Pixels to Insights: A Survey on Automatic Chart Understanding in the Era of Large Foundation Models
arXiv 2024
Ferret: Refer and Ground Anything Anywhere at Any Granularity
arXiv 2023
IdealGPT: Iteratively Decomposing Vision and Language Reasoning via Large Language Models
arXiv 2023
Do LVLMs Understand Charts? Analyzing and Correcting Factual Errors in Chart Captioning
arXiv 2023
Non-Sequential Graph Script Induction via Multimedia Grounding
arXiv 2023
UniFine: A Unified and Fine-grained Approach for Zero-shot Vision-Language Understanding
arXiv 2023
What, when, and where? -- Self-Supervised Spatio-Temporal Grounding in Untrimmed Multi-Action Videos from Narrated Instructions
arXiv 2023
VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text
NeurIPS 2021 12
Multimodal Clustering Networks for Self-supervised Learning from Unlabeled Videos
ICCV 2021 10
Frequent co-authors: 10 (from 9 papers)