Attribution
Florence: A New Foundation Model for Computer Vision
arXiv 2021
RegionCLIP: Region-based Language-Image Pretraining
CVPR 2022
Unified Vision-Language Pre-Training for Image Captioning and VQA
arXiv 2019
from 3 papers
Jianfeng Gao
Chunyuan Li
Houdong Hu
Jianwei Yang
Lu Yuan
Noel Codella
Pengchuan Zhang
Xiyang Dai
Bin Xiao
Boxin Li