Lu Yuan

Papers
26

Attribution

Affiliations & profile
Semantic Scholar

Authored papers

26

Efficient Modulation for Vision Networks

arXiv 2024

2024

OmniVid: A Generative Framework for Universal Video Understanding

CVPR 2024 1

2024

Designing a Better Asymmetric VQGAN for StableDiffusion

arXiv 2023

2023

Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks

CVPR 2024 1

2023

Video-Bench: A Comprehensive Benchmark and Toolkit for Evaluating Video-based Large Language Models

arXiv 2023

2023

iFusion: Inverting Diffusion for Pose-Free Reconstruction from Sparse Views

arXiv 2023

2023

Fully Authentic Visual Question Answering Dataset from Online Communities

arXiv 2023

2023

TinyCLIP: CLIP Distillation via Affinity Mimicking and Weight Inheritance

ICCV 2023 1

2023

Improving Adversarial Robustness of Masked Autoencoders via Test-time Frequency-domain Prompting

ICCV 2023 1

2023

Focal Modulation Networks

arXiv 2022

2022

Reduce Information Loss in Transformers for Pluralistic Image Inpainting

CVPR 2022 1

2022

Generalized Decoding for Pixel, Image, and Language

CVPR 2023 1

2022

DaViT: Dual Attention Vision Transformers

arXiv 2022

2022

Semantic Image Synthesis via Diffusion Models

arXiv 2022

2022

CLIP Itself is a Strong Fine-tuner: Achieving 85.7% and 88.0% Top-1 Accuracy with ViT-B and ViT-L on ImageNet

arXiv 2022

2022

GLIPv2: Unifying Localization and Vision-Language Understanding

arXiv 2022

2022

RegionCLIP: Region-based Language-Image Pretraining

CVPR 2022 1

2021

CvT: Introducing Convolutions to Vision Transformers

ICCV 2021 10

2021

Vector Quantized Diffusion Model for Text-to-Image Synthesis

CVPR 2022 1

2021

Lite-HRNet: A Lightweight High-Resolution Network

CVPR 2021 1

2021

Dynamic Head: Unifying Object Detection Heads with Attentions

CVPR 2021 1

2021

HairCLIP: Design Your Hair by Text and Reference Image

CVPR 2022 1

2021

CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows

CVPR 2022

2021

Florence: A New Foundation Model for Computer Vision

arXiv 2021

2021

Deep Exemplar-based Colorization

arXiv 2018

2018

Visual Attribute Transfer through Deep Image Analogy

arXiv 2017

2017

Affiliations

No known affiliations.

Frequent co-authors

10

from 26 papers