Dong Chen

Papers: 24

Attribution

Affiliations & profile: Semantic Scholar

Authored papers

Lens: Rethinking Training Efficiency for Foundational Text-to-Image Models

arXiv 2026

2026

InsightTok: Improving Text and Face Fidelity in Discrete Tokenization for Autoregressive Image Generation

arXiv 2026

2026

Gaussian Variation Field Diffusion for High-fidelity Video-to-4D Synthesis

ICCV 2025

2025

SWE-Exp: Experience-Driven Software Issue Resolution

arXiv 2025

2025

Diffusion Models without Classifier-free Guidance

2025

A Simple Aerial Detection Baseline of Multimodal Language Models

arXiv 2025

2025

A Simple Aerial Detection Baseline of Multimodal Language Models

arXiv 2025

2025

Structured 3D Latents for Scalable and Versatile 3D Generation

CVPR 2025

2024

CodeR: Issue Resolving with Multi-Agent and Task Graphs

arXiv 2024

2024

Scaling the Codebook Size of VQGAN to 100,000 with a Utilization Rate of 99%

arXiv 2024

2024

Make-It-3D: High-Fidelity 3D Creation from A Single Image with Diffusion Prior

ICCV 2023

2023

Efficient Diffusion Training via Min-SNR Weighting Strategy

ICCV 2023

2023

IRGen: Generative Modeling for Image Retrieval

arXiv 2023

2023

Paint by Example: Exemplar-based Image Editing with Diffusion Models

CVPR 2023

2022

Pretraining is All You Need for Image-to-Image Translation

arXiv 2022

2022

Improved Vector Quantized Diffusion Models

arXiv 2022

2022

Semantic Image Synthesis via Diffusion Models

arXiv 2022

2022

CLIP Itself is a Strong Fine-tuner: Achieving 85.7% and 88.0% Top-1 Accuracy with ViT-B and ViT-L on ImageNet

arXiv 2022

2022

I^2R-Net: Intra- and Inter-Human Relation Network for Multi-Person Pose Estimation

arXiv 2022

2022

Vector Quantized Diffusion Model for Text-to-Image Synthesis

CVPR 2022

2021

StyleSwin: Transformer-based GAN for High-resolution Image Generation

2021

CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows

2021

Old Photo Restoration via Deep Latent Space Translation

arXiv 2020

2020

Cross-Task Transfer for Geotagged Audiovisual Aerial Scene Recognition

ECCV 2020

2020

Affiliations

No known affiliations.

Frequent co-authors

from 24 papers

Jianmin Bao

Baining Guo

Fang Wen

Bo Zhang

Shuyang Gu

Dongdong Chen

Ting Zhang

Lu Yuan

BoWen Zhang

Fangyun Wei