Yongdong Zhang

Papers
20

Authored papers

20

Lance: Unified Multimodal Modeling by Multi-Task Synergy

arXiv 2026

2026

Stream-R1: Reliability-Perplexity Aware Reward Distillation for Streaming Video Generation

arXiv 2026

2026

FS-Researcher: Test-Time Scaling for Long-Horizon Research Tasks with File-System-Based Agents

arXiv 2026

2026

Wiki Live Challenge: Challenging Deep Research Agents with Expert-Level Wikipedia Articles

arXiv 2026

2026

Mask$^2$DiT: Dual Mask-based Diffusion Transformer for Multi-Scene Long Video Generation

arXiv 2025

2025

MotionGS: Exploring Explicit Motion Guidance for Deformable 3D Gaussian Splatting

arXiv 2024

2024

ScaleDepth: Decomposing Metric Depth Estimation into Scale Prediction and Relative Depth Estimation

arXiv 2024

2024

DEADiff: An Efficient Stylization Diffusion Model with Disentangled Representations

CVPR 2024

2024

RealCustom++: Representing Images as Real-Word for Real-Time Customization

arXiv 2024

2024

RealCustom: Narrowing Real Text Word for Real-Time Open-Domain Text-to-Image Customization

CVPR 2024

2024

Air-Decoding: Attribute Distribution Reconstruction for Decoding-Time Controllable Text Generation

arXiv 2023

2023

ExpertPrompting: Instructing Large Language Models to be Distinguished Experts

arXiv 2023

2023

Towards Accurate Image Coding: Improved Autoregressive Image Generation with Dynamic Vector Quantization

2023

ChiMed-GPT: A Chinese Medical Large Language Model with Full Training Regime and Better Alignment to Human Preferences

arXiv 2023

2023

Improving Chinese Spelling Check by Character Pronunciation Prediction: The Effects of Adaptivity and Granularity

arXiv 2022

2022

ABINet++: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Spotting

arXiv 2022

2022

Bridging the Gap Between Vision Transformers and Convolutional Neural Networks on Small Datasets

arXiv 2022

2022

Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition

CVPR 2021

2021

LightGCN: Simplifying and Powering Graph Convolution Network for Recommendation

arXiv 2020

2020

CatGCN: Graph Convolutional Networks with Categorical Node Features

arXiv 2020

2020

Affiliations

No known affiliations.

Frequent co-authors

10

from 20 papers