Yao Zhao
- Papers: 28
Authored papers
VideoWorld 2: Learning Transferable Knowledge from Real-world Videos
arXiv 2026
StructDiff: A Structure-Preserving and Spatially Controllable Diffusion Model for Single-Image Generation
arXiv 2026
Let ViT Speak: Generative Language-Image Pre-training
arXiv 2026
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
preprint
EvEnhancer: Empowering Effectiveness, Efficiency and Generalizability for Continuous Space-Time Video Super-Resolution with Events
CVPR 2025 1
ThinkGen: Generalized Thinking for Visual Generation
arXiv 2025
FlexVAR: Flexible Visual Autoregressive Modeling without Residual Prediction
arXiv 2025
DIDS: Domain Impact-aware Data Sampling for Large Language Model Training
arXiv 2025
CharaConsist: Fine-Grained Consistent Character Generation
ICCV 2025
From Editor to Dense Geometry Estimator
arXiv 2025
DeepSeek-V3 Technical Report
arXiv 2024
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
arXiv 2024
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
arXiv 2024
Frequency-Aware Deepfake Detection: Improving Generalizability through Frequency Space Learning
arXiv 2024
Once-for-All: Controllable Generative Image Compression with Dynamic Granularity Adaptation
arXiv 2024
Transferable and Principled Efficiency for Open-Vocabulary Segmentation
CVPR 2024 1
Eliminating Warping Shakes for Unsupervised Online Video Stitching
arXiv 2024
Region-Adaptive Transform with Segmentation Prior for Image Compression
arXiv 2024
ClassDiffusion: More Aligned Personalization Tuning with Explicit Class Guidance
arXiv 2024
Deep Learning for Camera Calibration and Beyond: A Survey
arXiv 2023
Parallax-Tolerant Unsupervised Deep Image Stitching
ICCV 2023 1
Rethinking the Up-Sampling Operations in CNN-based Generative Network for Generalizable Deepfake Detection
CVPR 2024 1
Group Pose: A Simple Baseline for End-to-End Multi-person Pose Estimation
ICCV 2023 1
Disentangling Orthogonal Planes for Indoor Panoramic Room Layout Estimation with Cross-Scale Distortion Awareness
CVPR 2023 1
Lookahead: An Inference Acceleration Framework for Large Language Model with Lossless Generation Accuracy
arXiv 2023
CTP: Towards Vision-Language Continual Pretraining via Compatible Momentum Contrast and Topology Preservation
arXiv 2023
Investigating Efficiently Extending Transformers for Long Input Summarization
arXiv 2022
Adversarial Attacks and Defences Competition
arXiv 2018
Affiliations
Frequent co-authors
10 from 28 papers