Shuai Yang

Papers
42

Authored papers

42

A Pragmatic VLA Foundation Model

arXiv 2026

Causal World Modeling for Robot Control

arXiv 2026

Agentic World Modeling: Foundations, Capabilities, Laws, and Beyond

arXiv 2026

DVD: Deterministic Video Depth Estimation with Generative Priors

arXiv 2026

LongLive-2.0: An NVFP4 Parallel Infrastructure for Long Video Generation

arXiv 2026

WORLDMEM: Long-term Consistent World Simulation with Memory

arXiv 2025

Citrus: Leveraging Expert Cognitive Pathways in a Medical Language Model for Advanced Medical Decision Support

arXiv 2025

Vision-Language Model for Object Detection and Segmentation: A Review and Evaluation

arXiv 2025

Training-Free Watermarking for Autoregressive Image Generation

arXiv 2025

Balanced Image Stylization with Style Matching Score

ICCV 2025

DualCamCtrl: Dual-Branch Diffusion Model for Geometry-Aware Camera-Controlled Video Generation

arXiv 2025

Trainable Log-linear Sparse Attention for Efficient Diffusion Transformers

arXiv 2025

SANA-Video: Efficient Video Generation with Block Linear Diffusion Transformer

arXiv 2025

InstructVLA: Vision-Language-Action Instruction Tuning from Understanding to Manipulation

arXiv 2025

TokensGen: Harnessing Condensed Tokens for Long Video Generation

ICCV 2025

MEAT: Multiview Diffusion Model for Human Generation on Megapixels with Mesh Attention

CVPR 2025 1

MemFlow: Flowing Adaptive Memory for Consistent and Efficient Long Video Narratives

arXiv 2025

InternVLA-M1: A Spatially Guided Vision-Language-Action Framework for Generalist Robot Policy

arXiv 2025

STream3R: Scalable Sequential 3D Reconstruction with Causal Transformer

arXiv 2025

QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs

arXiv 2025

RAPO++: Cross-Stage Prompt Optimization for Text-to-Video Generation via Data Alignment and Test-Time Scaling

arXiv 2025

Alias-Free Latent Diffusion Models: Improving Fractional Shift Equivariance of Diffusion Latent Space

arXiv 2025

Imagine360: Immersive 360 Video Generation from Perspective Anchor

arXiv 2024

GaussianAnything: Interactive Point Cloud Flow Matching For 3D Object Generation

arXiv 2024

MMScan: A Multi-Modal 3D Scene Dataset with Hierarchical Grounded Language Annotations

arXiv 2024

3DTopia: Large Text-to-3D Generation Model with Hybrid Diffusion Priors

arXiv 2024

SEED-Story: Multimodal Long Story Generation with Large Language Model

arXiv 2024

Grounded 3D-LLM with Referent Tokens

arXiv 2024

Forward Learning of Graph Neural Networks

arXiv 2024

FRESCO: Spatial-Temporal Correspondence for Zero-Shot Video Translation

CVPR 2024 1

LN3Diff: Scalable Latent Neural Fields Diffusion for Speedy 3D Generation

arXiv 2024

StyleGANEX: StyleGAN-Based Manipulation Beyond Cropped Aligned Faces

ICCV 2023 1

Text2Performer: Text-Driven Human Video Generation

ICCV 2023 1

Scenimefy: Learning to Craft Anime Scene via Semi-Supervised Image-to-Image Translation

ICCV 2023 1

DeformToon3D: Deformable 3D Toonification from Neural Radiance Fields

arXiv 2023

Not All Steps are Created Equal: Selective Diffusion Distillation for Image Manipulation

ICCV 2023 1

Denoising Diffusion Step-aware Models

arXiv 2023

Defect Spectrum: A Granular Look of Large-Scale Defect Datasets with Rich Semantics

arXiv 2023

VToonify: Controllable High-Resolution Portrait Video Style Transfer

arXiv 2022

Pastiche Master: Exemplar-Based High-Resolution Portrait Style Transfer

CVPR 2022 1

Text2Human: Text-Driven Controllable Human Image Generation

arXiv 2022

BARS-CTR: Open Benchmarking for Click-Through Rate Prediction

arXiv 2020

Affiliations

No known affiliations.

Frequent co-authors

10

from 42 papers