Yanwei Fu

Papers
30

Authored papers

30

VerseCrafter: Dynamic Realistic Video World Model with 4D Geometric Control

arXiv 2026

2026

The Latent Space: Foundation, Evolution, Mechanism, Ability, and Outlook

arXiv 2026

2026

MVGGT: Multimodal Visual Geometry Grounded Transformer for Multiview 3D Referring Expression Segmentation

arXiv 2026

2026

Uni3C: Unifying Precisely 3D-Enhanced Camera and Human Motion Controls for Video Generation

arXiv 2025

2025

Disentangle Identity, Cooperate Emotion: Correlation-Aware Emotional Talking Portrait Generation

arXiv 2025

2025

LongVie 2: Multimodal Controllable Ultra-Long Video World Model

arXiv 2025

2025

ReVSeg: Incentivizing the Reasoning Chain for Video Segmentation with Reinforcement Learning

arXiv 2025

2025

InternVLA-M1: A Spatially Guided Vision-Language-Action Framework for Generalist Robot Policy

arXiv 2025

2025

StrandDesigner: Towards Practical Strand Generation with Sketch Guidance

arXiv 2025

2025

Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination

arXiv 2025

2025

Visual Document Understanding and Question Answering: A Multi-Agent Collaboration Framework with Test-Time Scaling

arXiv 2025

2025

MVGenMaster: Scaling Multi-View Generation from Any Image via 3D Priors Enhanced Diffusion Model

CVPR 2025 1

2024

MinD-3D++: Advancing fMRI-Based 3D Reconstruction with High-Quality Textured Mesh Generation and a Comprehensive Dataset

arXiv 2024

2024

CustAny: Customizing Anything from A Single Example

CVPR 2025 1

2024

Repositioning the Subject within Image

arXiv 2024

2024

MVSFormer++: Revealing the Devil in Transformer's Details for Multi-View Stereo

arXiv 2024

2024

3D StreetUnveiler with Semantic-Aware 2DGS

arXiv 2024

2024

ContextualStory: Consistent Visual Storytelling with Spatially-Enhanced and Storyline Context

arXiv 2024

2024

Unified Lexical Representation for Interpretable Visual-Language Alignment

arXiv 2024

2024

FitDiT: Advancing the Authentic Garment Details for High-fidelity Virtual Try-on

arXiv 2024

2024

Coarse-to-Fine Amodal Segmentation with Shape Prior

ICCV 2023 1

2023

Towards Enhanced Image Inpainting: Mitigating Unwanted Object Insertion and Preserving Color Consistency

CVPR 2025 1

2023

Improving Transformer-based Image Matching by Cascaded Capturing Spatially Informative Keypoints

ICCV 2023 1

2023

Object-Centric Multiple Object Tracking

ICCV 2023 1

2023

Unsupervised Open-Vocabulary Object Localization in Videos

ICCV 2023 1

2023

Rethinking Amodal Video Segmentation from Learning Supervised Signals with Object-centric Representation

ICCV 2023 1

2023

Doubly Robust Proximal Causal Learning for Continuous Treatments

arXiv 2023

2023

Incremental Transformer Structure Enhanced Image Inpainting with Masking Positional Encoding

CVPR 2022 1

2022

Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers

CVPR 2021 1

2020

AI Challenger : A Large-scale Dataset for Going Deeper in Image Understanding

arXiv 2017

2017

Affiliations

No known affiliations.

Frequent co-authors

10

from 30 papers