Fan Wang

Papers
36

Attribution

Affiliations & profile
Semantic Scholar

Authored papers

36

TAROT: Test-driven and Capability-adaptive Curriculum Reinforcement Fine-tuning for Code Generation with Large Language Models

arXiv 2026

2026

Uni3C: Unifying Precisely 3D-Enhanced Camera and Human Motion Controls for Video Generation

arXiv 2025

2025

VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding

arXiv 2025

2025

DyDiT++: Dynamic Diffusion Transformers for Efficient Visual Generation

arXiv 2025

2025

RealisDance-DiT: Simple yet Strong Baseline towards Controllable Character Animation in the Wild

arXiv 2025

2025

Few-Step Distillation for Text-to-Image Generation: A Practical Guide

arXiv 2025

2025

RynnVLA-002: A Unified Vision-Language-Action and World Model

arXiv 2025

2025

BlockVid: Block Diffusion for High-Quality and Consistent Minute-Long Video Generation

arXiv 2025

2025

RynnVLA-001: Using Human Demonstrations to Improve Robot Manipulation

arXiv 2025

2025

WorldVLA: Towards Autoregressive Action World Model

arXiv 2025

2025

UniLumos: Fast and Unified Image and Video Relighting with Physics-Plausible Feedback

arXiv 2025

2025

RynnEC: Bringing MLLMs into Embodied World

arXiv 2025

2025

Lumos-1: On Autoregressive Video Generation from a Unified Model Perspective

arXiv 2025

2025

EOC-Bench: Can MLLMs Identify, Recall, and Forecast Objects in an Egocentric World?

arXiv 2025

2025

Benchmarking Multimodal Mathematical Reasoning with Explicit Visual Dependency

arXiv 2025

2025

A Survey on Large Language Models for Code Generation

arXiv 2024

2024

MVGenMaster: Scaling Multi-View Generation from Any Image via 3D Priors Enhanced Diffusion Model

CVPR 2025

2024

Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation

arXiv 2024

2024

MentalGLM Series: Explainable Large Language Models for Mental Health Analysis on Chinese Social Media

arXiv 2024

2024

A Survey on Mixture of Experts

arXiv 2024

2024

KaSA: Knowledge-Aware Singular-Value Adaptation of Large Language Models

arXiv 2024

2024

Large-vocabulary forensic pathological analyses via prototypical cross-modal contrastive learning

arXiv 2024

2024

LlamaDuo: LLMOps Pipeline for Seamless Migration from Service LLMs to Small-Scale Local LLMs

arXiv 2024

2024

SHMT: Self-supervised Hierarchical Makeup Transfer via Latent Diffusion Models

arXiv 2024

2024

SC4D: Sparse-Controlled Video-to-4D Generation and Motion Transfer

arXiv 2024

2024

Continuous-Multiple Image Outpainting in One-Step via Positional Query and A Diffusion-based Approach

arXiv 2024

2024

Beyond Appearance: a Semantic Controllable Self-Supervised Learning Framework for Human-Centric Visual Tasks

CVPR 2023

2023

SCT: A Simple Baseline for Parameter-Efficient Fine-Tuning via Salient Channels

arXiv 2023

2023

MetaModulation: Learning Variational Feature Hierarchies for Few-Shot Learning with Fewer Tasks

arXiv 2023

2023

OmniSeg3D: Omniversal 3D Segmentation via Hierarchical Contrastive Learning

CVPR 2024

2023

RegionBLIP: A Unified Multi-modal Pre-training Framework for Holistic and Regional Comprehension

arXiv 2023

2023

Making Vision Transformers Efficient from A Token Sparsification View

CVPR 2023

2023

Q-TOD: A Query-driven Task-oriented Dialogue System

arXiv 2022

2022

Proactive Interaction Framework for Intelligent Social Receptionist Robots

arXiv 2020

2020

Know More about Each Other: Evolving Dialogue Strategy via Compound Assessment

2019

PLATO: Pre-trained Dialogue Generation Model with Discrete Latent Variable

2019

Affiliations

No known affiliations.

Frequent co-authors

10

from 36 papers