Fan Wang
- Papers: 36
Authored papers
TAROT: Test-driven and Capability-adaptive Curriculum Reinforcement Fine-tuning for Code Generation with Large Language Models
arXiv 2026
Uni3C: Unifying Precisely 3D-Enhanced Camera and Human Motion Controls for Video Generation
arXiv 2025
VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding
arXiv 2025
DyDiT++: Dynamic Diffusion Transformers for Efficient Visual Generation
arXiv 2025
RealisDance-DiT: Simple yet Strong Baseline towards Controllable Character Animation in the Wild
arXiv 2025
Few-Step Distillation for Text-to-Image Generation: A Practical Guide
arXiv 2025
RynnVLA-002: A Unified Vision-Language-Action and World Model
arXiv 2025
BlockVid: Block Diffusion for High-Quality and Consistent Minute-Long Video Generation
arXiv 2025
RynnVLA-001: Using Human Demonstrations to Improve Robot Manipulation
arXiv 2025
WorldVLA: Towards Autoregressive Action World Model
arXiv 2025
UniLumos: Fast and Unified Image and Video Relighting with Physics-Plausible Feedback
arXiv 2025
RynnEC: Bringing MLLMs into Embodied World
arXiv 2025
Lumos-1: On Autoregressive Video Generation from a Unified Model Perspective
arXiv 2025
EOC-Bench: Can MLLMs Identify, Recall, and Forecast Objects in an Egocentric World?
arXiv 2025
Benchmarking Multimodal Mathematical Reasoning with Explicit Visual Dependency
arXiv 2025
A Survey on Large Language Models for Code Generation
arXiv 2024
MVGenMaster: Scaling Multi-View Generation from Any Image via 3D Priors Enhanced Diffusion Model
CVPR 2025
Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation
arXiv 2024
MentalGLM Series: Explainable Large Language Models for Mental Health Analysis on Chinese Social Media
arXiv 2024
A Survey on Mixture of Experts
arXiv 2024
KaSA: Knowledge-Aware Singular-Value Adaptation of Large Language Models
arXiv 2024
Large-vocabulary forensic pathological analyses via prototypical cross-modal contrastive learning
arXiv 2024
LlamaDuo: LLMOps Pipeline for Seamless Migration from Service LLMs to Small-Scale Local LLMs
arXiv 2024
SHMT: Self-supervised Hierarchical Makeup Transfer via Latent Diffusion Models
arXiv 2024
SC4D: Sparse-Controlled Video-to-4D Generation and Motion Transfer
arXiv 2024
Continuous-Multiple Image Outpainting in One-Step via Positional Query and A Diffusion-based Approach
arXiv 2024
Beyond Appearance: a Semantic Controllable Self-Supervised Learning Framework for Human-Centric Visual Tasks
CVPR 2023
SCT: A Simple Baseline for Parameter-Efficient Fine-Tuning via Salient Channels
arXiv 2023
MetaModulation: Learning Variational Feature Hierarchies for Few-Shot Learning with Fewer Tasks
arXiv 2023
OmniSeg3D: Omniversal 3D Segmentation via Hierarchical Contrastive Learning
CVPR 2024
RegionBLIP: A Unified Multi-modal Pre-training Framework for Holistic and Regional Comprehension
arXiv 2023
Making Vision Transformers Efficient from A Token Sparsification View
CVPR 2023
Q-TOD: A Query-driven Task-oriented Dialogue System
arXiv 2022
Proactive Interaction Framework for Intelligent Social Receptionist Robots
arXiv 2020
Know More about Each Other: Evolving Dialogue Strategy via Compound Assessment
PLATO: Pre-trained Dialogue Generation Model with Discrete Latent Variable