Jiangning Zhang
- Papers: 32
Authored papers: 32
PixVerve: Advancing Native UHR Image Generation to 100MP with a Large-Scale High-Quality Dataset
arXiv 2026
The Latent Space: Foundation, Evolution, Mechanism, Ability, and Outlook
arXiv 2026
Towards Customized Multimodal Role-Play
arXiv 2026
L2P: Unlocking Latent Potential for Pixel Generation
arXiv 2026
One-to-All Animation: Alignment-Free Character Animation and Image Pose Transfer
arXiv 2025
OpenVE-3M: A Large-Scale High-Quality Dataset for Instruction-Guided Video Editing
arXiv 2025
Soul: Breathe Life into Digital Human for High-fidelity Long-term Multimodal Animation
arXiv 2025
Transform Trained Transformer: Accelerating Naive 4K Video Generation Over 10×
arXiv 2025
DiP: Taming Diffusion Models in Pixel Space
arXiv 2025
VisMem: Latent Vision Memory Unlocks Potential of Vision-Language Models
arXiv 2025
Visual Multi-Agent System: Mitigating Hallucination Snowballing via Visual Flow
arXiv 2025
StrandDesigner: Towards Practical Strand Generation with Sketch Guidance
arXiv 2025
Are They the Same? Exploring Visual Correspondence Shortcomings of Multimodal LLMs
ICCV 2025
SVFR: A Unified Framework for Generalized Video Face Restoration
CVPR 2025
Decouple and Track: Benchmarking and Improving Video Diffusion Transformers for Motion Transfer
ICCV 2025
Visual Document Understanding and Question Answering: A Multi-Agent Collaboration Framework with Test-Time Scaling
arXiv 2025
MobileMamba: Lightweight Multi-Receptive Visual Mamba Network
CVPR 2025
MambaAD: Exploring State Space Models for Multi-class Unsupervised Anomaly Detection
arXiv 2024
AdaCLIP: Adapting CLIP with Hybrid Learnable Prompts for Zero-Shot Anomaly Detection
arXiv 2024
LLaVA-KD: A Framework of Distilling Multimodal Large Language Models
arXiv 2024
SaRA: High-Efficient Diffusion Model Fine-tuning with Progressive Sparse Low-Rank Adaptation
arXiv 2024
EMOv2: Pushing 5M Vision Model Frontier
arXiv 2024
TIMotion: Temporal and Interactive Framework for Efficient Human-Human Motion Generation
CVPR 2025
CustAny: Customizing Anything from A Single Example
CVPR 2025
Learning Multi-view Anomaly Detection
arXiv 2024
A Comprehensive Library for Benchmarking Multi-class Visual Anomaly Detection
arXiv 2024
Point Cloud Mamba: Point Cloud Learning via State Space Model
arXiv 2024
MotionMaster: Training-free Camera Motion Transfer For Video Generation
arXiv 2024
LLaVA-VSD: Large Language-and-Vision Assistant for Visual Spatial Description
arXiv 2024
FitDiT: Advancing the Authentic Garment Details for High-fidelity Virtual Try-on
arXiv 2024
Phasic Content Fusing Diffusion Model with Directional Distribution Consistency for Few-Shot Model Adaption
ICCV 2023
Rethinking Mobile Block for Efficient Attention-based Models
ICCV 2023
Affiliations
Frequent co-authors
10 from 32 papers