Ran He
Authored papers (34)
Coloring the Noise: Adversarial Sobolev Alignment for Faithful Image Super Resolution
arXiv 2026
TIP: Token Importance in On-Policy Distillation
arXiv 2026
Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding
arXiv 2026
FlashPrefill: Instantaneous Pattern Discovery and Thresholding for Ultra-Fast Long-Context Prefilling
arXiv 2026
ResRL: Boosting LLM Reasoning via Negative Sample Projection Residual Reinforcement Learning
arXiv 2026
UniPrefill: Universal Long-Context Prefill Acceleration via Block-wise Dynamic Sparsification
arXiv 2026
The Trinity of Consistency as a Defining Principle for General World Models
arXiv 2026
On-Policy Self-Distillation for Reasoning Compression
arXiv 2026
VideoDetective: Clue Hunting via both Extrinsic Query and Intrinsic Relevance for Long Video Understanding
arXiv 2026
Omni-Diffusion: Unified Multimodal Understanding and Generation with Masked Discrete Diffusion
arXiv 2026
VITA-E: Natural Embodied Interaction with Concurrent Seeing, Hearing, Speaking, and Acting
arXiv 2025
DiCo: Revitalizing ConvNets for Scalable and Efficient Diffusion Modeling
arXiv 2025
Adapting Vision-Language Models Without Labels: A Comprehensive Survey
arXiv 2025
Video-SafetyBench: A Benchmark for Safety Evaluation of Video LVLMs
arXiv 2025
VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction
arXiv 2025
LoRA-Pro: Are Low-Rank Adapters Properly Optimized?
arXiv 2024
A Hard-to-Beat Baseline for Training-free CLIP-based Adaptation
arXiv 2024
ZeroSmooth: Training-free Diffuser Adaptation for High Frame Rate Video Generation
arXiv 2024
T2Vid: Translating Long Text into Multi-Image is the Catalyst for Video-LLMs
arXiv 2024
ZePo: Zero-Shot Portrait Stylization with Faster Sampling
arXiv 2024
Breaking the Low-Rank Dilemma of Linear Attention
CVPR 2025
STAMP: Outlier-Aware Test-Time Adaptation with Stable Memory Replay
arXiv 2024
InfiMM-HD: A Leap Forward in High-Resolution Multimodal Understanding
arXiv 2024
Towards Eliminating Hard Label Constraints in Gradient Inversion Attacks
arXiv 2024
Visual Anchors Are Strong Information Aggregators For Multimodal Large Language Model
arXiv 2024
RMT: Retentive Networks Meet Vision Transformers
CVPR 2024
Improving Zero-Shot Generalization for CLIP with Synthesized Prompts
ICCV 2023
Portrait Diffusion: Training-free Face Stylization with Chain-of-Painting
arXiv 2023
Thought Propagation: An Analogical Approach to Complex Reasoning with Large Language Models
arXiv 2023
TALL: Thumbnail Layout for Deepfake Video Detection
ICCV 2023
ScaleCrafter: Tuning-free Higher-Resolution Visual Generation with Diffusion Models
arXiv 2023
AdaptGuard: Defending Against Universal Attacks for Model Adaptation
ICCV 2023
Vision Transformer with Super Token Sampling
CVPR 2023
PSGAN: Pose and Expression Robust Spatial-Aware GAN for Customizable Makeup Transfer
Affiliations
Frequent co-authors
10 from 34 papers