Yinan He

Papers
18

Attribution: affiliations and profile from Semantic Scholar.
Authored papers

18

InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models

arXiv 2025

2025

InternVideo2.5: Empowering Video MLLMs with Long and Rich Context Modeling

arXiv 2025

2025

VideoChat-R1: Enhancing Spatio-Temporal Perception via Reinforcement Fine-Tuning

arXiv 2025

2025

ShotBench: Expert-Level Cinematic Understanding in Vision-Language Models

arXiv 2025

2025

VRBench: A Benchmark for Multi-Step Reasoning in Long Narrative Videos

ICCV 2025

2025

ExpVid: A Benchmark for Experiment Video Understanding & Reasoning

arXiv 2025

2025

VBench-2.0: Advancing Video Generation Benchmark Suite for Intrinsic Faithfulness

arXiv 2025

2025

Vchitect-2.0: Parallel Transformer for Scaling Up Video Diffusion Models

arXiv 2025

2025

InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency

arXiv 2025

2025

VideoMamba: State Space Model for Efficient Video Understanding

arXiv 2024

2024

VideoChat-Flash: Hierarchical Compression for Long-Context Video Modeling

arXiv 2024

2024

Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment

CVPR 2025 1

2024

OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text

arXiv 2024

2024

VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking

CVPR 2023 1

2023

MVBench: A Comprehensive Multi-modal Video Understanding Benchmark

CVPR 2024 1

2023

LAVIE: High-Quality Video Generation with Cascaded Latent Diffusion Models

arXiv 2023

2023

Unmasked Teacher: Towards Training-Efficient Video Foundation Models

ICCV 2023 1

2023

UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer

arXiv 2022

2022

Affiliations

No known affiliations.

Frequent co-authors

10

from 18 papers