Longyin Wen

Papers: 10

Authored papers

Vidi: Large Multimodal Models for Video Understanding and Editing

arXiv 2025

Where do Large Vision-Language Models Look at when Answering Questions?

arXiv 2025

SuperEdit: Rectifying and Facilitating Supervision for Instruction-Based Image Editing

ICCV 2025

CyberV: Cybernetics for Test-time Scaling in Video Understanding

arXiv 2025

CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts

arXiv 2024

Multi-Reward as Condition for Instruction-based Image Editing

arXiv 2024

Accurate and Fast Compressed Video Captioning

ICCV 2023 1

Detection, Tracking, and Counting Meets Drones in Crowds: A Benchmark

CVPR 2021 1

Towards Real-World Prohibited Item Detection: A Large-Scale X-ray Benchmark

ICCV 2021 10

Detection and Tracking Meet Drones Challenge

arXiv 2020

Affiliations

No known affiliations.

Frequent co-authors

from 10 papers

Fan Chen

Sijie Zhu

Xin Gu

Chia-Wen Kuo

Dawei Du

Libo Zhang

Ming Li

Heng Fan

Pengfei Zhu

Qinghua Hu