Longyin Wen
- Papers: 10
Authored papers (10)
Vidi: Large Multimodal Models for Video Understanding and Editing
arXiv 2025
SuperEdit: Rectifying and Facilitating Supervision for Instruction-Based Image Editing
ICCV 2025
CyberV: Cybernetics for Test-time Scaling in Video Understanding
arXiv 2025
Where do Large Vision-Language Models Look at when Answering Questions?
arXiv 2025
Multi-Reward as Condition for Instruction-based Image Editing
arXiv 2024
CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts
arXiv 2024
Accurate and Fast Compressed Video Captioning
ICCV 2023 1
Towards Real-World Prohibited Item Detection: A Large-Scale X-ray Benchmark
ICCV 2021 10
Detection, Tracking, and Counting Meets Drones in Crowds: A Benchmark
CVPR 2021 1
Detection and Tracking Meet Drones Challenge
arXiv 2020
Frequent co-authors: 10, from 10 papers