Fengyun Rao
- Papers: 10
Authored papers
OmniPro: A Comprehensive Benchmark for Omni-Proactive Streaming Video Understanding
arXiv 2026
Stage-adaptive Token Selection for Efficient Omni-modal LLMs
arXiv 2026
ObjEmbed: Towards Universal Multimodal Object Embeddings
arXiv 2026
Learning Cross-View Object Correspondence via Cycle-Consistent Mask Prediction
arXiv 2026
R1-Onevision: Advancing Generalized Multimodal Reasoning through Cross-Modal Formalization
arXiv 2025
FlexSelect: Flexible Token Selection for Efficient Long Video Understanding
arXiv 2025
WeThink: Toward General-purpose Vision-Language Reasoning via Reinforcement Learning
arXiv 2025
TempFlow-GRPO: When Timing Matters for GRPO in Flow Models
arXiv 2025
Number it: Temporal Grounding Videos like Flipping Manga
CVPR 2025
Visual Perception by Large Language Model's Weights
arXiv 2024