Shicheng Li
- Papers: 12
Authored papers (12)
MiMo-V2-Flash Technical Report
arXiv 2026
MiMo: Unlocking the Reasoning Potential of Language Model -- From Pretraining to Posttraining
arXiv 2025
MiMo-VL Technical Report
arXiv 2025
GroundingME: Exposing the Visual Grounding Gap in MLLMs through Multi-Dimensional Evaluation
arXiv 2025
TEMPLE: Temporal Preference Learning of Video LLMs via Difficulty Scheduling and Pre-SFT Alignment
arXiv 2025
TimeChat-Online: 80% Visual Tokens are Naturally Redundant in Streaming Videos
arXiv 2025
MAVIS: Mathematical Visual Instruction Tuning with an Automatic Data Engine
arXiv 2024
PunchBench: Benchmarking MLLMs in Multimodal Punchline Comprehension
arXiv 2024
TempCompass: Do Video LLMs Really Understand Videos?
arXiv 2024
VITATECS: A Diagnostic Dataset for Temporal Concept Understanding of Video-Language Models
arXiv 2023
TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding
CVPR 2024
TESTA: Temporal-Spatial Token Aggregation for Long-form Video-Language Understanding
arXiv 2023
Affiliations
Frequent co-authors
10 from 12 papers