Sicong Leng
- Papers: 13
Authored papers
World Model for Robot Learning: A Comprehensive Survey
arXiv 2026
RynnBrain: Open Embodied Foundation Models
arXiv 2026
InterLV-Search: Benchmarking Interleaved Multimodal Agentic Search
arXiv 2026
VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding
arXiv 2025
LongVT: Incentivizing "Thinking with Long Videos" via Native Tool Calling
arXiv 2025
RynnVLA-001: Using Human Demonstrations to Improve Robot Manipulation
arXiv 2025
VL-Cogito: Progressive Curriculum Reinforcement Learning for Advanced Multimodal Reasoning
arXiv 2025
MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources
arXiv 2025
VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs
arXiv 2024
Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss
arXiv 2024
AGLA: Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention
arXiv 2024
The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio
arXiv 2024
BenchX: A Unified Benchmark Framework for Medical Vision-Language Pretraining on Chest X-Rays
arXiv 2024
Frequent co-authors
10, from 13 papers