Sicong Leng

Papers: 13

Attribution

Affiliations & profile: Semantic Scholar

Authored papers

World Model for Robot Learning: A Comprehensive Survey

arXiv 2026

RynnBrain: Open Embodied Foundation Models

arXiv 2026

InterLV-Search: Benchmarking Interleaved Multimodal Agentic Search

arXiv 2026

VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding

arXiv 2025

LongVT: Incentivizing "Thinking with Long Videos" via Native Tool Calling

arXiv 2025

RynnVLA-001: Using Human Demonstrations to Improve Robot Manipulation

arXiv 2025

VL-Cogito: Progressive Curriculum Reinforcement Learning for Advanced Multimodal Reasoning

arXiv 2025

MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources

arXiv 2025

VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs

arXiv 2024

Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss

arXiv 2024

AGLA: Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention

arXiv 2024

The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio

arXiv 2024

BenchX: A Unified Benchmark Framework for Medical Vision-Language Pretraining on Chest X-Rays

arXiv 2024

Affiliations

No known affiliations.

Frequent co-authors

from 13 papers

Deli Zhao

Xin Li

Lidong Bing

Hang Zhang

Shijian Lu

Kehan Li

Yuming Jiang

Zesen Cheng

Bohan Hou

Jianfei Yang