Yilun Chen

Papers
17

Attribution

Affiliations & profile
Semantic Scholar

Authored papers

17

InternVLA-A1: Unifying Understanding, Generation and Action for Robotic Manipulation

arXiv 2026

MM-ACT: Learn from Multimodal Parallel Generation to Act

arXiv 2025

StreamVLN: Streaming Vision-and-Language Navigation via SlowFast Context Modeling

arXiv 2025

InstructVLA: Vision-Language-Action Instruction Tuning from Understanding to Manipulation

arXiv 2025

Evolving Symbolic 3D Visual Grounder with Weakly Supervised Reflection

arXiv 2025

InternVLA-M1: A Spatially Guided Vision-Language-Action Framework for Generalist Robot Policy

arXiv 2025

A Data-Centric Revisit of Pre-Trained Vision Models for Robot Learning

CVPR 2025 1

X-VLA: Soft-Prompted Transformer as Scalable Cross-Embodiment Vision-Language-Action Model

arXiv 2025

GRUtopia: Dream General Robots in a City at Scale

arXiv 2024

MMScan: A Multi-Modal 3D Scene Dataset with Hierarchical Grounded Language Annotations

arXiv 2024

Grounded 3D-LLM with Referent Tokens

arXiv 2024

TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes

arXiv 2024

What Makes CLIP More Robust to Long-Tailed Pre-Training Data? A Controlled Study for Transferable Insights

arXiv 2024

PointLLM: Empowering Large Language Models to Understand Point Clouds

arXiv 2023

Chat-Scene: Bridging 3D Scene and Large Language Models with Object Identifiers

arXiv 2023

FocalFormer3D: Focusing on Hard Instance for 3D Object Detection

arXiv 2023

VIMI: Vehicle-Infrastructure Multi-view Intermediate Fusion for Camera-based 3D Object Detection

arXiv 2023

Affiliations

No known affiliations.

Frequent co-authors

10 (from 17 papers)