Xinlei Chen
Authored papers (22)
RLinf-USER: A Unified and Extensible System for Real-World Online Policy Learning in Embodied AI
arXiv 2026
Embodied-R: Collaborative Framework for Activating Embodied Spatial Reasoning in Foundation Models via Reinforcement Learning
arXiv 2025
VS-Bench: Evaluating VLMs for Strategic Reasoning and Decision-Making in Multi-Agent Environments
arXiv 2025
PRE-Mamba: A 4D State Space Model for Ultra-High-Frequent Event Camera Deraining
arXiv 2025
UrbanVideo-Bench: Benchmarking Vision-Language Models on Embodied Intelligence with Video Data in Urban Spaces
arXiv 2025
Meta CLIP 2: A Worldwide Scaling Recipe
arXiv 2025
Open3DVQA: A Benchmark for Comprehensive Spatial Reasoning with Multimodal Large Language Model in Open Space
arXiv 2025
Revisiting Feature Prediction for Learning Visual Representations from Video
arXiv preprint 2024 2
Scaling Proprioceptive-Visual Learning with Heterogeneous Pre-trained Transformers
arXiv 2024
On the Surprising Effectiveness of Attention Transfer for Vision Transformers
arXiv 2024
Learning to (Learn at Test Time): RNNs with Expressive Hidden States
arXiv 2024
Massive Activations in Large Language Models
arXiv 2024
ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders
CVPR 2023 1
Learning to (Learn at Test Time)
arXiv 2023
R-MAE: Regions Meet Masked Autoencoders
arXiv 2023
Masked Autoencoders Are Scalable Vision Learners
CVPR 2022 1
An Empirical Study of Training Self-Supervised Vision Transformers
ICCV 2021 10
Improved Baselines with Momentum Contrastive Learning
arXiv 2020
Exploring Simple Siamese Representation Learning
CVPR 2021 1
Understanding Self-supervised Learning with Dual Deep Networks
arXiv 2020
Towards VQA Models That Can Read
CVPR 2019
Microsoft COCO Captions: Data Collection and Evaluation Server
arXiv 2015
Frequent co-authors
10 from 22 papers