Xiaolong Wang
- Papers
- 38
Authored papers (38)
Learning to Discover at Test Time
arXiv 2026
When Helpers Become Hazards: A Benchmark for Analyzing Multimodal LLM-Powered Safety in Daily Life
arXiv 2026
Enabling Stroke-Level Structural Analysis of Hieroglyphic Scripts without Language-Specific Priors
arXiv 2026
GMT: General Motion Tracking for Humanoid Whole-Body Control
arXiv 2025
One-Minute Video Generation with Test-Time Training
CVPR 2025 1
M3: 3D-Spatial MultiModal Memory
arXiv 2025
End-to-End Test-Time Training for Long Context
arXiv 2025
In-N-On: Scaling Egocentric Manipulation with in-the-wild and on-task Data
arXiv 2025
Effective and Transparent RAG: Adaptive-Reward Reinforcement Learning for Decision Traceability
arXiv 2025
Ming-UniVision: Joint Image Understanding and Generation with a Unified Continuous Tokenizer
arXiv 2025
NVILA: Efficient Frontier Visual Language Models
CVPR 2025 1
A Simulation Benchmark for Autonomous Racing with Large-Scale Human Data
arXiv 2024
FlexEdit: Marrying Free-Shape Masks to VLLM for Flexible Image Editing
arXiv 2024
ActiView: Evaluating Active Perception Ability for Multimodal Large Language Models
arXiv 2024
Learning to (Learn at Test Time): RNNs with Expressive Hidden States
arXiv 2024
Dynamic Gaussians Mesh: Consistent Mesh Reconstruction from Dynamic Scenes
arXiv 2024
PointLLM: Empowering Large Language Models to Understand Point Clouds
arXiv 2023
COLMAP-Free 3D Gaussian Splatting
CVPR 2024 1
Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models
CVPR 2023 1
FeatureNeRF: Learning Generalizable NeRFs by Distilling Foundation Models
ICCV 2023 1
Open X-Embodiment: Robotic Learning Datasets and RT-X Models
arXiv 2023
GenSim: Generating Robotic Simulation Tasks via Large Language Models
arXiv 2023
Learning to (Learn at Test Time)
arXiv 2023
DexArt: Benchmarking Generalizable Dexterous Manipulation with Articulated Objects
CVPR 2023 1
GNFactor: Multi-Task Real Robot Learning with Generalizable Neural Feature Fields
arXiv 2023
Fine-Grained Cross-View Geo-Localization Using a Correlation-Aware Homography Estimator
Exploring Large Language Models for Communication Games: An Empirical Study on Werewolf
arXiv 2023
Temporal Difference Learning for Model Predictive Control
arXiv 2022
On Pre-Training for Visuo-Motor Control: Revisiting a Learning-from-Scratch Baseline
arXiv 2022
GroupViT: Semantic Segmentation Emerges from Text Supervision
CVPR 2022 1
MoDem: Accelerating Visual Model-Based Reinforcement Learning with Demonstrations
arXiv 2022
Visual Reinforcement Learning with Self-Supervised 3D Representations
arXiv 2022
Stabilizing Deep Q-Learning with ConvNets and Vision Transformers under Data Augmentation
NeurIPS 2021 12
Rethinking Self-supervised Correspondence Learning: A Video Frame-level Similarity Perspective
ICCV 2021 10
Learning Continuous Image Representation with Local Implicit Image Function
CVPR 2021 1
Joint-task Self-supervised Learning for Temporal Correspondence
Test-Time Training with Self-Supervision for Generalization under Distribution Shifts
arXiv 2019
Non-local Neural Networks
Affiliations
Frequent co-authors
10 from 38 papers