Xiaolong Wang

Papers
38

Authored papers

Learning to Discover at Test Time

arXiv 2026

2026

When Helpers Become Hazards: A Benchmark for Analyzing Multimodal LLM-Powered Safety in Daily Life

arXiv 2026

2026

Enabling Stroke-Level Structural Analysis of Hieroglyphic Scripts without Language-Specific Priors

arXiv 2026

2026

GMT: General Motion Tracking for Humanoid Whole-Body Control

arXiv 2025

2025

One-Minute Video Generation with Test-Time Training

CVPR 2025 1

2025

M3: 3D-Spatial MultiModal Memory

arXiv 2025

2025

End-to-End Test-Time Training for Long Context

arXiv 2025

2025

In-N-On: Scaling Egocentric Manipulation with in-the-wild and on-task Data

arXiv 2025

2025

Effective and Transparent RAG: Adaptive-Reward Reinforcement Learning for Decision Traceability

arXiv 2025

2025

Ming-UniVision: Joint Image Understanding and Generation with a Unified Continuous Tokenizer

arXiv 2025

2025

NVILA: Efficient Frontier Visual Language Models

CVPR 2025 1

2024

A Simulation Benchmark for Autonomous Racing with Large-Scale Human Data

arXiv 2024

2024

FlexEdit: Marrying Free-Shape Masks to VLLM for Flexible Image Editing

arXiv 2024

2024

ActiView: Evaluating Active Perception Ability for Multimodal Large Language Models

arXiv 2024

2024

Learning to (Learn at Test Time): RNNs with Expressive Hidden States

arXiv 2024

2024

Dynamic Gaussians Mesh: Consistent Mesh Reconstruction from Dynamic Scenes

arXiv 2024

2024

PointLLM: Empowering Large Language Models to Understand Point Clouds

arXiv 2023

2023

COLMAP-Free 3D Gaussian Splatting

CVPR 2024 1

2023

Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models

CVPR 2023 1

2023

FeatureNeRF: Learning Generalizable NeRFs by Distilling Foundation Models

ICCV 2023 1

2023

Open X-Embodiment: Robotic Learning Datasets and RT-X Models

arXiv 2023

2023

GenSim: Generating Robotic Simulation Tasks via Large Language Models

arXiv 2023

2023

Learning to (Learn at Test Time)

arXiv 2023

2023

DexArt: Benchmarking Generalizable Dexterous Manipulation with Articulated Objects

CVPR 2023 1

2023

GNFactor: Multi-Task Real Robot Learning with Generalizable Neural Feature Fields

arXiv 2023

2023

Fine-Grained Cross-View Geo-Localization Using a Correlation-Aware Homography Estimator

2023

Exploring Large Language Models for Communication Games: An Empirical Study on Werewolf

arXiv 2023

2023

Temporal Difference Learning for Model Predictive Control

arXiv 2022

2022

On Pre-Training for Visuo-Motor Control: Revisiting a Learning-from-Scratch Baseline

arXiv 2022

2022

GroupViT: Semantic Segmentation Emerges from Text Supervision

CVPR 2022 1

2022

MoDem: Accelerating Visual Model-Based Reinforcement Learning with Demonstrations

arXiv 2022

2022

Visual Reinforcement Learning with Self-Supervised 3D Representations

arXiv 2022

2022

Stabilizing Deep Q-Learning with ConvNets and Vision Transformers under Data Augmentation

NeurIPS 2021 12

2021

Rethinking Self-supervised Correspondence Learning: A Video Frame-level Similarity Perspective

ICCV 2021 10

2021

Learning Continuous Image Representation with Local Implicit Image Function

CVPR 2021 1

2020

Joint-task Self-supervised Learning for Temporal Correspondence

2019

Test-Time Training with Self-Supervision for Generalization under Distribution Shifts

arXiv 2019

2019

Non-local Neural Networks

2017

Affiliations

No known affiliations.

Frequent co-authors: 10 (from 38 papers)