Jiajun Wu

Papers
47

Authored papers

World Model for Robot Learning: A Comprehensive Survey

arXiv 2026

2026

ESI-Bench: Towards Embodied Spatial Intelligence that Closes the Perception-Action Loop

arXiv 2026

2026

RAGEN-2: Reasoning Collapse in Agentic RL

arXiv 2026

2026

Learning from Trials and Errors: Reflective Test-Time Planning for Embodied LLMs

arXiv 2026

2026

Neuro-Symbolic Decoding of Neural Activity

arXiv 2026

2026

IQuest-Coder-V1 Technical Report

arXiv 2026

2026

RealWonder: Real-Time Physical Action-Conditioned Video Generation

arXiv 2026

2026

Theory of Space: Can Foundation Models Construct Spatial Beliefs through Active Exploration?

arXiv 2026

2026

InCoder-32B: Code Foundation Model for Industrial Scenarios

arXiv 2026

2026

Talk2Move: Reinforcement Learning for Text-Instructed Object-Level Geometric Transformation in Scenes

arXiv 2026

2026

RAGEN: Understanding Self-Evolution in LLM Agents via Multi-Turn Reinforcement Learning

arXiv 2025

2025

Re-thinking Temporal Search for Long-Form Video Understanding

CVPR 2025

2025

BEHAVIOR Robot Suite: Streamlining Real-World Whole-Body Manipulation for Everyday Household Activities

arXiv 2025

2025

FluidNexus: 3D Fluid Reconstruction and Prediction from a Single Video

CVPR 2025

2025

WonderZoom: Multi-Scale 3D World Generation

arXiv 2025

2025

Explain Before You Answer: A Survey on Compositional Visual Reasoning

arXiv 2025

2025

Spatial Mental Modeling from Limited Views

arXiv 2025

2025

Chain-of-Experts: Unlocking the Communication Power of Mixture-of-Experts Models

arXiv 2025

2025

Taming generative video models for zero-shot optical flow extraction

arXiv 2025

2025

ENACT: Evaluating Embodied Cognition with World Modeling of Egocentric Interaction

arXiv 2025

2025

Why Is Spatial Reasoning Hard for VLMs? An Attention Mechanism Perspective on Focus Areas

arXiv 2025

2025

Evaluating Real-World Robot Manipulation Policies in Simulation

arXiv 2024

2024

WonderWorld: Interactive 3D Scene Generation from a Single Image

CVPR 2025

2024

Generalizable Humanoid Manipulation with 3D Diffusion Policies

arXiv 2024

2024

Embodied Agent Interface: Benchmarking LLMs for Embodied Decision Making

arXiv 2024

2024

HourVideo: 1-Hour Video-Language Understanding

arXiv 2024

2024

TRANSIC: Sim-to-Real Policy Transfer by Learning from Online Correction

arXiv 2024

2024

Diffusion Self-Distillation for Zero-Shot Customized Image Generation

CVPR 2025

2024

IKEA Manuals at Work: 4D Grounding of Assembly Instructions on Internet Videos

arXiv 2024

2024

Visually Descriptive Language Model for Vector Graphics Reasoning

arXiv 2024

2024

View-Invariant Policy Learning via Zero-Shot Novel View Synthesis

arXiv 2024

2024

Foundation Models in Robotics: Applications, Challenges, and the Future

arXiv 2023

2023

ULIP-2: Towards Scalable Multimodal Pre-training for 3D Understanding

CVPR 2024

2023

VoxPoser: Composable 3D Value Maps for Robotic Manipulation with Language Models

arXiv 2023

2023

Holodeck: Language Guided Generation of 3D Embodied AI Environments

CVPR 2024

2023

ZeroNVS: Zero-Shot 360-Degree View Synthesis from a Single Image

CVPR 2024

2023

Open X-Embodiment: Robotic Learning Datasets and RT-X Models

arXiv 2023

2023

SkyScript: A Large and Semantically Diverse Vision-Language Dataset for Remote Sensing

arXiv 2023

2023

Language-Informed Visual Concept Learning

arXiv 2023

2023

3D Copy-Paste: Physically Plausible Object Insertion for Monocular 3D Detection

2023

Mini-BEHAVIOR: A Procedurally Generated Benchmark for Long-horizon Decision-Making in Embodied AI

arXiv 2023

2023

Patched Denoising Diffusion Models For High-Resolution Image Synthesis

arXiv 2023

2023

Disentanglement via Latent Quantization

2023

Motion Question Answering via Modular Motion Programs

arXiv 2023

2023

SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations

2021

End-to-End Optimization of Scene Layout

2020

The Neuro-Symbolic Concept Learner: Interpreting Scenes, Words, and Sentences From Natural Supervision

2019

Affiliations

No known affiliations.

Frequent co-authors

10 (from 47 papers)