Xiaoyu Li
- Papers: 24
Authored papers: 24
General365: Benchmarking General Reasoning in Large Language Models Across Diverse and Challenging Tasks
arXiv 2026
VerseCrafter: Dynamic Realistic Video World Model with 4D Geometric Control
arXiv 2026
CubeComposer: Spatio-Temporal Autoregressive 4K 360° Video Generation from Perspective Video
arXiv 2026
LongCat-Next: Lexicalizing Modalities as Discrete Tokens
arXiv 2026
LongCat-Flash-Thinking-2601 Technical Report
arXiv 2026
LARY: A Latent Action Representation Yielding Benchmark for Generalizable Vision-to-Action Alignment
arXiv 2026
LongCat-Flash-Prover: Advancing Native Formal Reasoning via Agentic Tool-Integrated Reinforcement Learning
arXiv 2026
Ming-Omni: A Unified Multimodal Model for Perception and Generation
arXiv 2025
GeometryCrafter: Consistent Geometry Estimation for Open-world Videos with Diffusion Priors
ICCV 2025
Sci-Fi: Symmetric Constraint for Frame Inbetweening
arXiv 2025
ToonComposer: Streamlining Cartoon Production with Generative Post-Keyframing
arXiv 2025
GenCompositor: Generative Video Compositing with Diffusion Transformer
arXiv 2025
AMO-Bench: Large Language Models Still Struggle in High School Math Competitions
arXiv 2025
Q-Eval-100K: Evaluating Visual Quality and Alignment Level for Text-to-Vision Content
CVPR 2025 1
BlobCtrl: A Unified and Flexible Framework for Element-level Image Generation and Editing
arXiv 2025
StereoCrafter: Diffusion-based Generation of Long and High-fidelity Stereoscopic 3D from Monocular Videos
arXiv 2024
ViewCrafter: Taming Video Diffusion Models for High-fidelity Novel View Synthesis
arXiv 2024
CV-VAE: A Compatible Video VAE for Latent Generative Video Models
arXiv 2024
NVComposer: Boosting Generative Novel View Synthesis with Multiple Sparse and Unposed Images
CVPR 2025 1
DiTCtrl: Exploring Attention Control in Multi-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Longer Video Generation
CVPR 2025 1
DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos
CVPR 2025 1
Grams: Gradient Descent with Adaptive Momentum Scaling
arXiv 2024
Text2NeRF: Text-Driven 3D Scene Generation with Neural Radiance Fields
arXiv 2023
UV Volumes for Real-time Rendering of Editable Free-view Human Performance
CVPR 2023 1
Affiliations
Frequent co-authors
10 from 24 papers