Xiaoyu Li

Papers
24

Attribution

Affiliations & profile
Semantic Scholar

Authored papers

24

General365: Benchmarking General Reasoning in Large Language Models Across Diverse and Challenging Tasks

arXiv 2026

2026

VerseCrafter: Dynamic Realistic Video World Model with 4D Geometric Control

arXiv 2026

2026

CubeComposer: Spatio-Temporal Autoregressive 4K 360° Video Generation from Perspective Video

arXiv 2026

2026

LongCat-Next: Lexicalizing Modalities as Discrete Tokens

arXiv 2026

2026

LongCat-Flash-Thinking-2601 Technical Report

arXiv 2026

2026

LARY: A Latent Action Representation Yielding Benchmark for Generalizable Vision-to-Action Alignment

arXiv 2026

2026

LongCat-Flash-Prover: Advancing Native Formal Reasoning via Agentic Tool-Integrated Reinforcement Learning

arXiv 2026

2026

Ming-Omni: A Unified Multimodal Model for Perception and Generation

arXiv 2025

2025

GeometryCrafter: Consistent Geometry Estimation for Open-world Videos with Diffusion Priors

ICCV 2025

2025

Sci-Fi: Symmetric Constraint for Frame Inbetweening

arXiv 2025

2025

ToonComposer: Streamlining Cartoon Production with Generative Post-Keyframing

arXiv 2025

2025

GenCompositor: Generative Video Compositing with Diffusion Transformer

arXiv 2025

2025

AMO-Bench: Large Language Models Still Struggle in High School Math Competitions

arXiv 2025

2025

Q-Eval-100K: Evaluating Visual Quality and Alignment Level for Text-to-Vision Content

CVPR 2025 1

2025

BlobCtrl: A Unified and Flexible Framework for Element-level Image Generation and Editing

arXiv 2025

2025

StereoCrafter: Diffusion-based Generation of Long and High-fidelity Stereoscopic 3D from Monocular Videos

arXiv 2024

2024

ViewCrafter: Taming Video Diffusion Models for High-fidelity Novel View Synthesis

arXiv 2024

2024

CV-VAE: A Compatible Video VAE for Latent Generative Video Models

arXiv 2024

2024

NVComposer: Boosting Generative Novel View Synthesis with Multiple Sparse and Unposed Images

CVPR 2025 1

2024

DiTCtrl: Exploring Attention Control in Multi-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Longer Video Generation

CVPR 2025 1

2024

DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos

CVPR 2025 1

2024

Grams: Gradient Descent with Adaptive Momentum Scaling

arXiv 2024

2024

Text2NeRF: Text-Driven 3D Scene Generation with Neural Radiance Fields

arXiv 2023

2023

UV Volumes for Real-time Rendering of Editable Free-view Human Performance

CVPR 2023 1

2022

Affiliations

No known affiliations.

Frequent co-authors

10

from 24 papers