Yang Zhou

Papers
40

Attribution

Affiliations & profile
Semantic Scholar

Authored papers

40

WorldCam: Interactive Autoregressive 3D Gaming Worlds with Camera Pose as a Unifying Geometric Representation

arXiv 2026

2026

Expert-Choice Routing Enables Adaptive Computation in Diffusion Language Models

arXiv 2026

2026

RubricHub: A Comprehensive and Highly Discriminative Rubric Dataset via Automated Coarse-to-Fine Generation

arXiv 2026

2026

DrivingGen: A Comprehensive Benchmark for Generative Video World Models in Autonomous Driving

arXiv 2026

2026

DriveDreamer-Policy: A Geometry-Grounded World-Action Model for Unified Generation and Planning

arXiv 2026

2026

NextFlow: Unified Sequential Modeling Activates Multimodal Understanding and Generation

arXiv 2026

2026

EditCtrl: Disentangled Local and Global Control for Real-Time Generative Video Editing

arXiv 2026

2026

Generative AI for Autonomous Driving: Frontiers and Opportunities

arXiv 2025

2025

Attention Distillation: A Unified Approach to Visual Characteristics Transfer

CVPR 2025

2025

Kinetics: Rethinking Test-Time Scaling Laws

arXiv 2025

2025

SeRL: Self-Play Reinforcement Learning for Large Language Models with Limited Data

arXiv 2025

2025

LangCoop: Collaborative Driving with Language

arXiv 2025

2025

$π^3$: Scalable Permutation-Equivariant Visual Geometry Learning

arXiv 2025

2025

Alchemist: Unlocking Efficiency in Text-to-Image Model Training via Meta-Gradient Data Selection

arXiv 2025

2025

M3-Bench: Multi-Modal, Multi-Hop, Multi-Threaded Tool-Using MLLM Agent Benchmark

arXiv 2025

2025

Efficient Training of Diffusion Mixture-of-Experts Models: A Practical Recipe

arXiv 2025

2025

Breaking the Exploration Bottleneck: Rubric-Scaffolded Reinforcement Learning for General LLM Reasoning

arXiv 2025

2025

OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling

arXiv 2025

2025

Aether: Geometric-Aware Unified World Modeling

ICCV 2025

2025

WinT3R: Window-Based Streaming Reconstruction with Camera Token Pool

arXiv 2025

2025

VeriGUI: Verifiable Long-Chain GUI Dataset

arXiv 2025

2025

LLM Inference Unveiled: Survey and Roofline Model Insights

arXiv 2024

2024

MagicPIG: LSH Sampling for Efficient LLM Generation

arXiv 2024

2024

The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio

arXiv 2024

2024

OpenEMMA: Open-Source Multimodal Model for End-to-End Autonomous Driving

arXiv 2024

2024

Progressive Autoregressive Video Diffusion Models

arXiv 2024

2024

CAS-ViT: Convolutional Additive Self-attention Vision Transformers for Efficient Mobile Applications

arXiv 2024

2024

AutoTrust: Benchmarking Trustworthiness in Large Vision Language Models for Autonomous Driving

arXiv 2024

2024

BenchX: A Unified Benchmark Framework for Medical Vision-Language Pretraining on Chest X-Rays

arXiv 2024

2024

Region Attention Transformer for Medical Image Restoration

arXiv 2024

2024

Sirius: Contextual Sparsity with Correction for Efficient LLMs

arXiv 2024

2024

UrFound: Towards Universal Retinal Foundation Models via Knowledge-Guided Masked Modeling

arXiv 2024

2024

From Generalist to Specialist: Adapting Vision Language Models via Task-Specific Visual Instruction Tuning

arXiv 2024

2024

Self-Evolutionary Large Language Models through Uncertainty-Enhanced Preference Optimization

arXiv 2024

2024

ContactGen: Generative Contact Modeling for Grasp Generation

ICCV 2023

2023

Learning Navigational Visual Representations with Semantic Map Supervision

ICCV 2023

2023

DRMC: A Generalist Model with Dynamic Routing for Multi-Center PET Image Synthesis

arXiv 2023

2023

Modular Degradation Simulation and Restoration for Under-Display Camera

arXiv 2022

2022

A Survey on Explainable Reinforcement Learning: Concepts, Algorithms, Challenges

arXiv 2022

2022

Rethinking Performance Gains in Image Dehazing Networks

arXiv 2022

2022

Affiliations

No known affiliations.

Frequent co-authors

10

from 40 papers