Qian Zhang

Papers
21

Attribution

Affiliations & profile
Semantic Scholar

Authored papers

21

HorizonStream: Long-Horizon Attention for Streaming 3D Reconstruction

arXiv 2026

2026

EvoClaw: Evaluating AI Agents on Continuous Software Evolution

arXiv 2026

2026

RAD-2: Scaling Reinforcement Learning in a Generator-Discriminator Framework

arXiv 2026

2026

AlphaDrive: Unleashing the Power of VLMs in Autonomous Driving via Reinforcement Learning and Reasoning

arXiv 2025

2025

Epona: Autoregressive Diffusion World Model for Autonomous Driving

ICCV 2025

2025

MGVQ: Could VQ-VAE Beat VAE? A Generalizable Tokenizer with Multi-group Quantization

arXiv 2025

2025

Multimodal Mamba: Decoder-only Multimodal State Space Model via Quadratic to Linear Distillation

arXiv 2025

2025

ScEdit: Script-based Assessment of Knowledge Editing

arXiv 2025

2025

GoalFlow: Goal-Driven Flow Matching for Multimodal Trajectories Generation in End-to-End Autonomous Driving

CVPR 2025 1

2025

OmniMamba: Efficient and Unified Multimodal Understanding and Generation via State Space Models

arXiv 2025

2025

InfiniteVL: Synergizing Linear and Sparse Attention for Highly-Efficient, Unlimited-Input Vision-Language Models

arXiv 2025

2025

DiffusionDrive: Truncated Diffusion Model for End-to-End Autonomous Driving

CVPR 2025 1

2024

Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model

arXiv 2024

2024

VADv2: End-to-End Vectorized Autonomous Driving via Probabilistic Planning

arXiv 2024

2024

Senna: Bridging Large Vision-Language Models and End-to-End Autonomous Driving

arXiv 2024

2024

OccRWKV: Rethinking Efficient 3D Semantic Occupancy Prediction with Linear Complexity

arXiv 2024

2024

Boost 3D Reconstruction using Diffusion-based Monocular Camera Calibration

ICCV 2025

2024

ComDrive: Comfort-Oriented End-to-End Autonomous Driving

arXiv 2024

2024

DrivingWorld: Constructing World Model for Autonomous Driving via Video GPT

arXiv 2024

2024

VAR-CLIP: Text-to-Image Generator with Visual Auto-Regressive Modeling

arXiv 2024

2024

ViG: Linear-complexity Visual Sequence Learning with Gated Linear Attention

arXiv 2024

2024

Affiliations

No known affiliations.

Frequent co-authors

10

from 21 papers