Xiaohui Shen
Papers: 12
Authored papers
Vidi: Large Multimodal Models for Video Understanding and Editing
arXiv 2025
Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generation
arXiv 2025
FlowAR: Scale-wise Autoregressive Image Generation Meets Flow Matching
arXiv 2024
ViTamin: Designing Scalable Vision Models in the Vision-Language Era
CVPR 2024
MaskBit: Embedding-free Image Generation via Bit Tokens
arXiv 2024
Alleviating Distortion in Image Generation via Multi-Resolution Diffusion Models and Time-Dependent Layer Normalization
arXiv 2024
A Simple Video Segmenter by Tracking Objects Along Axial Trajectories
arXiv 2023
Towards Open-Ended Visual Recognition with Large Language Model
arXiv 2023
Convolutions Die Hard: Open-Vocabulary Segmentation with Single Frozen Convolutional CLIP
Adversarial Open Domain Adaptation for Sketch-to-Photo Synthesis
arXiv 2021
EnlightenGAN: Deep Light Enhancement without Paired Supervision
arXiv 2019
Free-Form Image Inpainting with Gated Convolution
Frequent co-authors: 10, from 12 papers