Zhuowen Tu
- Papers: 20
Authored papers (20)
PixARMesh: Autoregressive Mesh-Native Single-View Scene Reconstruction
arXiv 2026
Talk2Move: Reinforcement Learning for Text-Instructed Object-Level Geometric Transformation in Scenes
arXiv 2026
Goldfish: Monolingual Language Models for 350 Languages
arXiv 2024
VideoNSA: Native Sparse Attention Scales Video Understanding
arXiv 2025
Lay-Your-Scene: Natural Scene Layout Generation with Diffusion Transformers
arXiv 2025
TokenCompose: Text-to-Image Diffusion with Token-level Supervision
Single-Stage Diffusion NeRF: A Unified Approach to 3D Generation and Reconstruction
ICCV 2023 1
BLIVA: A Simple Multimodal LLM for Better Handling of Text-Rich Visual Questions
arXiv 2023
Distilling Large Vision-Language Model with Out-of-Distribution Generalizability
ICCV 2023 1
Patched Denoising Diffusion Models For High-Resolution Image Synthesis
arXiv 2023
Object-Centric Multiple Object Tracking
ICCV 2023 1
When Is Multilinguality a Curse? Language Modeling for 250 High- and Low-Resource Languages
arXiv 2023
Open-Vocabulary Universal Image Segmentation with MaskCLIP
arXiv 2022
On the Feasibility of Cross-Task Transfer with Model-Based Reinforcement Learning
arXiv 2022
Text Spotting Transformers
CVPR 2022 1
Co-Scale Conv-Attentional Image Transformers
ICCV 2021 10
Pose Recognition with Cascade Transformers
CVPR 2021 1
ViTGAN: Training GANs with Vision Transformers
Aggregated Residual Transformations for Deep Neural Networks
Holistically-Nested Edge Detection
Frequent co-authors
10 from 20 papers