Alan Yuille
- Papers: 50
Authored papers
A Very Big Video Reasoning Suite
arXiv 2026
LychSim: A Controllable and Interactive Simulation Framework for Vision Research
arXiv 2026
CoPE: Clipped RoPE as A Scalable Free Lunch for Long Context LLMs
arXiv 2026
Meissa: Multi-modal Medical Agentic Intelligence
arXiv 2026
RadGPT: Constructing 3D Image-Text Tumor Datasets
ICCV 2025
Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generation
arXiv 2025
Medical World Model: Generative Simulation of Tumor Evolution for Treatment Planning
arXiv 2025
Play to Generalize: Learning to Reason Through Game Play
arXiv 2025
Are Vision Language Models Ready for Clinical Diagnosis? A 3D Medical Benchmark for Tumor-centric Visual Question Answering
arXiv 2025
PhyGDPO: Physics-Aware Groupwise Direct Preference Optimization for Physically Consistent Text-to-Video Generation
arXiv 2025
World-in-World: World Models in a Closed-Loop World
arXiv 2025
Vision-Language-Vision Auto-Encoder: Scalable Knowledge Distillation from Diffusion Models
arXiv 2025
4D-Animal: Freely Reconstructing Animatable 3D Animals from Videos
arXiv 2025
Scaling Laws in Patchification: An Image Is Worth 50,176 Tokens And More
arXiv 2025
Spatial457: A Diagnostic Benchmark for 6D Spatial Reasoning of Large Multimodal Models
arXiv 2025
EigenLoRAx: Recycling Adapters to Find Principal Subspaces for Resource-Efficient Adaptation and Inference
arXiv 2025
HDR-GS: Efficient High Dynamic Range Novel View Synthesis at 1000x Speed via Gaussian Splatting
arXiv 2024
FlowAR: Scale-wise Autoregressive Image Generation Meets Flow Matching
arXiv 2024
Generative World Explorer
arXiv 2024
Text-Driven Tumor Synthesis
arXiv 2024
Label Critic: Design Data Before Models
arXiv 2024
Radiative Gaussian Splatting for Efficient X-ray Novel View Synthesis
arXiv 2024
ViTamin: Designing Scalable Vision Models in the Vision-Language Era
CVPR 2024
Autoregressive Pretraining with Mamba in Vision
arXiv 2024
M-VAR: Decoupled Scale-wise Autoregressive Modeling for High-Quality Image Generation
arXiv 2024
Efficient Large Multi-modal Models via Visual Context Compression
arXiv 2024
ImageNet3D: Towards General-Purpose Object-Level 3D Understanding
arXiv 2024
Compositional 4D Dynamic Scenes Understanding with Physics Priors for Video Question Answering
arXiv 2024
iNeMo: Incremental Neural Mesh Models for Robust Class-Incremental Learning
arXiv 2024
A Bayesian Approach to OOD Robustness in Image Classification
CVPR 2024
Label-Free Liver Tumor Segmentation
CVPR 2023
CLIP-Driven Universal Model for Organ Segmentation and Tumor Detection
ICCV 2023
Sequential Modeling Enables Scalable Learning for Large Vision Models
CVPR 2024
Rejuvenating image-GPT as Strong Visual Representation Learners
arXiv 2023
A Simple Video Segmenter by Tracking Objects Along Axial Trajectories
arXiv 2023
NOVUM: Neural Object Volumes for Robust Object Classification
arXiv 2023
3D-Aware Neural Body Fitting for Occlusion Robust 3D Human Pose Estimation
ICCV 2023
PoseExaminer: Automated Testing of Out-of-Distribution Robustness in Human Pose and Shape Estimation
CVPR 2023
Delving into Masked Autoencoders for Multi-Label Thorax Disease Classification
arXiv 2022
Unleashing the Power of Visual Prompting At the Pixel Level
arXiv 2022
Masked Autoencoders Enable Efficient Knowledge Distillers
CVPR 2023
SwapMix: Diagnosing and Regularizing the Over-Reliance on Visual Context in Visual Question Answering
CVPR 2022
Masked Feature Prediction for Self-Supervised Visual Pre-Training
CVPR 2022
iBOT: Image BERT Pre-Training with Online Tokenizer
arXiv 2021
TransMix: Attend to Mix for Vision Transformers
CVPR 2022
PartImageNet: A Large, High-Quality Dataset of Parts
arXiv 2021
DetectoRS: Detecting Objects with Recursive Feature Pyramid and Switchable Atrous Convolution
Micro-Batch Training with Batch-Channel Normalization and Weight Standardization
arXiv 2019
Adversarial Attacks and Defences Competition
arXiv 2018
Generation and Comprehension of Unambiguous Object Descriptions