Alan Yuille

Papers
50

Authored papers

50

A Very Big Video Reasoning Suite

arXiv 2026

2026

LychSim: A Controllable and Interactive Simulation Framework for Vision Research

arXiv 2026

2026

CoPE: Clipped RoPE as A Scalable Free Lunch for Long Context LLMs

arXiv 2026

2026

Meissa: Multi-modal Medical Agentic Intelligence

arXiv 2026

2026

RadGPT: Constructing 3D Image-Text Tumor Datasets

ICCV 2025

2025

Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generation

arXiv 2025

2025

Medical World Model: Generative Simulation of Tumor Evolution for Treatment Planning

arXiv 2025

2025

Play to Generalize: Learning to Reason Through Game Play

arXiv 2025

2025

Are Vision Language Models Ready for Clinical Diagnosis? A 3D Medical Benchmark for Tumor-centric Visual Question Answering

arXiv 2025

2025

PhyGDPO: Physics-Aware Groupwise Direct Preference Optimization for Physically Consistent Text-to-Video Generation

arXiv 2025

2025

World-in-World: World Models in a Closed-Loop World

arXiv 2025

2025

Vision-Language-Vision Auto-Encoder: Scalable Knowledge Distillation from Diffusion Models

arXiv 2025

2025

4D-Animal: Freely Reconstructing Animatable 3D Animals from Videos

arXiv 2025

2025

Scaling Laws in Patchification: An Image Is Worth 50,176 Tokens And More

arXiv 2025

2025

Spatial457: A Diagnostic Benchmark for 6D Spatial Reasoning of Large Multimodal Models

arXiv 2025

2025

EigenLoRAx: Recycling Adapters to Find Principal Subspaces for Resource-Efficient Adaptation and Inference

arXiv 2025

2025

HDR-GS: Efficient High Dynamic Range Novel View Synthesis at 1000x Speed via Gaussian Splatting

arXiv 2024

2024

FlowAR: Scale-wise Autoregressive Image Generation Meets Flow Matching

arXiv 2024

2024

Generative World Explorer

arXiv 2024

2024

Text-Driven Tumor Synthesis

arXiv 2024

2024

Label Critic: Design Data Before Models

arXiv 2024

2024

Radiative Gaussian Splatting for Efficient X-ray Novel View Synthesis

arXiv 2024

2024

ViTamin: Designing Scalable Vision Models in the Vision-Language Era

CVPR 2024

2024

Autoregressive Pretraining with Mamba in Vision

arXiv 2024

2024

M-VAR: Decoupled Scale-wise Autoregressive Modeling for High-Quality Image Generation

arXiv 2024

2024

Efficient Large Multi-modal Models via Visual Context Compression

arXiv 2024

2024

ImageNet3D: Towards General-Purpose Object-Level 3D Understanding

arXiv 2024

2024

Compositional 4D Dynamic Scenes Understanding with Physics Priors for Video Question Answering

arXiv 2024

2024

iNeMo: Incremental Neural Mesh Models for Robust Class-Incremental Learning

arXiv 2024

2024

A Bayesian Approach to OOD Robustness in Image Classification

CVPR 2024

2024

Label-Free Liver Tumor Segmentation

CVPR 2023

2023

CLIP-Driven Universal Model for Organ Segmentation and Tumor Detection

ICCV 2023

2023

Sequential Modeling Enables Scalable Learning for Large Vision Models

CVPR 2024

2023

Rejuvenating image-GPT as Strong Visual Representation Learners

arXiv 2023

2023

A Simple Video Segmenter by Tracking Objects Along Axial Trajectories

arXiv 2023

2023

NOVUM: Neural Object Volumes for Robust Object Classification

arXiv 2023

2023

3D-Aware Neural Body Fitting for Occlusion Robust 3D Human Pose Estimation

ICCV 2023

2023

PoseExaminer: Automated Testing of Out-of-Distribution Robustness in Human Pose and Shape Estimation

CVPR 2023

2023

Delving into Masked Autoencoders for Multi-Label Thorax Disease Classification

arXiv 2022

2022

Unleashing the Power of Visual Prompting At the Pixel Level

arXiv 2022

2022

Masked Autoencoders Enable Efficient Knowledge Distillers

CVPR 2023

2022

SwapMix: Diagnosing and Regularizing the Over-Reliance on Visual Context in Visual Question Answering

CVPR 2022

2022

Masked Feature Prediction for Self-Supervised Visual Pre-Training

CVPR 2022

2021

iBOT: Image BERT Pre-Training with Online Tokenizer

arXiv 2021

2021

TransMix: Attend to Mix for Vision Transformers

CVPR 2022

2021

PartImageNet: A Large, High-Quality Dataset of Parts

arXiv 2021

2021

DetectoRS: Detecting Objects with Recursive Feature Pyramid and Switchable Atrous Convolution

2020

Micro-Batch Training with Batch-Channel Normalization and Weight Standardization

arXiv 2019

2019

Adversarial Attacks and Defences Competition

arXiv 2018

2018

Generation and Comprehension of Unambiguous Object Descriptions

2015

Affiliations

No known affiliations.
