
Anima Anandkumar

Caltech professor and former NVIDIA director of ML research; pioneer in tensor methods and AI for science.

Role: professor
Currently at: Independent
Papers: 38


Authored papers

38

Operator Learning Using Weak Supervision from Walk-on-Spheres

arXiv 2026

2026

R-KV: Redundancy-aware KV Cache Compression for Training-Free Reasoning Models Acceleration

arXiv 2025

2025

Guided Diffusion Sampling on Function Spaces with Applications to PDEs

arXiv 2025

2025

HeadInfer: Memory-Efficient LLM Inference by Head-wise Offloading

arXiv 2025

2025

Robust Representation Consistency Model via Contrastive Denoising

arXiv 2025

2025

Principled Approaches for Extending Neural Architectures to Function Spaces for Operator Learning

arXiv 2025

2025

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

arXiv 2024

2024

A Unified Model for Compressed Sensing MRI Across Undersampling Patterns

CVPR 2025 1

2024

LeanAgent: Lifelong Learning for Formal Theorem Proving

arXiv 2024

2024

Pretraining Codomain Attention Neural Operators for Solving Multiphysics PDEs

arXiv 2024

2024

Mini-Sequence Transformer: Optimizing Intermediate Memory for Long Sequences Training

arXiv 2024

2024

CARE: a Benchmark Suite for the Classification and Retrieval of Enzymes

arXiv 2024

2024

Fully Attentional Networks with Self-emerging Token Labeling


2024

ChatGPT Based Data Augmentation for Improved Parameter-Efficient Debiasing of LLMs

arXiv 2024

2024

Multi-Modal Self-Supervised Learning for Surgical Feedback Effectiveness Assessment

arXiv 2024

2024

Improving Diffusion Inverse Problem Solving with Decoupled Noise Annealing

CVPR 2025 1

2024

T-Stitch: Accelerating Sampling in Pre-Trained Diffusion Models with Trajectory Stitching

arXiv 2024

2024

DPOT: Auto-Regressive Denoising Operator Transformer for Large-Scale PDE Pre-Training

arXiv 2024

2024

Automating Feedback Analysis in Surgical Training: Detection, Categorization, and Assessment

arXiv 2024

2024

Voyager: An Open-Ended Embodied Agent with Large Language Models

arXiv 2023

2023

LeanDojo: Theorem Proving with Retrieval-Augmented Language Models


2023

Eureka: Human-Level Reward Design via Coding Large Language Models

arXiv 2023

2023

FB-BEV: BEV Representation from Forward-Backward View Transformations

ICCV 2023 1

2023

VoxFormer: Sparse Voxel Transformer for Camera-based 3D Semantic Scene Completion

CVPR 2023 1

2023

Prismer: A Vision-Language Model with Multi-Task Experts

arXiv 2023

2023

Provable and Practical: Efficient Exploration in Reinforcement Learning via Langevin Monte Carlo

arXiv 2023

2023

A Text-guided Protein Design Framework

arXiv 2023

2023

MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge

arXiv 2022

2022

Multi-modal Molecule Structure-text Model for Text-based Retrieval and Editing

arXiv 2022

2022

Diffusion Models for Adversarial Purification

arXiv 2022

2022

Fast Sampling of Diffusion Models via Operator Learning

arXiv 2022

2022

VIMA: General Robot Manipulation with Multimodal Prompts

arXiv 2022

2022

Test-Time Prompt Tuning for Zero-Shot Generalization in Vision-Language Models

arXiv 2022

2022

Bongard-HOI: Benchmarking Few-Shot Visual Reasoning for Human-Object Interactions

CVPR 2022 1

2022

SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers

NeurIPS 2021 12

2021

Neural Operator: Learning Maps Between Function Spaces

arXiv 2021

2021

Panoptic SegFormer: Delving Deeper into Panoptic Segmentation with Transformers

CVPR 2022 1

2021

ZerO Initialization: Initializing Neural Networks with only Zeros and Ones


2021

Affiliations

Currently at

Independent

professor · community

Previously

Frequent co-authors

10

from 38 papers