Bryan Catanzaro

VP of Applied Deep Learning Research at NVIDIA; led Megatron-LM and many contributions to NVIDIA's training stack.

Role: researcher
Currently at: NVIDIA
Scholar: scholar.google.com/citations
Papers: 25

Attribution

Affiliations & profile: scholar.google.com/citations

Authored papers

25

C-RADIOv4 (Tech Report)

arXiv 2026

Llama-Nemotron: Efficient Reasoning Models

arXiv 2025

Nemotron 3 Nano: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning

arXiv 2025

Audio Flamingo 2: An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Abilities

arXiv 2025

Eagle 2.5: Boosting Long-Context Post-Training for Frontier Vision-Language Models

arXiv 2025

A2SB: Audio-to-Audio Schrodinger Bridges

arXiv 2025

FeatSharp: Your Vision Model Features, Sharper

arXiv 2025

Pretraining Large Language Models with NVFP4

arXiv 2025

OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM

arXiv 2025

RLP: Reinforcement as a Pretraining Objective

arXiv 2025

An Empirical Study of Mamba-based Language Models

arXiv 2024

Nemotron-CC: Transforming Common Crawl into a Refined Long-Horizon Pretraining Dataset

arXiv 2024

Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities

arXiv 2024

RADIO Amplified: Improved Baselines for Agglomerative Vision Foundation Models

arXiv 2024

TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching and Clap-Ranked Preference Optimization

arXiv 2024

Nemotron-4 340B Technical Report

arXiv 2024

Synthio: Augmenting Small-Scale Audio Classification Datasets with Synthetic Data

arXiv 2024

Compact Language Models via Pruning and Knowledge Distillation

arXiv 2024

RAVEN: In-Context Learning with Retrieval-Augmented Encoder-Decoder Language Models

arXiv 2023

BigVGAN: A Universal Neural Vocoder with Large-Scale Training

arXiv 2022

Speech Denoising in the Waveform Domain with Self-Attention

arXiv 2022

Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis

ICLR 2021

Partial Convolution based Padding

arXiv 2018

High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs

CVPR 2018

Deep Speech 2: End-to-End Speech Recognition in English and Mandarin

arXiv 2015

Affiliations

Currently at: NVIDIA (researcher · infra)

Frequent co-authors

10

from 25 papers

Jan Kautz

10 shared papers

Andrew Tao

8 shared papers

Mohammad Shoeybi

8 shared papers

Mostofa Patwary

7 shared papers

Rafael Valle

7 shared papers

Wei Ping

7 shared papers

Pavlo Molchanov

6 shared papers

Zhifeng Kong

6 shared papers

Deepak Narayanan

5 shared papers

Sanjeev Satheesh

5 shared papers