Bryan Catanzaro

VP of Applied Deep Learning Research at NVIDIA; led Megatron-LM and many other contributions to NVIDIA's training stack.

Role
researcher
Currently at
NVIDIA
Papers
25

Authored papers

25

C-RADIOv4 (Tech Report)

arXiv 2026

2026

Llama-Nemotron: Efficient Reasoning Models

arXiv 2025

2025

Nemotron 3 Nano: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning

arXiv 2025

2025

Audio Flamingo 2: An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Abilities

arXiv 2025

2025

Eagle 2.5: Boosting Long-Context Post-Training for Frontier Vision-Language Models

arXiv 2025

2025

A2SB: Audio-to-Audio Schrodinger Bridges

arXiv 2025

2025

FeatSharp: Your Vision Model Features, Sharper

arXiv 2025

2025

OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM

arXiv 2025

2025

RLP: Reinforcement as a Pretraining Objective

arXiv 2025

2025

Pretraining Large Language Models with NVFP4

arXiv 2025

2025

An Empirical Study of Mamba-based Language Models

arXiv 2024

2024

Nemotron-CC: Transforming Common Crawl into a Refined Long-Horizon Pretraining Dataset

arXiv 2024

2024

Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities

arXiv 2024

2024

RADIO Amplified: Improved Baselines for Agglomerative Vision Foundation Models

arXiv 2024

2024

TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching and Clap-Ranked Preference Optimization

arXiv 2024

2024

Nemotron-4 340B Technical Report

arXiv 2024

2024

Synthio: Augmenting Small-Scale Audio Classification Datasets with Synthetic Data

arXiv 2024

2024

Compact Language Models via Pruning and Knowledge Distillation

arXiv 2024

2024

RAVEN: In-Context Learning with Retrieval-Augmented Encoder-Decoder Language Models

arXiv 2023

2023

BigVGAN: A Universal Neural Vocoder with Large-Scale Training

arXiv 2022

2022

Speech Denoising in the Waveform Domain with Self-Attention

arXiv 2022

2022

Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis

ICLR 2021

2020

Partial Convolution based Padding

arXiv 2018

2018

High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs

CVPR 2018

2017

Deep Speech 2: End-to-End Speech Recognition in English and Mandarin

arXiv 2015

2015

Affiliations

Currently at

NVIDIA

researcher · infra

Frequent co-authors

10

from 25 papers