Bryan Catanzaro
VP of Applied Deep Learning Research at NVIDIA; led Megatron-LM and many NVIDIA training-stack contributions.
- Role: researcher
- Currently at: NVIDIA
- Scholar: scholar.google.com/citations
- Papers: 25
Authored papers (25)
C-RADIOv4 (Tech Report)
arXiv 2026
Llama-Nemotron: Efficient Reasoning Models
arXiv 2025
Nemotron 3 Nano: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning
arXiv 2025
Audio Flamingo 2: An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Abilities
arXiv 2025
Eagle 2.5: Boosting Long-Context Post-Training for Frontier Vision-Language Models
arXiv 2025
A2SB: Audio-to-Audio Schrodinger Bridges
arXiv 2025
FeatSharp: Your Vision Model Features, Sharper
arXiv 2025
OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM
arXiv 2025
RLP: Reinforcement as a Pretraining Objective
arXiv 2025
Pretraining Large Language Models with NVFP4
arXiv 2025
An Empirical Study of Mamba-based Language Models
arXiv 2024
Nemotron-CC: Transforming Common Crawl into a Refined Long-Horizon Pretraining Dataset
arXiv 2024
Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities
arXiv 2024
RADIO Amplified: Improved Baselines for Agglomerative Vision Foundation Models
arXiv 2024
TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching and Clap-Ranked Preference Optimization
arXiv 2024
Nemotron-4 340B Technical Report
arXiv 2024
Synthio: Augmenting Small-Scale Audio Classification Datasets with Synthetic Data
arXiv 2024
Compact Language Models via Pruning and Knowledge Distillation
arXiv 2024
RAVEN: In-Context Learning with Retrieval-Augmented Encoder-Decoder Language Models
arXiv 2023
BigVGAN: A Universal Neural Vocoder with Large-Scale Training
arXiv 2022
Speech Denoising in the Waveform Domain with Self-Attention
arXiv 2022
Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis
ICLR 2021
Partial Convolution based Padding
arXiv 2018
High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs
CVPR 2018
Deep Speech 2: End-to-End Speech Recognition in English and Mandarin
arXiv 2015
Affiliations
Frequent co-authors
10 from 25 papers