Mohammad Shoeybi
9 papers
Authored papers
Nemotron 3 Nano: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning
arXiv 2025
Pretraining Large Language Models with NVFP4
arXiv 2025
RLP: Reinforcement as a Pretraining Objective
arXiv 2025
An Empirical Study of Mamba-based Language Models
arXiv 2024
Nemotron-CC: Transforming Common Crawl into a Refined Long-Horizon Pretraining Dataset
arXiv 2024
Nemotron-4 340B Technical Report
arXiv 2024
Compact Language Models via Pruning and Knowledge Distillation
arXiv 2024
VILA: On Pre-training for Visual Language Models
CVPR 2024
RAVEN: In-Context Learning with Retrieval-Augmented Encoder-Decoder Language Models
arXiv 2023
Affiliations
No known affiliations.
Frequent co-authors
10 from 9 papers