Ilya Loshchilov
4 papers
Authored papers
Nemotron 3 Nano: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning
arXiv 2025
Decoupled Weight Decay Regularization
A Downsampled Variant of ImageNet as an Alternative to the CIFAR datasets
arXiv 2017
SGDR: Stochastic Gradient Descent with Warm Restarts
arXiv 2016
Affiliations
No known affiliations.
Frequent co-authors
10 from 4 papers