
How Much Progress Has There Been in NVIDIA Datacenter GPUs?


Year: 2026
Full text: arxiv.org/abs/2601.20115 (CC-BY-4.0)

Abstract

As modern Graphics Processing Units (GPUs) become increasingly essential for a range of computing tasks, analyzing their past and current progress is paramount for determining future constraints on scientific research. This is particularly compelling in the Artificial Intelligence (AI) domain, where rapid technological advancements and fierce global competition recently led the United States to implement export control regulations limiting international access to advanced AI chips. Consequently, this paper examines technical progress in NVIDIA datacenter GPUs from the mid-2000s through 2025. Our main results identify doubling times of 1.43 and 1.67 years for FP16 and FP32 dense operations, respectively, while FP64 doubling times range from 2.05 to 3.79 years. Off-chip memory size and bandwidth have grown more slowly than computing performance, doubling every 3.29 to 3.41 years, whereas release prices and power consumption doubled roughly every 5.03 and 15 years, respectively. Moreover, our cross-vendor comparison of the top-performing GPUs per year shows that NVIDIA's performance advantage is narrowing, but not enough to compel a major market shift. Finally, we quantify the potential implications of current U.S. export control regulations and the consequent performance gaps, which the recently proposed policy changes could shrink from 23.6X to 3.54X.
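
To make the doubling times above easier to compare, the following minimal sketch converts a doubling time into an equivalent annual growth factor, assuming steady exponential growth (a quantity that doubles every T years grows by a factor of 2^(1/T) per year). The function names and the dictionary are illustrative and do not come from the paper; the numeric inputs are the doubling times quoted in the abstract, and the paper's own fitting procedure is not reproduced here.

```python
import math


def annual_growth_factor(doubling_time_years: float) -> float:
    """Growth factor per year implied by a doubling time, assuming exponential growth."""
    return 2.0 ** (1.0 / doubling_time_years)


def doubling_time_from_growth(factor_per_year: float) -> float:
    """Inverse of annual_growth_factor: years needed to double at a given yearly factor."""
    return math.log(2.0) / math.log(factor_per_year)


def growth_over(doubling_time_years: float, span_years: float) -> float:
    """Total growth factor accumulated over span_years."""
    return 2.0 ** (span_years / doubling_time_years)


# Doubling times (in years) as quoted in the abstract; the labels are illustrative.
ABSTRACT_DOUBLING_TIMES = {
    "FP16 dense operations": 1.43,
    "FP32 dense operations": 1.67,
    "release price": 5.03,
    "power consumption": 15.0,
}

if __name__ == "__main__":
    for name, t in ABSTRACT_DOUBLING_TIMES.items():
        factor = annual_growth_factor(t)
        print(f"{name}: doubles every {t} years, about {100 * (factor - 1):.1f}% per year")
```

Under this assumption, a 1.43-year doubling time corresponds to growth of roughly 62% per year, while a 15-year doubling time corresponds to under 5% per year.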