Nvidia plans to build a new supercomputer in the US based on its fourth-generation DGX system, the DGX H100.
The Eos supercomputer will feature 18 ‘SuperPods,’ each of which includes 32 DGX H100 systems.
Those 576 DGX H100 systems include 4,608 of Nvidia’s new H100 GPUs (eight per system), 500 Quantum-2 InfiniBand switches, and 360 NVLink switches.
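The hardware counts above are easy to sanity-check. A quick sketch, taking the SuperPod figures from Nvidia's announcement and assuming its published spec of eight H100 GPUs per DGX H100 system:

```python
# Back-of-the-envelope check of the Eos topology.
# SuperPod and per-pod counts are from Nvidia's announcement;
# GPUS_PER_DGX assumes the published DGX H100 spec (8 GPUs per system).

SUPERPODS = 18
DGX_PER_SUPERPOD = 32
GPUS_PER_DGX = 8

dgx_systems = SUPERPODS * DGX_PER_SUPERPOD   # total DGX H100 systems
h100_gpus = dgx_systems * GPUS_PER_DGX       # total H100 GPUs

print(dgx_systems, h100_gpus)  # 576 4608
```

The figures line up: 18 × 32 gives the 576 DGX systems, and 576 × 8 gives the 4,608 GPUs cited in the announcement.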
“Eos will offer an incredible 18 exaflops of AI performance, and we expect it to be the world’s fastest AI supercomputer when it’s deployed,” Paresh Kharya, senior director of product management and marketing at Nvidia, said.
Depending on the benchmark, the current world’s fastest AI supercomputer is the Department of Energy’s Perlmutter supercomputer. Capable of four exaflops of AI performance, it features 6,159 Nvidia A100 GPUs and 1,536 AMD Epyc CPUs.
Meta (formerly Facebook) is currently building what it expects to be the world’s fastest AI supercomputer, the AI Research SuperCluster (RSC), based on the third-generation Nvidia DGX A100 system.
But Eos will outperform both Perlmutter and the RSC, as well as Nvidia’s existing Selene supercomputer, which will be retired.
Nvidia said that for “traditional scientific computing,” Eos is capable of 275 petaflops of performance. Assuming Nvidia is referring to the standard Linpack benchmark, that puts it behind the 442-petaflop Fugaku supercomputer, officially the world’s fastest.
However, China is believed to have secretly launched two exascale supercomputers last year.
Later this year, the US expects to launch two systems capable of more than an exaflop of performance – under the Linpack benchmark, not the AI benchmark used by Meta.
The first, Frontier, is expected to be capable of more than 1.5 exaflops, and will feature 9,000 AMD Epyc CPUs and 36,000 AMD Radeon Instinct MI200 GPUs.
It will be followed by Aurora, an oft-delayed system that could exceed 2 exaflops. It will boast 18,000 Intel Xeon Sapphire Rapids CPUs, and 54,000 Intel Xe GPUs.