NVIDIA Builds AI Supercomputer DGX GH200

Yana Khare 02 Jun, 2023


NVIDIA, a leader in artificial intelligence (AI) technology, has announced the NVIDIA DGX GH200, a new AI supercomputer powered by NVIDIA GH200 Grace Hopper Superchips and the NVLink Switch System. The system is aimed at developing large-scale generative AI language models, recommender systems, and data analytics workloads. Let’s look at the details.

The Power of Shared Memory: NVIDIA DGX GH200 Supercomputer


NVIDIA has built an AI supercomputer with a massive shared memory space. Using NVLink interconnect technology and the NVLink Switch System, the DGX GH200 combines 256 GH200 Superchips so that they perform as a single GPU, delivering one exaflop of performance and 144 terabytes of shared memory. Compared with the previous-generation NVIDIA DGX A100, the DGX GH200 offers nearly 500 times more memory, leaving room for much larger AI workloads.
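The memory figures above can be sanity-checked with a little arithmetic. The sketch below is only a back-of-envelope check: the 256 Superchips and 144 terabytes come from the article, while the 320 GB figure for the DGX A100 (8 GPUs with 40 GB each) and the use of binary units (1 TB = 1,024 GB) are assumptions made here for illustration.

```python
# Back-of-envelope check of the memory figures quoted in the article.
SUPERCHIPS = 256            # from the article
TOTAL_SHARED_TB = 144       # from the article
DGX_A100_MEMORY_GB = 320    # assumption: 8 GPUs x 40 GB in the 2020 DGX A100
GB_PER_TB = 1024            # assumption: binary units

total_gb = TOTAL_SHARED_TB * GB_PER_TB
per_superchip_gb = total_gb / SUPERCHIPS
ratio_vs_a100 = total_gb / DGX_A100_MEMORY_GB

print(f"Memory per Superchip: {per_superchip_gb:.0f} GB")
print(f"Memory vs DGX A100:   {ratio_vs_a100:.1f}x")
```

Under these assumptions each Superchip contributes 576 GB, and the ratio to the DGX A100 comes out at roughly 460, which is in the same range as the "nearly 500 times" claim.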

Expanding the AI Frontier: Jensen Huang’s Vision

Jensen Huang, founder and CEO of NVIDIA, sees generative AI, large language models (LLMs), and recommender systems as drivers of the modern economy. He emphasizes that the DGX GH200 integrates NVIDIA’s most advanced accelerated computing and networking technologies to push the boundaries of AI, with the potential to reshape a range of industries.


Grace Hopper Superchips: A Leap in CPU-GPU Interconnect Technology

The GH200 Superchip combines an Arm-based NVIDIA Grace CPU and an NVIDIA H100 Tensor Core GPU in a single package, eliminating the need for a traditional CPU-to-GPU PCIe connection. Using NVLink-C2C chip interconnects, it increases the bandwidth between GPU and CPU sevenfold compared with the latest PCIe technology and cuts interconnect power consumption by more than five times. The result is a 600GB Hopper architecture GPU building block for DGX GH200 supercomputers.
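To see where a sevenfold bandwidth gain could come from, here is a rough calculation. The article gives only the ratio, so both bandwidth numbers below are assumptions based on commonly quoted specifications: about 900 GB/s total for NVLink-C2C and about 128 GB/s bidirectional for a PCIe Gen5 x16 link. The helper `transfer_seconds` is a hypothetical name used only for this illustration.

```python
# Rough comparison of CPU-GPU link bandwidth (figures are assumptions).
NVLINK_C2C_GB_PER_S = 900     # assumption: NVLink-C2C total bandwidth
PCIE_GEN5_X16_GB_PER_S = 128  # assumption: PCIe Gen5 x16, bidirectional


def transfer_seconds(size_gb: float, bandwidth_gb_per_s: float) -> float:
    """Idealized time to move size_gb over a link, ignoring latency and overhead."""
    return size_gb / bandwidth_gb_per_s


speedup = NVLINK_C2C_GB_PER_S / PCIE_GEN5_X16_GB_PER_S
print(f"Bandwidth ratio: {speedup:.2f}x")

# Example: moving a 100 GB working set between CPU and GPU memory.
print(f"PCIe Gen5 x16: {transfer_seconds(100, PCIE_GEN5_X16_GB_PER_S):.3f} s")
print(f"NVLink-C2C:    {transfer_seconds(100, NVLINK_C2C_GB_PER_S):.3f} s")
```

With these assumed figures the ratio is about 7, matching the claim in the article. Real transfers would be slower than this idealized estimate.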


Uniting GPUs as One: NVIDIA DGX GH200 Architecture

The DGX GH200 is the first supercomputer to pair Grace Hopper Superchips with the NVIDIA NVLink Switch System, an interconnect that lets all GPUs in a DGX GH200 system work together as one. With 48 times more NVLink bandwidth than the previous generation, this architecture gives developers the simplicity of programming a single GPU while harnessing the power of a massive AI supercomputer.

Industry Leaders Embrace the Power of DGX GH200

Major technology players such as Google Cloud, Meta, and Microsoft are among the first to gain access to the DGX GH200 and plan to explore its potential for generative AI workloads. NVIDIA also plans to share the DGX GH200 design as a blueprint with cloud service providers and hyperscalers so that they can customize it for their own infrastructure.


NVIDIA Helios: An AI Supercomputer for NVIDIA’s Internal Research

NVIDIA is also building its own AI supercomputer, NVIDIA Helios, based on DGX GH200 technology. Helios will feature four DGX GH200 systems interconnected with NVIDIA Quantum-2 InfiniBand networking to maximize data throughput for training large AI models. With 1,024 Grace Hopper Superchips, Helios will support the work of NVIDIA’s researchers and development teams. The supercomputer is expected to be operational by the end of 2023.
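The Helios totals follow directly from the per-system figures quoted earlier in the article. The short sketch below only multiplies those numbers out; the aggregate memory and performance values are simple sums, not figures published in the article, and the shared memory space described above applies within a single DGX GH200 rather than across the InfiniBand-connected systems.

```python
# Aggregate Helios figures derived from per-system numbers in the article.
SYSTEMS = 4                  # from the article
SUPERCHIPS_PER_SYSTEM = 256  # from the article
MEMORY_TB_PER_SYSTEM = 144   # from the article
EXAFLOPS_PER_SYSTEM = 1      # from the article

total_superchips = SYSTEMS * SUPERCHIPS_PER_SYSTEM
total_memory_tb = SYSTEMS * MEMORY_TB_PER_SYSTEM   # sum across systems, not one shared pool
total_exaflops = SYSTEMS * EXAFLOPS_PER_SYSTEM     # naive sum, ignores scaling losses

print(f"Superchips: {total_superchips}")
print(f"Memory:     {total_memory_tb} TB across {SYSTEMS} systems")
print(f"Peak:       about {total_exaflops} exaflops")
```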

Full-Stack Solution: NVIDIA Software for AI Workloads

The DGX GH200 supercomputers come equipped with NVIDIA software, offering a turnkey, full-stack solution for the most demanding AI and data analytics workloads. NVIDIA Base Command software provides AI workflow management, enterprise-grade cluster management, libraries that accelerate compute, storage, and network infrastructure, and system software optimized for running AI workloads. The package also includes NVIDIA AI Enterprise, which offers a wide range of frameworks, pre-trained models, and development tools to streamline the development and deployment of production AI across various domains.

Our Say

The unveiling of the NVIDIA DGX GH200 AI supercomputer marks a significant milestone in AI technology. With its large shared memory capacity, its CPU-GPU interconnect technology, and its ability to combine GPUs into a single entity, the DGX GH200 opens up new horizons for AI research and applications. Leading industry players and cloud service providers are already lining up to use it, recognizing its potential to transform their operations. As NVIDIA continues to push the boundaries of innovation, the DGX GH200 promises to help drive new discoveries and advances in the field.