Resources

High Performance Computing
From H100, GH200 to GB200: How NVIDIA Builds AI Supercomputers with SuperPod

How does NVIDIA's SuperPod architecture evolve from the H100 to the GH200 and GB200? This article explains innovations in GPU and CPU interconnects, including NVLink and NVSwitch, which enable AI supercomputers with up to 576 GPUs to accelerate AI model training, inference, and HPC workloads.
Jason
Sep 25, 2024
Elon Musk Begins Training xAI LLM With 100,000 Liquid-Cooled NVIDIA H100 GPUs

Elon Musk's xAI begins training on the world's most powerful AI supercomputer with 100,000 liquid-cooled NVIDIA H100 GPUs, surpassing industry giants in computational power.
Brandon
Jul 24, 2024
The Evolution of AI Capabilities and the Power of 100,000 H100 Clusters

We delve into the intricacies of large AI training clusters and their surrounding infrastructure. Building these clusters takes far more than money. Achieving high utilization is challenging because many components, especially networking, fail at high rates. We explore power challenges, reliability, checkpointing, networking topology options, parallelism schemes, rack layouts, and the overall bill of materials.
Jason
Jul 4, 2024
Next-Gen Data Centers: Embracing Liquid Cooling

As data centers grow in power and scale, cooling has become a major bottleneck. Liquid cooling technology, with its superior efficiency and environmental benefits, is emerging as a crucial solution for supercomputing centers.
Claire
Jun 28, 2024
Comprehensive Guide to High-Performance Networking

An in-depth exploration of high-performance networking technologies, including InfiniBand, RDMA, and TCP/IP optimizations, featuring insights from AWS, Microsoft, and Alibaba Cloud, and highlighting NADDOD's cutting-edge contributions to data center efficiency and scalability.
Jason
Jun 25, 2024
