Foundational Building Blocks for Generative AI Infrastructure

In the era of AI, a unit of compute is no longer measured by the number of servers alone. Today's artificial intelligence is built from interconnected GPUs, CPUs, memory, and storage, spread across multiple nodes in racks. This infrastructure requires high-speed, low-latency network fabrics, along with carefully designed cooling and power delivery, to sustain optimal performance and efficiency in each data center environment. Supermicro’s SuperCluster solution provides foundational building blocks for rapidly evolving Generative AI and Large Language Models (LLMs).

  • Complete Integration at Scale

    Design and build of full racks and clusters with a global manufacturing capacity of up to 5,000 racks per month

  • Test, Validate, Deploy with On-site Service

    Proven L11, L12 testing processes thoroughly validate the operational effectiveness and efficiency before shipping

  • Liquid Cooling/Air Cooling

    Fully integrated liquid-cooling or air-cooling solution with GPU and CPU cold plates, Cooling Distribution Units, and manifolds

  • Supply and Inventory Management

    One-stop shop that delivers fully integrated racks quickly and on time, reducing time-to-solution for rapid deployment

Generative AI SuperCluster

The full turn-key data center solution accelerates time-to-delivery for mission-critical enterprise use cases and eliminates the complexity of building a large cluster, which was previously achievable only through the intensive design tuning and time-consuming optimization associated with supercomputing.

Highest Density

With 32 NVIDIA HGX H100/H200 8-GPU, 4U Liquid-cooled Systems (256 GPUs) in 5 Racks

  • Doubles compute density through Supermicro’s custom liquid-cooling solution, with up to a 40% reduction in data center electricity cost
  • 256 NVIDIA H100/H200 GPUs in one scalable unit
  • 20TB of HBM3 with H100 or 36TB of HBM3e with H200 in one scalable unit
  • 1:1 networking to each GPU to enable NVIDIA GPUDirect RDMA and Storage for training large language models with up to trillions of parameters
  • Customizable AI data pipeline storage fabric with industry-leading parallel file system options
  • Supports NVIDIA Quantum-2 InfiniBand and Spectrum-X Ethernet platform
  • Certified for NVIDIA AI Enterprise Platform including NVIDIA NIM microservices
NVIDIA HGX H100/H200 8-GPU

Compute Node

Supermicro 4U Liquid-Cooled 8-GPU System (SYS-421GE-TNHR2-LCC or AS -4125GS-TNHR2-LCC)
32 NVIDIA HGX H100/H200 8-GPU, 4U Liquid-cooled Systems (256 GPUs) in 5 Racks
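
The per-unit GPU and memory totals listed above follow from simple multiplication. The sketch below reproduces that arithmetic. It assumes per-GPU HBM capacities of 80 GB for H100 and 141 GB for H200, which come from NVIDIA's published specifications and are not stated on this page. The helper name `scalable_unit_totals` is hypothetical.

```python
# Per-GPU capacities are assumptions taken from NVIDIA's public
# specifications, not from this page.
HBM_PER_GPU_GB = {"H100": 80, "H200": 141}

SYSTEMS_PER_UNIT = 32  # from the page: 32 systems per scalable unit
GPUS_PER_SYSTEM = 8    # from the page: HGX 8-GPU systems


def scalable_unit_totals(gpu: str) -> tuple[int, float]:
    """Return (GPU count, aggregate HBM in decimal TB) for one scalable unit."""
    gpus = SYSTEMS_PER_UNIT * GPUS_PER_SYSTEM
    total_gb = gpus * HBM_PER_GPU_GB[gpu]
    return gpus, total_gb / 1000


for name in HBM_PER_GPU_GB:
    gpus, tb = scalable_unit_totals(name)
    print(f"{name}: {gpus} GPUs, {tb:.1f} TB HBM")
```

Under these assumptions the totals come to roughly 20.5 TB for H100 and 36.1 TB for H200, consistent with the rounded 20TB and 36TB figures quoted above.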

Proven Design

With 32 NVIDIA HGX H100/H200 8-GPU, 8U Air-cooled Systems (256 GPUs) in 9 Racks

  • Proven, industry-leading architecture for large-scale AI infrastructure deployments
  • 256 NVIDIA H100/H200 GPUs in one scalable unit
  • 20TB of HBM3 with H100 or 36TB of HBM3e with H200 in one scalable unit
  • 1:1 networking to each GPU to enable NVIDIA GPUDirect RDMA and Storage for training large language models with up to trillions of parameters
  • Customizable AI data pipeline storage fabric with industry-leading parallel file system options
  • Supports NVIDIA Quantum-2 InfiniBand and Spectrum-X Ethernet platform
  • Certified for NVIDIA AI Enterprise Platform including NVIDIA NIM microservices
NVIDIA HGX H100/H200 8-GPU

Compute Node

Supermicro 8U Air-Cooled 8-GPU System (SYS-821GE-TNHR or AS -8125GS-TNHR)
32 NVIDIA HGX H100/H200 8-GPU, 8U Air-cooled Systems (256 GPUs) in 9 Racks
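
To put the liquid-cooled and air-cooled configurations side by side, the sketch below compares their density using only the system counts, chassis heights, and rack counts quoted on this page. The function names are hypothetical, and the page does not say how many of the racks in each unit hold compute nodes versus networking or other equipment, so the per-rack figure is a whole-unit average.

```python
# All figures are taken from the two configurations described on this page.
CONFIGS = {
    "liquid-cooled 4U": {"systems": 32, "gpus": 256, "racks": 5, "height_u": 4},
    "air-cooled 8U": {"systems": 32, "gpus": 256, "racks": 9, "height_u": 8},
}


def gpus_per_rack(cfg: dict) -> float:
    """Average GPUs per rack across the whole scalable unit."""
    return cfg["gpus"] / cfg["racks"]


def gpus_per_rack_unit(cfg: dict) -> float:
    """GPUs per rack unit (U) of chassis height for a single system."""
    return (cfg["gpus"] / cfg["systems"]) / cfg["height_u"]


for name, cfg in CONFIGS.items():
    print(
        f"{name}: {gpus_per_rack(cfg):.1f} GPUs/rack, "
        f"{gpus_per_rack_unit(cfg):.1f} GPUs per U of chassis"
    )
```

Measured per rack unit of chassis height, the 4U liquid-cooled system is twice as dense as the 8U air-cooled system. Measured by total rack count (5 versus 9), the ratio is 1.8.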

Cloud-Scale Inference

With 256 NVIDIA GH200 Grace Hopper Superchips, 1U MGX Systems in 9 Racks

  • Unified GPU and CPU memory for cloud-scale, high-volume, low-latency inference with large batch sizes
  • 1U Air-cooled NVIDIA MGX Systems in 9 Racks, 256 NVIDIA GH200 Grace Hopper Superchips in one scalable unit
  • Up to 144GB of HBM3e + 480GB of LPDDR5X, enough capacity to fit a 70B+ parameter model in one node
  • 400Gb/s InfiniBand or Ethernet non-blocking networking connected to spine-leaf network fabric
  • Customizable AI data pipeline storage fabric with industry-leading parallel file system options
  • NVIDIA AI Enterprise Ready including NVIDIA NIM microservices
NVIDIA GH200 Grace Hopper Superchip

Compute Node

Supermicro 1U GH200 Grace Hopper Superchip System
256 NVIDIA GH200 Grace Hopper Superchips, 1U MGX Systems in 9 Racks
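
The claim that a 70B+ parameter model fits in one node can be checked with a rough weight-footprint estimate. The sketch below is an approximation under stated assumptions: it counts model weights only (no KV cache, activations, or runtime overhead), uses decimal gigabytes, and assumes 2 bytes per parameter for FP16 and 1 byte for FP8. The memory capacities are the per-node figures quoted above. The function name `weights_gb` is hypothetical.

```python
# Bytes per parameter for common inference precisions (an assumption;
# the page does not state which precision is used).
BYTES_PER_PARAM = {"fp16": 2, "fp8": 1}

HBM_GB = 144    # from the page: up to 144GB of HBM3e per node
LPDDR_GB = 480  # from the page: 480GB of LPDDR5X per node


def weights_gb(params_billion: float, dtype: str) -> float:
    """Approximate weight size in decimal GB (weights only)."""
    # 1e9 parameters * bytes per parameter / 1e9 bytes per GB
    return params_billion * BYTES_PER_PARAM[dtype]


for dtype in BYTES_PER_PARAM:
    size = weights_gb(70, dtype)
    print(
        f"70B @ {dtype}: {size:.0f} GB weights, "
        f"fits in HBM: {size <= HBM_GB}, "
        f"fits in HBM + LPDDR5X: {size <= HBM_GB + LPDDR_GB}"
    )
```

Under these assumptions, 70B parameters at FP16 need about 140 GB for weights, which is just under the 144GB of HBM3e. The unified LPDDR5X pool leaves room for larger models or for the KV cache that this estimate ignores.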

Enterprise 3D + AI

With 32 4U PCIe GPU Air-Cooled Systems (up to 256 NVIDIA L40S GPUs) in 5 Racks

  • Maximize multi-workload performance for enterprise AI-enabled workflows. Optimized for NVIDIA Omniverse with OpenUSD.
  • 256 NVIDIA L40S GPUs in one scalable unit
  • 12TB of GPU memory and 32TB of system memory in one scalable unit
  • Scale-out with 400Gb/s NVIDIA Spectrum™-X Ethernet
  • Customizable data storage fabric with industry-leading parallel file system options
  • Certified for NVIDIA Omniverse Enterprise with included Enterprise Support Services
NVIDIA L40S

Compute Node

Supermicro 4U PCIe GPU Air-Cooled System
32 4U PCIe GPU Air-Cooled Systems (up to 256 NVIDIA L40S GPUs) in 5 Racks

Certain products may not be available in your region