Supermicro High Performance Computing (HPC) Solutions

Overview

Recent technological advances such as artificial intelligence, data analytics, and machine learning have led to growing demand for computational power. At Supermicro, we understand that our customers need solutions capable of keeping up with today's compute demands and proven to deliver a competitive advantage. To meet these challenges, Supermicro offers customized solution designs for even the most daunting HPC design and deployment requirements.

From practical and efficient clusters to supercomputers, our extensive lineup of high performance servers and storage products allows us to create unique configurations to tackle any elastic workload. Supermicro's First-to-Market strategy guarantees that our customers always benefit from the latest and greatest technologies.

Supermicro is actively innovating in building HPC solutions. From design to implementation, we optimize every aspect of each solution. Our advantages include a wide range of building blocks, from motherboard design, to system configuration, to fully integrated rack and liquid cooling systems. Using this tremendous array of versatile building blocks, we focus on providing solutions tailored to each customer's needs. At Supermicro, we take pride in building HPC solutions from the ground up.

Supermicro can tailor HPC solutions to a wide variety of workloads: compute-intensive, high-throughput, or high-capacity storage applications across different industries. Supermicro HPC systems can be bundled with a variety of open source platforms and commercial applications, making them truly turnkey solutions.

Highlights

  • First-to-Market Advantage. Supermicro is consistently first to release server platforms featuring new technology, guaranteeing that our customers always benefit from the latest and greatest technologies.
  • High Density. Supermicro's unique line-up of high performance servers enables us to build powerful HPC systems that reach extremely high density with a minimal facility footprint.
  • Scalability. Extremely scalable cluster designs with hyper-scale servers, a highly scalable switching fabric, and high-capacity power and cooling.
  • Energy Efficient. Platinum-rated power supplies and liquid cooling options cut energy costs and improve PUE.
  • Cost Effective. Supermicro solutions are configurable at a fine granularity to fit any budget, resulting in a highly attractive TCO.

HPC Reference Architecture

1. HPC – High Performance Computing FatTwin with Omni-Path.

This HPC reference architecture utilizes Supermicro's high-density FatTwin server along with Supermicro's Omni-Path switch in a fat-tree topology. This design can achieve petaflops of computational power and, along with a large memory bank, can undertake the most demanding HPC tasks. The premium FatTwin system offers high-density compute performance, storage capacity, and power efficiency.

HPC Rack - FatTwin

Compute Power:

  • System: SYS-F619P2-RT (4U, 8 nodes)
  • CPU: Dual Intel Xeon Scalable processors, up to 56 cores/node (3,584 cores per 42U rack)
  • Memory: Up to 1.5TB ECC 3DS LRDIMM or 1.5TB ECC RDIMM, DDR4 up to 2666MHz
  • Ultimate Scalability: 64 compute nodes per 42U rack; 48 1U standalone switches can support up to 1,536 compute nodes!
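As a sanity check, the scalability figures above can be reproduced with simple arithmetic. The 32-downlink/16-uplink split per 48-port leaf switch is our assumption (a 2:1 taper), chosen because it reproduces the quoted 1,536-node figure; it is not stated in the datasheet.

```python
# Back-of-the-envelope math for the FatTwin rack and fabric figures.
chassis_per_rack = 8            # 4U FatTwin chassis, 8 nodes each
nodes_per_rack = chassis_per_rack * 8
cores_per_node = 56             # dual Xeon Scalable, 28 cores each
cores_per_rack = nodes_per_rack * cores_per_node

leaf_switches = 48
downlinks_per_leaf = 32         # assumption: 32 node ports + 16 uplinks per 48-port switch
max_nodes = leaf_switches * downlinks_per_leaf

print(nodes_per_rack, cores_per_rack, max_nodes)  # 64 3584 1536
```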

Networking:

  • Switch: Intel 100G 48-port Omni-Path TOR switch with management card. High switching capacity of 9.6 Tb/s total fabric bandwidth
  • L2/3 Switch: 1/10 Gb Ethernet Superswitch, 48 x 1 Gbps and 4 x SFP+ 10 Gbps Ethernet ports
  • Cost Efficiency: The FatTwin architecture makes each HPC cluster extremely cost efficient!

Cooling:

  • RDHx (Rear Door Heat Exchanger) chiller doors for highly efficient cooling, up to 75kW cooling capacity per rack
FatTwin 64 Nodes

2. Reference Architecture – TwinPro plus Dragonfly/InfiniBand

This design utilizes the speed and efficiency of the TwinPro along with Mellanox InfiniBand networking in a Dragonfly topology. The Dragonfly topology minimizes network diameter and maximizes bandwidth, allowing a fast and robust MPI system to be built on top.
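For illustration, the sizing of a canonical balanced Dragonfly (in the sense of Kim et al., 2008) can be sketched as follows. The parameters are hypothetical, not a Supermicro configuration; the point is that every group links directly to every other group, so a minimally routed packet crosses at most one global link.

```python
# Canonical balanced dragonfly sizing sketch (illustrative parameters).
def dragonfly_max_terminals(p, a, h):
    """p terminals/router, a routers/group, h global links/router."""
    groups = a * h + 1          # every group has a direct link to every other group
    return p * a * groups

# Minimal routing crosses at most local -> global -> local links,
# i.e. a 3-hop network diameter regardless of system size.
print(dragonfly_max_terminals(p=4, a=8, h=4))  # 1056 terminals
```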

HPC Rack - TwinPro

Compute Power:

  • System: SYS-2028TP-HTR (2U, 4 nodes; 64 nodes per 42U rack)
  • CPU: Intel Xeon Processor E5-2600 v4/v3 family, up to 44 cores/node (2,816 cores per 42U rack)
  • Memory: Up to 2TB† ECC 3DS LRDIMM, 512GB ECC RDIMM, and 64GB DDR4 LRDIMM-2666

Networking:

  • Switch: Mellanox InfiniBand FDR/EDR/HDR, 1U switch, 36 QSFP28 ports
  • Extremely High Bandwidth: The Dragonfly architecture along with InfiniBand switches ensure low latency and the maximum effective fabric bandwidth by eliminating congestion hot spots!

Cooling:

  • Highly efficient DCLC (Direct Contact Liquid Cooling) options available, 35kW+ cooling capacity per rack
TwinPro 64 Nodes

Turnkey HPC Appliance for AI

As HPC and AI increasingly intertwine, the future of high-precision computation grows brighter. AI's deep learning capabilities are enabling completely new discoveries in some of the most complex HPC domains! To keep up with the increasing computational demands of applications that combine HPC with AI, Supermicro offers versatile cluster solutions with high scalability and configurability.

HPC Rack - AI

Compute Power:

  • System: SYS-4029GP-TVRT (Supermicro Super Server, 4U 8 GPU)
  • GPU: 8x NVIDIA Tesla V100 SXM2 (16GB, up to 32GB per GPU)
  • NVLINK: High-speed interconnect 300GB/s per GPU
  • Memory: DDR4-2666 32GB, total 384GB
  • CPU: Intel Xeon Gold 6154 CPU @ 3.00GHz

Solution SKU:

  • SRS-14UGPU-AIV1-01: 14U rack, 2x 4U GPU servers with 16 V100 GPUs
  • SRS-24UGPU-AIV1-01: 24U rack, 4x 4U GPU servers with 32 V100 GPUs

Networking:

  • InfiniBand Switch: Mellanox InfiniBand FDR/EDR/HDR, 1U switch, 36 QSFP28 ports
  • L2/3 Switch: SSE-G3648BR (1/10 Gb Ethernet Superswitch), 48 x 1 Gbps and 4 x SFP+ 10 Gbps Ethernet ports
HPC AI

AI Benchmarks:

The Supermicro HPC team has achieved remarkable results in benchmarking AI performance. Supermicro AI appliances have been evaluated using widely used deep learning models such as VGG, Inception V3, and ResNet-50. In most cases, our appliances have outperformed the popular AI clusters currently available on the market. For example, the following TensorFlow benchmark (using ResNet-50) shows processing of almost 18,000 images per second.
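For context, an images-per-second figure like the one above is typically derived by timing a fixed number of training steps after a warm-up and dividing the images processed by the wall-clock time. A minimal sketch of that methodology (the `train_step` callable here is a stand-in, not part of any Supermicro tooling):

```python
# Generic throughput-measurement harness: images/sec = images / wall time,
# with warm-up steps excluded so one-time setup cost doesn't skew the rate.
import time

def measure_images_per_sec(train_step, batch_size, steps=100, warmup=10):
    for _ in range(warmup):              # exclude graph/kernel warm-up cost
        train_step()
    start = time.perf_counter()
    for _ in range(steps):
        train_step()
    elapsed = time.perf_counter() - start
    return steps * batch_size / elapsed

# Dummy step standing in for one synchronous 256-image training step:
rate = measure_images_per_sec(lambda: time.sleep(0.001), batch_size=256)
print(f"{rate:.0f} images/sec")
```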

TensorFlow Benchmark
TensorFlow Benchmark Graph

Our HPC team has also compared the performance of Supermicro AI appliances against other market leaders in AI hardware, with significantly better results. For example, the following Caffe2 benchmark (using ResNet-50) shows that the Supermicro AI cluster solution processes significantly more images per second than Facebook's cluster:

Caffe2 Benchmark
Caffe2 Benchmark Graph - Supermicro vs Facebook

Turnkey HPC appliance for ANSYS:

The Ansys software suite provides a standardized platform for engineers to design and run simulations efficiently. Supermicro has teamed up with Ansys to provide hardware tailored specifically to Ansys applications in order to get optimum performance from Ansys products. With Supermicro's Ansys Solution, the turn-around time between designing, running, and seeing the results of a simulation is dramatically reduced, allowing engineering teams to expedite the development process.

The Supermicro Ansys Solution is designed to efficiently utilize the Ansys software suite and comes in several packages, from small and cost-effective clusters to large clusters with ample horsepower. Each package contains high-performance components carefully selected to minimize bottlenecks in even the most strenuous simulations, ensuring that the end user gets full utilization of every hardware component under any circumstances. Hardware features include:

  • Top end CPUs for maximum computation power
  • Large memory capacity for increasingly large datasets
  • SSD/NVMe drives for quick and efficient storage
  • High end networking for seamless synchronization at a cluster level
  • GPUs for complex mathematical computation
  • Fully integrated hardware/software stack for plug-and-play
  • Easy to deploy, hassle free management
 
                          Small Cluster          Large Cluster (1U server)  Large Cluster (2U server)
Compute Nodes             Up to 10               Up to 32                   Up to 32
Form Factor (Per Node)    1U Ultra               1U Ultra                   2U TwinPro
CPU (Per Node)            2x SKL 6134 3.2 GHz    2x SKL 6134 3.2 GHz        2x SKL 6134 3.2 GHz
Memory (Per Node)         256 GB                 256 GB                     256 GB
Drive (Per Node)          1.6 TB SSD             1.6 TB SSD                 1.6 TB SSD
Total Cores               Up to 160              Up to 512                  Up to 512
Total Memory              Up to 2,560 GB         Up to 8,192 GB             Up to 8,192 GB
Total Storage             Up to 16 TB            Up to 51.2 TB              Up to 51.2 TB
Master Nodes              1                      2                          2
Master Server             1U Ultra               1U Ultra                   1U Ultra
Master CPU                2x SKL 6134 3.2 GHz    2x SKL 6134 3.2 GHz        2x SKL 6134 3.2 GHz
Master Memory             256 GB                 256 GB                     256 GB
Master Drive              1.6 TB SSD             1.6 TB SSD                 2 TB SSD
Graphics                  NVIDIA Quadro GP100 16GB HBM2
Switches                  24-port 10GbE          Omni-Path, EDR, or FDR
IPMI                      52 ports
Cabinet                   14U                    42U                        42U
PDU                       1x 2U 30A              2x 50A 208V 3-Phase Metered PDU

HPC Storage Solutions

IBM Spectrum Scale

Spectrum Scale provides a datacenter with a flexible storage platform that allows the end user access to enormous amounts of data at incredibly high speeds. The secret to Scale's performance is a tiered storage hierarchy which prioritizes frequently used data to the fastest storage tiers while maintaining the rest of the data on more cost-effective storage devices for on-demand access. IBM Spectrum Scale intelligent storage rules allow for the user to customize the use of their hardware depending on their needs.
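As an illustration of the intelligent storage rules mentioned above, a Spectrum Scale ILM policy can tier cold data down from a fast pool when it fills. This fragment is a generic sketch with hypothetical pool names, not part of the Supermicro solution:

```sql
/* Illustrative Spectrum Scale ILM policy (pool names are hypothetical):
   when the fast 'system' pool passes 90% full, migrate the least
   recently accessed files to the 'capacity' pool until it drops to 70%. */
RULE 'tier_down'
  MIGRATE FROM POOL 'system'
  THRESHOLD (90, 70)
  WEIGHT (CURRENT_TIMESTAMP - ACCESS_TIME)
  TO POOL 'capacity'
```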

Supermicro offers Spectrum Scale based solutions for HPC applications.

IBM Spectrum Scale diagram

Lustre File System

Lustre is a parallel file system that delivers the speeds HPC workloads require from petabytes of storage, allowing thousands of clients to access storage devices on demand. This is made possible by decoupling metadata and data onto separate servers, so customers can design and tailor their cluster to the workload they will be running. Lustre has been time-tested in many of the world's largest datacenters; in fact, it currently powers 75% of the top 100 supercomputers on Earth. Supermicro has partnered with Intel and BGI to provide the BGI laboratories with a Lustre system capable of 8Gb/s speeds!
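To illustrate how that kind of throughput arises, a rough model: a Lustre file striped across several object storage targets (OSTs) is read in parallel, so aggregate bandwidth scales roughly with stripe count until clients or the network saturate. The per-OST figure below is an assumption chosen to land at the quoted 8Gb/s order of magnitude, not a measured BGI number.

```python
# Rough Lustre aggregate-throughput model (assumed, not measured, numbers).
per_ost_MBps = 125              # assumed per-OST streaming bandwidth
stripe_count = 8                # e.g. set with: lfs setstripe -c 8 <file>
aggregate_MBps = per_ost_MBps * stripe_count
aggregate_Gbps = aggregate_MBps * 8 / 1000
print(aggregate_Gbps)  # 8.0 Gb/s
```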

Lustre File System diagram

HPC Server Platforms

FatTwin™

  • Best TCO with Highest Performance-per-Watt/per-Dollar
  • FatTwin™ represents a revolution in Green Computing and is highly efficient by design
  • These systems support customers' critical applications while reducing data center TCO, helping preserve the environment and extending compute and storage capabilities
  • Due to its shared components, the FatTwin™ improves cost-effectiveness and reliability, while its modular architecture makes it flexible to configure and easy to maintain

TwinPro²™

  • Twin technology provides exceptional throughput, storage, networking, I/O, memory and processing capabilities
  • Performance, Flexibility, Efficiency
  • Competitive advantage for High-End Enterprise, HPC and Cloud computing environments

GPU Server

  • Designed for HPC, AI, Big Data Analytics, Astrophysics, Business Intelligence
  • Supports the latest GPU Tesla V100
  • Performance, Flexibility, Efficiency
  • DCLC optional

Ultra Server

  • Ultra Super Servers are designed to deliver the highest performance, flexibility, scalability and serviceability to demanding IT environments, and to power mission-critical Enterprise workloads
  • The perfect fit for diverse workloads and applications and can be easily reconfigured for multiple Enterprise and Data Center applications in Virtualization, Big Data, Analytics and Cloud Computing

SuperBlade®

  • Maximum Density, affordability, reduced management cost, optimal ROI and high scalability
  • Supports up to 205W TDP Intel® Xeon® Scalable processors
  • UP, DP and 4-way blade servers
  • Hot-swap U.2 NVMe, up to 8 drives per blade server
  • 100G EDR InfiniBand, 100G Intel® Omni-Path, and 25G/10G/1G Ethernet switches
  • Redundant AC/DC power supplies
  • Battery Backup Power (BBP®) modules
  • Supermicro RSD and Redfish RESTful APIs supported

HPC Fabric (Omni-Path, InfiniBand)

Supermicro Omni-Path (OPA)

Intel's Omni-Path technology is the next generation of networking. With low latency and high throughput, Intel Omni-Path switches offer the best of both worlds. We recommend Supermicro Intel® Omni-Path to any HPC customer concerned about network performance.

Mellanox InfiniBand (IB)

Mellanox’s InfiniBand switches are another excellent choice when it comes to high speed interconnect for HPC. HPC requires low latency and high throughput in networking and that is exactly what InfiniBand offers.

Supermicro HPC server platforms provide built-in Mellanox EDR or FDR adapters or optional SIOMs.

Supermicro HPC Cluster Integration and Benchmarking

HPC solutions can only perform as well as their weakest link. From cache coherence and checkpointing to resource management, each hardware component must perform optimally to ensure the entire cluster meets expectations. Supermicro has developed a unique testing suite that starts at the component level and builds out, allowing our engineers to catch and troubleshoot any performance discrepancies and make the appropriate changes to ensure the overall quality of the final cluster.

The Supermicro testing suite covers the entire product end-to-end, encompassing both fine-grained details and cluster-level testing. This process reduces deployment time and effort, allowing our product to reach the customer site ready to plug and play.

The Supermicro Benchmarking team is here to show off our products! To give customers a gauge of the raw power our products are capable of, we offer fine-grained, per-component benchmarks as well as cluster-level benchmarking. The numbers generated by our team help end customers understand the full potential of the product they will be using and give insight into the capabilities of their cluster. Our benchmarking process is custom designed around the software and workloads the customer will run on their cluster, giving further assurance that the cluster will be ready to go as soon as it is unpacked and plugged in.

The list below contains our standard benchmarks, but we are happy to run others as well. If you are interested in a benchmark that you do not see on the list, please feel free to contact us.

Table 1. HPC software stack

Cluster Software Stack

Deep Learning Environment
  • Frameworks: Caffe, Caffe2, Caffe-MPI, Chainer, Microsoft CNTK, Keras, MXNet, TensorFlow, Theano, PyTorch
  • Libraries: cuDNN, NCCL, cuBLAS
  • User Access: NVIDIA DIGITS

Programming Environment
  • Development & Performance Tools: Intel Parallel Studio XE Cluster Edition, PGI Cluster Development Kit, GNU Toolchain, NVIDIA CUDA
  • Scientific and Communication Libraries: Intel MPI, MVAPICH2, MVAPICH, IBM Spectrum MPI, Open MPI
  • Debuggers: Intel IDB, PGI PGDBG, GNU GDB

Schedulers, File Systems and Management
  • Resource Management/Job Scheduling: Adaptive Computing Moab, Maui, TORQUE, SLURM, Altair PBS Professional, IBM Spectrum LSF, Grid Engine
  • File Systems: Lustre, NFS, GPFS, Local (ext3, ext4, XFS)
  • Cluster Management: Beowulf, xCAT, OpenHPC, Rocks, Bright Cluster Manager for HPC (including support for NVIDIA Data Center GPU Manager)

Operating Systems and Drivers
  • Drivers & Network Mgmt.: Accelerator software stack and drivers, OFED, OPA
  • Operating Systems: Linux (RHEL, CentOS, SUSE Enterprise, Ubuntu, etc.)
Table 2. Cluster Test

Cluster Test Suite
  • Cluster Level Test: WRF, NAMD
  • Benchmark Test: HPL, HPCC, HPCG, Igatherv, DGEMM, PTRANS, FFT
  • Parallel Storage Test: IOR, MDTest
  • System Component Test:
      • CPU: Stress-ng, CPU-Stress
      • Memory: STREAM, SAT
      • Network: Netperf, Iperf, OSU benchmarks
      • HDD: FIO, IOZone, DD, hdparm
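For readers curious what the Memory test (STREAM) in Table 2 measures, here is a minimal pure-Python sketch of the STREAM triad kernel, a[i] = b[i] + s*c[i], with the standard bytes-moved accounting. Real STREAM is compiled C/OpenMP and reports far higher rates; this only illustrates the methodology.

```python
# STREAM-triad-style memory bandwidth sketch (illustrative, not real STREAM).
import array
import time

n = 1_000_000
b = array.array("d", [1.0]) * n
c = array.array("d", [2.0]) * n
a = array.array("d", [0.0]) * n
s = 3.0

start = time.perf_counter()
for i in range(n):
    a[i] = b[i] + s * c[i]          # the triad kernel
elapsed = time.perf_counter() - start

bytes_moved = 3 * 8 * n             # read b, read c, write a (8-byte doubles)
print(f"{bytes_moved / elapsed / 1e6:.1f} MB/s")
```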

Cooling Technologies

Our HPC products deliver a great deal of computational power, which consequently produces excess heat. Luckily, Supermicro has teamed up with the best in the business and is happy to offer cooling equipment from our partners.

We offer liquid cooling systems, the most innovative cooling technology for HPC installations. Today, as the number of servers continues to multiply and power usage grows exponentially, liquid cooling remains the most efficient cooling solution. It enables high-density server racks and increases data center power usage effectiveness (PUE)—a winning combination. Forced-air cooling has outlived its effectiveness, and the industry now gravitates toward warm-water cooling to tap next-generation server technologies while lowering power usage.

In fact, liquid cooling lowers TCO by 40 to 50 percent compared to forced-air cooling. It has also been shown to yield a ten-fold improvement in server density without harmful tradeoffs. Other benefits include whisper-quiet operation and no stranded power. There is no need to provision power for fan energy at maximum speed.

Advanced HPC specializes in liquid cooling systems for data centers, servers, workstations and high-performance PCs. For HPC installations, our direct-to-chip technology applies liquid cooling technology directly on the chip itself. Since liquid is 4,000 times better at storing and transferring heat than air, our solutions provide immediate and measurable benefits to large and small data centers alike.
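The "4,000 times" claim above can be sanity-checked from volumetric heat capacities; the constants below are textbook approximations we supply for illustration, not vendor data.

```python
# Per unit volume, how much more heat can water absorb than air per degree?
water = 4.18e6   # J/(m^3*K): ~4186 J/(kg*K) at ~1000 kg/m^3
air   = 1.2e3    # J/(m^3*K): ~1005 J/(kg*K) at ~1.2 kg/m^3
ratio = water / air
print(round(ratio))  # ~3500x, the same order as the quoted 4,000x
```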

Chiller Door – RDHx

Chiller Door - RDHx

The RDHx chiller door is designed to handle dynamic workloads. It runs cold when your servers are hot, and when your servers are idle it dials back the amount of cooling to save energy. This is possible thanks to the chiller door's sensor technology, which monitors the heat from the rack's exhaust and reacts accordingly. The chiller door is an effective solution for keeping your servers cold while keeping your energy bill low!

Per Node                                      1U Ultra Titanium  1U Ultra Platinum  1U Server (Comp #1)  1U Server (Comp #2)
Power Consumption (Watts)                     445                454                466                  477
Power Saved by using 1U Ultra Titanium (W)    0                  9                  21                   32
TCO Saved over 4 years per server ($)         $0                 $135               $315                 $480
TCO Saved per 10,000 Ultra servers ($M)*      $0                 $1.4M              $3.2M                $4.8M
Table 3: Ultra Titanium System Power and TCO Savings
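The per-server dollar figures in Table 3 follow from the wattage deltas once an effective energy rate is assumed. The ~$0.43/kWh rate below is our inference chosen to match the table (it plausibly folds in cooling and PUE overhead); it is not a published Supermicro number.

```python
# Reconstructing Table 3's 4-year TCO savings from the watts-saved column.
HOURS_4Y = 24 * 365 * 4      # 35,040 hours of continuous operation
RATE = 0.428                 # $/kWh, inferred to reproduce the table's dollars

results = {w: round(w * HOURS_4Y / 1000 * RATE) for w in (9, 21, 32)}
print(results)  # {9: 135, 21: 315, 32: 480}, matching $135/$315/$480
```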

Direct Liquid Cooling – DCLC

Direct Liquid Cooling – DCLC

DCLC is our state-of-the-art cooling solution. The liquid cooling system regulates the overall temperature of hardware components more effectively and efficiently than any air chilling system. It can be installed in any datacenter with no additional infrastructure changes, in the same racks as air-cooled servers. The system allows CPUs/GPUs to run at maximum wattage and utilizes a low-pressure, redundant pumping system to keep your equipment reliably cold.

DCLC Temperature Chart

Supermicro Solutions on Top 500 / Green 500

The Green 500

Green 500 logo

Supermicro® is a global leader in high-performance, high-efficiency server, storage technology and Green Computing. The company is committed to protecting the environment through its “We Keep IT Green®” initiative and provides customers with the most energy-efficient, environmentally-friendly solutions available on the market.

The Green500 ranks the world's 500 most energy-efficient supercomputers. The lists below show the latest Green500 supercomputers using Supermicro systems.

Green500 List – 2018
Green500 List – 2017
Green500 List – 2016
Green500 List – 2015
  • #18 Okinawa Institute of Science and Technology, Sango - Supermicro SBI-7228R-T2F, Xeon E5-2680v3 12C 2.5GHz, InfiniBand FDR, Intel Xeon Phi 7120P
  • #27 Max Planck Institute for Biophysical Chemistry, IO - Supermicro SuperServer 1027GR-TSF, Intel Xeon E5-2620v2 6C 2.1GHz, InfiniBand QDR, NVIDIA K20
  • #151 National Centre for Nuclear Research, Swierk Computing Centre - Supermicro TwinBlade SBI-7227R/Bull DLC B720, Intel Xeon E5-2680v2/E5-2650 v3 10C 2.8GHz, InfiniBand QDR/FDR
  • #332 National Research Centre Kurchatov Institute, HPC4 - Supermicro Twin^2, Xeon E5-2680v3 12C 2.5GHz, InfiniBand FDR, NVIDIA Tesla K80
  • Additional Solutions Listed as End Customer Deployments

Supermicro GPU Server Solutions

Supermicro Xeon Phi Server Solutions

The Top 500 Supercomputer Sites

Top 500 Supercomputing Sites logo

Supermicro® is a global leader in high-performance, high-efficiency server, storage technology and Green Computing. The company leads the industry in breadth of computing solutions designed for High Performance Computing (HPC) and Supercomputing applications.

The Top500 ranks the world's most powerful supercomputers every June and November. The list below highlights Supermicro's latest heterogeneous system architecture (CPU/GPU) supercomputing solutions featuring GPU and Xeon Phi coprocessor technologies.

Top500 List – 2018
Top500 List – 2017
Top500 List – 2016
Top500 List – 2015