
Supermicro® Total Solution for Machine Learning

Supermicro and Canonical have partnered to deliver solutions that feature TensorFlow machine learning.

This solution is built and validated with Supermicro SuperServers, SuperStorage systems, and Supermicro Ethernet switches that are optimized for performance and designed to provide the highest levels of reliability, quality and scalability.

Canonical, the company behind Ubuntu, helps organizations make the most of Ubuntu. The Canonical Distribution of Kubernetes (CDK) is pure upstream Kubernetes tested across the widest range of clouds. Canonical also provides a rich ecosystem of tools, libraries, services, modern metrics, and monitoring tools to make CDK easy to consume so you can innovate faster.

Kubeflow is an open source project dedicated to providing easy-to-use Machine Learning (ML) resources on top of a Kubernetes cluster. Most prominently, Kubeflow eases the installation of TensorFlow and provides the mechanisms for using GPUs attached to the underlying host when executing ML jobs submitted to it. TensorFlow is an open source software library for high-performance numerical computation. Its flexible architecture allows easy deployment of computation across a variety of platforms (CPUs, GPUs, TPUs), and from desktops to clusters of servers to mobile and edge devices.
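As a rough illustration of how a job on Kubernetes can ask for the GPUs attached to a host, the sketch below builds a plain Pod manifest that requests two GPUs through the `nvidia.com/gpu` resource name. It assumes the cluster runs the NVIDIA device plugin, and the pod name, container name, and `tensorflow/tensorflow:latest-gpu` image are illustrative choices rather than anything specified by this solution. Kubeflow offers higher-level job types that are not shown here.

```python
import json


def gpu_pod_manifest(name, gpus, image="tensorflow/tensorflow:latest-gpu"):
    """Build a Pod manifest (as a dict) that requests `gpus` NVIDIA GPUs.

    The name, image, and command are hypothetical examples; only the
    `nvidia.com/gpu` resource key is the standard way to request GPUs
    when the NVIDIA device plugin is installed.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": "trainer",
                    "image": image,
                    # List the GPUs that TensorFlow can see inside the container.
                    "command": [
                        "python",
                        "-c",
                        "import tensorflow as tf; "
                        "print(tf.config.list_physical_devices('GPU'))",
                    ],
                    # GPUs are requested under limits, as whole devices.
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


if __name__ == "__main__":
    # Each cloud node in this solution carries 2 GPUs, so request 2.
    print(json.dumps(gpu_pod_manifest("tf-gpu-check", gpus=2), indent=2))
```

The JSON that this prints can be saved to a file and submitted with `kubectl apply -f`, since Kubernetes accepts JSON manifests as well as YAML.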

Supermicro + Canonical Machine Learning Certified Platforms

Build your Machine Learning solution with Supermicro and Canonical.

Figure: Supermicro & Canonical + TensorFlow Rack Diagram

Highlights

  • Validated reference architectures
  • Certified components
  • Scale out – One rack to many racks
  • Greenest Servers for the Cloud – Save hundreds of dollars per server
  • Lowest Cost – Best Performance / Watt / $ / ft²
  • Start as a pro by leveraging expert support and services

Canonical, in partnership with Supermicro, provides enterprise support for the Canonical Distribution of Kubernetes and Kubeflow, giving customers access to a global pool of knowledge and expertise. The partnership also offers a Discovery and Design Service: together, we design your infrastructure to the required size and specifications.

Supermicro Total Solutions for Canonical Machine Learning

Description: Canonical Machine Learning Solution
# of Cores: 216 Cores
Total Memory: 3072 GB
Raw Storage: 24 TB
Height: 19U
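The page does not state how these totals are derived. The sketch below shows one reading that reproduces them from the quantities and per-node figures listed elsewhere on this page: cores, memory, and raw storage counted over the six cloud nodes only, and height counted over all nodes plus the four switches. The rack heights of 1U per infrastructure node and 2U per cloud node are assumptions, as are all class and function names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeType:
    name: str
    count: int       # quantity from the SKU table
    cores: int       # cores per node
    memory_gb: int   # RAM per node
    data_tb: int     # raw data-disk capacity per node
    height_u: int    # rack units per node (assumed, not stated on this page)


# Cloud node: 2 x 18 cores, 16 x 32 GB RAM, 2 x 2 TB NVMe data disks.
CLOUD = NodeType("cloud", count=6, cores=36, memory_gb=512, data_tb=2 * 2, height_u=2)
# Infrastructure node: 2 x 14 cores, 12 x 32 GB RAM, no separate data disks.
INFRA = NodeType("infrastructure", count=3, cores=28, memory_gb=384, data_tb=0, height_u=1)
# 2 management switches + 2 data switches, each listed as 1U.
SWITCH_COUNT = 4


def solution_totals():
    """Recompute the headline figures under the assumptions stated above."""
    return {
        # Compute, memory, and storage: cloud nodes only.
        "cores": CLOUD.count * CLOUD.cores,
        "memory_gb": CLOUD.count * CLOUD.memory_gb,
        "raw_storage_tb": CLOUD.count * CLOUD.data_tb,
        # Height: every node plus every switch.
        "height_u": (CLOUD.count * CLOUD.height_u
                     + INFRA.count * INFRA.height_u
                     + SWITCH_COUNT),
    }


if __name__ == "__main__":
    print(solution_totals())
    # {'cores': 216, 'memory_gb': 3072, 'raw_storage_tb': 24, 'height_u': 19}
```

Under this reading, the three infrastructure nodes add 84 cores and 1152 GB of memory that are not included in the headline figures.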

SKU Details

Item | Machine Learning SKU | Qty
Components Used
Infrastructure Node | SYS-6019U-TN4RT | 3
Cloud Node | SYS-2029GP-TR | 6
Cloud Node Data Disks | U.2 NVMe Drives (2 TB), HDS-IUN2-SSDPE2KX020T8 | 12 (2 per node x 6)
Cloud Node GPU | NVIDIA Tesla V100 16GB GPUs, GPU-NVTV100-16 | 12 (2 per node x 6)
Software Licenses
Ubuntu Advantage Advanced (3 Years) | SVC-CNC-SVR-AS | 9
Ubuntu Kubernetes Discoverer | SVC-CNCFC-FOB | 1
Services
DataCenter Design, Validation and Bootstrapping Services* | - | 15
Supermicro Rack Integration Service** | - | 1
Supermicro Onsite Support | - | 12
* Consult Supermicro for pricing and quotation of this service.
** Racking & Cabling (with 3rd party switches); Racking & Cabling Engineering Drawing; Supermicro will not be responsible for 3rd party switch configuration.

Networking Options (with 10, 25, or 40 GbE Data Switches)

Reference configurations include two types of Ethernet switches: one to consolidate management/IPMI traffic and another to carry data traffic. The 1 GbE management switch is common to all three networking options. The data switch is available in 10 Gbps, 25 Gbps, and 40 Gbps options.

Item | 10 GbE Data Network with Cumulus OS [3] | 25 GbE Data Network with SMIS OS | 40 GbE Data Network with Cumulus OS [3] | Qty
Management Switch | SSE-G3648BR | SSE-G3648BR | SSE-G3648BR | 2
Data Switch | SSE-X3648SR | SSE-F3548SR | SSE-C3632SR | 2
Infrastructure Node NICs | AOC-STGF-I2S-O | AOC-S25G-M2S | AOC-S40GI2Q | 6 (2 per node x 3)
Cloud Node NICs | AOC-STGF-I2S-O | AOC-S25G-M2S | AOC-S40GI2Q | 12 (2 per node x 6)
Software Licenses (for networking)
Management Switch Software perpetual license with 3 yr service/support | SFT-CLSNWPL-1G-3Y [1] | SFT-SMCPL1G [2] | SFT-CLSNWPL-1G-3Y [1] | 2
Data Switch Software perpetual license with 3 yr service/support | SFT-CLSNWPL-10G-3Y [1] | Included with switch [2] | SFT-CLSNWPL-100G-3Y [1] | 2

1 – The 10 GbE and 40 GbE data switch options require Cumulus OS on all switches in the solution. The Cumulus OS licenses for both the data and management switches are obtained through Supermicro using the provided SKUs.

2 – The 25 GbE Data switch option requires the Supermicro (SMIS) OS for all switches in the solution. The SMIS OS for the management switch is obtained using the provided SKU. The SMIS OS for the data switch is included with the switch.

3 – Cumulus Linux is a powerful open network operating system that allows you to automate, customize and scale:
www.cumulusnetworks.com/products/cumulus-linux/
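The three networking options can also be captured as a small lookup table, which makes it easier to see what changes between them. The sketch below is illustrative only: the SKUs and quantities are copied from the table above, while the dictionary layout and the `network_bom` helper are hypothetical names introduced for this example.

```python
# The management switch is the same for every option.
MGMT_SWITCH = "SSE-G3648BR"

# Per-option SKUs, copied from the networking options table.
NETWORK_OPTIONS = {
    "10GbE": {
        "os": "Cumulus",
        "data_switch": "SSE-X3648SR",
        "nic": "AOC-STGF-I2S-O",
        "mgmt_license": "SFT-CLSNWPL-1G-3Y",
        "data_license": "SFT-CLSNWPL-10G-3Y",
    },
    "25GbE": {
        "os": "SMIS",
        "data_switch": "SSE-F3548SR",
        "nic": "AOC-S25G-M2S",
        "mgmt_license": "SFT-SMCPL1G",
        "data_license": None,  # SMIS OS is included with the data switch
    },
    "40GbE": {
        "os": "Cumulus",
        "data_switch": "SSE-C3632SR",
        "nic": "AOC-S40GI2Q",
        "mgmt_license": "SFT-CLSNWPL-1G-3Y",
        "data_license": "SFT-CLSNWPL-100G-3Y",
    },
}

INFRA_NICS = 6   # 2 per node x 3 infrastructure nodes
CLOUD_NICS = 12  # 2 per node x 6 cloud nodes


def network_bom(option):
    """Return (SKU, quantity) pairs for one networking option."""
    cfg = NETWORK_OPTIONS[option]
    lines = [
        (MGMT_SWITCH, 2),
        (cfg["data_switch"], 2),
        # Infrastructure and cloud nodes use the same NIC SKU.
        (cfg["nic"], INFRA_NICS + CLOUD_NICS),
        (cfg["mgmt_license"], 2),
    ]
    # Only the Cumulus options need a separate data switch license.
    if cfg["data_license"] is not None:
        lines.append((cfg["data_license"], 2))
    return lines


if __name__ == "__main__":
    for sku, qty in network_bom("25GbE"):
        print(f"{qty:>2} x {sku}")
```

Because the operating system must match across all switches in a given option, keeping the management and data licenses in the same record avoids mixing Cumulus and SMIS SKUs by accident.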

Additional Services that can be purchased

Additional Services
Supermicro Onsite Integration Service**
Ubuntu Advantage Server - Essential / Standard / Advanced Server
Ubuntu Advantage Professional Service
Ubuntu BootStack Program
Ubuntu BootStack Professional Services
Ubuntu Advantage Travel & Expenses
Ubuntu BootStack Ceph / Swift Storage Add-on

** Requires SOW Onsite Survey; Onsite logistics; Racking & Cabling (with 3rd party switches); Racking & Cabling Engineering Drawing

Network Component Details

SSE-X3648SR (Cluster Role: Data Switch, 10 GbE option)
  • 48x 10Gb Ethernet ports - SFP+
  • 6x 40 Gb Ethernet ports - QSFP+
  • RJ-45 (for console cable)
  • RJ-45 1G Ethernet Management Port
  • USB
  • Switching Capacity: 1440 Gbps
  • Wire-speed Layer 3 Routing
  • 1:1 Non-blocking connectivity
  • Regular Airflow option available
  • Dual redundant hot-swappable power supplies
  • 1U form factor
SSE-F3548SR (Cluster Role: Data Switch, 25 GbE option)
  • 48x 25Gb Ethernet ports - SFP28
  • 6x 100Gb Ethernet ports - QSFP28
  • RJ-45 (for console cable)
  • RJ-45 1G Ethernet Management Port
  • USB
  • Switching Capacity: 3.6 Tbps
  • 1:1 Non-blocking connectivity
  • Regular Airflow option available
  • Dual redundant hot-swappable power supplies
  • 1U form factor
SSE-C3632SR (Cluster Role: Data Switch, 40 GbE option)
  • 32x Ethernet QSFP28 ports – either 40Gbps or 100Gbps
  • 1x 10Gb Ethernet SFP+ port
  • RJ-45 Gb Ethernet management port
  • RJ-45 serial console
  • Type A USB 2.0 port
  • Full Duplex 3.2Tbps Switching Capacity
  • Regular Airflow option available
  • Dual redundant hot-swappable power supplies
  • 1U form factor
SSE-G3648BR (Cluster Role: Management Switch)
  • 48x 1Gbps Ethernet RJ45 ports
  • 4x 10Gbps Ethernet SFP+ ports
  • RJ-45 Gb Ethernet management port (for console cable)
  • Type A USB 2.0 port
  • Aggregated switching Capacity - 176 Gbps
  • Non-blocking, wire-speed Layer 3 Routing
  • Regular Airflow option available
  • Second (redundant) hot-swappable power supply - optional
  • 1U form factor
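The switching capacities listed above can be cross-checked against the port counts. The sketch below sums the port bandwidth and doubles it to count both directions. This formula is an assumption about how the figures were derived, not something the page states, and the function name is hypothetical. It reproduces the 1440 Gbps, 3.6 Tbps, and 176 Gbps figures. The 3.2 Tbps figure for the 32-port switch matches the one-way sum of its 32 ports at 100 Gbps, with the single 10Gb SFP+ port left out.

```python
def switching_capacity_gbps(port_groups, full_duplex=True):
    """Sum port bandwidth in Gbps; double it when counting both directions.

    port_groups is a list of (port_count, speed_gbps) pairs.
    """
    one_way = sum(count * speed for count, speed in port_groups)
    return one_way * 2 if full_duplex else one_way


if __name__ == "__main__":
    # 10 GbE data switch: 48 x 10G SFP+ and 6 x 40G QSFP+
    print(switching_capacity_gbps([(48, 10), (6, 40)]))    # 1440
    # 25 GbE data switch: 48 x 25G SFP28 and 6 x 100G QSFP28
    print(switching_capacity_gbps([(48, 25), (6, 100)]))   # 3600
    # Management switch: 48 x 1G RJ45 and 4 x 10G SFP+
    print(switching_capacity_gbps([(48, 1), (4, 10)]))     # 176
    # 40 GbE data switch: 32 x QSFP28 at 100G, one-way sum only
    print(switching_capacity_gbps([(32, 100)], full_duplex=False))  # 3200
```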

Certified Nodes for Canonical Machine Learning Solution

(Cluster Role: Infrastructure Node)
  • 28 (2 * 14) Cores
  • 384 (12 * 32) GB RAM
  • 12 TB (2x 6TB) SATA HDD
  • Intel DC P4510 500GB NVMe PCIe 3.1 (Cache)
  • 2 RJ45 10GBase-T Ethernet ports
(Cluster Role: Cloud Node)
  • 36 (2 * 18) Cores
  • 512 (16 * 32) GB RAM
  • 8TB (2x 4TB), SATA HDD (OS)
  • Intel DC P4610 1.6TB NVMe PCIe 3.1 (Cache)
  • 2x NVIDIA Tesla V100 GPUs
  • Make sure to add data disks, GPUs and "NICs for data" separately
(Cluster Role: Cloud Node)
  • 36 (2 * 18) Cores
  • 512 (16 * 32) GB RAM
  • 8TB (2x 4TB), SATA HDD (OS)
  • Intel DC P4610 1.6TB NVMe PCIe 3.1 (Cache)
  • 2x NVIDIA Tesla V100 GPUs
  • 4 RJ45 10GBase-T Ethernet ports 
  • Make sure to add data disks, GPUs and "NICs for data" separately