Introduction

Hardware and Configurations of the Cluster

Hardware Overview of the Compute Sections

The cluster consists of three sections:

MPI section for MPI-intensive applications

MEM section for applications that need a lot of memory

ACC section for applications that use accelerators

The whole system is located in the HPC building (L5|08) on the Lichtwiese campus and was built in two stages (the later one with an extension).

Phase I of Lichtenberg II has been operational since December 2020 (in testing since September 2020).
Phase II of Lichtenberg I was operational from February 2015 (expanded at the end of 2015) until its decommissioning in May 2021.
Phase I of Lichtenberg I was in operation from fall 2013 until its decommissioning in April 2020.

Hardware of Phase I Lichtenberg II

643 Compute nodes and 8 Login nodes

  • Processors: in total, ~4.5 PFlop/s computing power (Double Precision, peak – theoretical); see the worked check after this list
    • Realistically achieved: approx. 3.03 PFlop/s computing power with the Linpack benchmark
  • Accelerators: overall 424 TFlop/s computing power (Double Precision/FP64, peak – theoretical)
    and ~6.8 Tensor PFlop/s (Half Precision/FP16)
  • Memory: in total, ~250 TByte main memory
  • All compute and accelerator nodes in one large island:
    • MPI section: 630 nodes (each with 96 CPU cores and 384 GByte main memory)
    • ACC section: 8 nodes (each with 96 CPU cores and 384 GByte main memory)
      • Status: currently, only 2 GPU cards per node
      • 4 nodes, each with 4x Nvidia V100 GPUs
      • 4 nodes, each with 4x Nvidia A100 GPUs
    • MEM section: 2 nodes (each with 96 CPU cores and 1536 GByte main memory)
  • NVIDIA DGX A100
    • Status: in preparation
    • 3 nodes (each with 128 CPU cores, 1024 GByte main memory)
      • 8x NVIDIA A100 Tensor Core GPUs (320 GByte total)
      • Local storage: ca. 19 TByte (Flash, NVME)
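
As a rough plausibility check of the figures above: the theoretical peak is simply nodes × cores per node × clock rate × FLOPs per cycle. In the Python sketch below, the node and core counts are taken from the list, while the clock rate and the FLOPs-per-cycle value are assumptions (typical for an AVX-512 CPU with two FMA units) and are not stated in this document.

  # Back-of-the-envelope check of the ~4.5 PFlop/s peak figure.
  # Node and core counts are from the list above; clock rate and FLOPs
  # per cycle are assumed values and not taken from this document.
  nodes = 630 + 8 + 2              # MPI + ACC + MEM compute nodes
  cores_per_node = 96
  clock_hz = 2.3e9                 # assumed base clock
  flops_per_cycle = 32             # assumed: AVX-512, two FMA units, FP64

  peak = nodes * cores_per_node * clock_hz * flops_per_cycle
  print(f"Theoretical peak: {peak / 1e15:.2f} PFlop/s")   # ~4.5 PFlop/s

  linpack = 3.03e15
  print(f"Linpack efficiency: {linpack / peak:.0%}")      # roughly two thirds

The gap between peak and Linpack is expected: the peak assumes every core executes fused multiply-add vector instructions on every cycle, which no real workload sustains.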

Hardware of Phase II Lichtenberg I (in operation 2015–2021)

632 Compute nodes and 8 Login nodes (decommissioned on 2021-05-31)

  • Processors: in total, ~512 TFlop/s computing power (Double Precision, peak – theoretical)
    • Realistically achieved: approx. 460 TFlop/s with the Linpack benchmark
  • Accelerators: overall 11.54 TFlop/s computing power (Double Precision, peak – theoretical)
  • Memory: in total, ~44 TByte main memory
  • Compute nodes grouped in 18 islands:
    • 1x MPI island with 84 nodes (in total: 2016 CPU cores, 5376 GByte main memory)
    • 16x MPI islands, each with 32 nodes (in total: 768 CPU cores and 2048 GByte main memory per island)
    • 1x ACC island with 32 nodes (ACC-N) – 3x with GPU accelerators (29x without)

Hardware of Phase I Lichtenberg I (in operation 2013–2020)

780 Compute nodes and 4 Login nodes (decommissioned on 2020-04-27)

  • Processors: in total, ~261 TFlop/s computing power (Double Precision, peak – theoretical)
    • Realistically achieved: approx. 216 TFlop/s computing power with the Linpack benchmark
  • Accelerators: overall ~168 TFlop/s computing power (Double Precision, peak – theoretical)
    • Realistically achieved: approx. 119 TFlop/s computing power with the Linpack benchmark
  • Memory: overall ~32 TByte main memory
  • The computing nodes are subdivided into 19 islands:
    • 1x MPI island with 162 nodes (in total: 2592 CPU cores, 5184 GByte main memory)
    • 2x MPI islands, each with 32 nodes (in total: 512 CPU cores and 2048 GByte main memory per island)
    • 15x MPI islands, each with 32 nodes (in total: 512 CPU cores and 1024 GByte main memory per island)
    • 1x ACC island with 44 nodes (ACC-G), 26 nodes (ACC-M), and 4 nodes (MEM)

Filesystems / Storage

The build-up of Lichtenberg II began with the acquisition of the first stage of the new storage system in 2018. When it became operational in 2019, this system superseded the old Lichtenberg I storage system. However, its full capacity and performance will only be reached with the further stages of Lichtenberg II.

The high-speed parallel file system is based on “IBM Spectrum Scale” (formerly known as the General Parallel File System, GPFS), well known for its flexibility.

With the last expansion of 33% in capacity and bandwidth, the current system provides 4 PByte of disk storage, using SSDs (solid state disks) for the metadata and ~1,300 conventional magnetic disks for the file data itself.

One of the most notable features of the storage system is that all files and directories are always distributed over all available disks and SSDs. Unlike before, there is thus almost no performance difference any longer between, e.g., /work/scratch and /home. In addition, any expansion in storage capacity also brings a substantial gain in storage performance.

We currently see an aggregate bandwidth of up to ~120 GByte/s in reading and ~106 GByte/s in writing.
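
A small illustration of why capacity and bandwidth grow together when everything is striped over all disks: the aggregate bandwidth is roughly the per-disk contribution times the number of disks. The disk count and the measured read rate in the sketch come from the text above; the assumption of perfectly linear scaling (ignoring controllers and the network) is of course an idealization.

  # Illustration, not a measurement: with uniform striping every disk
  # contributes to every file, so aggregate bandwidth scales with the
  # number of disks (idealized, ignoring controller and network limits).
  disks = 1300                # current number of magnetic disks (see above)
  aggregate_read = 120.0      # GByte/s, currently observed aggregate read rate

  per_disk = aggregate_read / disks
  print(f"Effective read contribution per disk: ~{per_disk * 1000:.0f} MByte/s")

  # More disks for more capacity therefore also mean more bandwidth:
  for growth in (1.0, 1.33, 2.0):
      print(f"{growth:.2f}x the disks -> ~{aggregate_read * growth:.0f} GByte/s aggregate read")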

Usage of computing nodes

  • Each node can be used as-is, i.e. “single node”, with either one large or several smaller jobs
  • Several nodes can be used concurrently via inter-process communication (MPI) over InfiniBand (see the minimal sketch after this list):
    • Lichtenberg 1
      • Each island by itself (i.e. 32 or up to 161 nodes in one MPI job)
      • Across islands with reduced inter-island bandwidth, i.e. between nodes of distinct islands (on request)
    • Lichtenberg 2
      • One big island – all MPI compute nodes can reach each other at almost the same speed and latency (non-blocking)
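
As a deliberately generic illustration of a multi-node MPI job, here is a minimal Python sketch using mpi4py. It assumes an MPI installation and the mpi4py package are available (neither is described in this document) and says nothing Lichtenberg-specific; how the processes are launched and spread over the nodes depends on the local MPI and batch environment.

  # Minimal mpi4py sketch: each process reports its rank and the node it
  # runs on; ranks on different nodes communicate over the InfiniBand
  # fabric described above. Assumes mpi4py is available (an assumption,
  # not taken from this document).
  from mpi4py import MPI

  comm = MPI.COMM_WORLD
  rank = comm.Get_rank()            # index of this process within the job
  size = comm.Get_size()            # total number of MPI processes
  node = MPI.Get_processor_name()   # hostname of the node running this rank

  print(f"Rank {rank} of {size} running on {node}")

  # A simple collective operation spanning all participating nodes:
  total = comm.allreduce(rank, op=MPI.SUM)
  if rank == 0:
      print(f"Sum of all ranks: {total}")

Whether such a job stays within one node, one island, or spans islands is determined by how its processes are placed on the nodes, which is exactly the distinction made in the list above.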