Micro Benchmarks

The following benchmarks will be run on the first day of the competition.

HPC Challenge

HPC Challenge (HPCC) will be used to score the benchmark portion of the competition. A team may execute HPCC as many times as desired during the setup and benchmarking phase, but the HPCC run submitted for scoring will define the hardware baseline for the rest of the competition. In other words, after submitting this benchmark, the same system configuration should be used for the rest of the competition.

The rules on code modification described in the Rules section of the HPCC web page apply.

High Performance LINPACK (HPL)

Teams will compete on the High Performance LINPACK (HPL) benchmark, with the “Highest LINPACK” award going to the team that submits the highest HPL score. Additional, independent HPL runs (outside the submitted HPCC run) may be considered for the “Highest LINPACK” award if they are performed with exactly the same hardware powered on as in the HPCC run submitted for scoring. While eligible for the Highest LINPACK award, independent HPL runs will NOT count toward the team’s overall score. The HPL run must be submitted on the first day of the competition.

The teams may use any HPL binary.


• The teams need to declare which binary they are going to run (by June 5) and provide the binary information plus the NVIDIA contact (or other contact) who provided them the binary.
• Due to Open MPI issue #3003 (a timer bug), we advise all student teams to avoid Open MPI versions 1.10.3 through 1.10.6. This bug can cause HPL to report calculated results better than the theoretical peak.
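For reference, an HPL run is driven by an HPL.dat input file. The sketch below uses hypothetical values; the problem size (Ns), block size (NBs), and P x Q process grid must be tuned to the team's own system:

```
HPLinpack benchmark input file
Innovative Computing Laboratory, University of Tennessee
HPL.out      output file name (if any)
6            device out (6=stdout, 7=stderr, file)
1            # of problems sizes (N)
100000       Ns
1            # of NBs
384          NBs
0            PMAP process mapping (0=Row-, 1=Column-major)
1            # of process grids (P x Q)
4            Ps
8            Qs
16.0         threshold
```

A double-precision N x N matrix occupies 8*N^2 bytes, so N is typically chosen so the matrix fills most of the available memory while leaving room for the OS and libraries.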


High Performance Conjugate Gradient (HPCG)

HPCG stands for High Performance Conjugate Gradient. It is a self-contained benchmark that generates and solves a synthetic 3D sparse linear system using a local symmetric Gauss-Seidel preconditioned conjugate gradient method, performing a fixed number of iterations in double-precision (64-bit) floating point. Integer arrays have global and local scope (global indices are unique across the entire distributed-memory system; local indices are unique within a memory image). The reference implementation is written in C++ with MPI and OpenMP support. HPCG will be run on the first day of the competition, and the official run must be at least 30 minutes long.

The teams may use any HPCG binary.

Notes: The teams need to declare which binary they are going to run (by June 10) and provide the binary information plus the NVIDIA contact (or other contact) who provided them the binary.
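The run length is controlled by the hpcg.dat input file. A minimal sketch (the problem dimensions shown are illustrative; the first two lines are free-form comments):

```
HPCG benchmark input file
Sandia National Laboratories; University of Tennessee, Knoxville
104 104 104
1800
```

The third line gives the local grid dimensions nx, ny, nz per process, and the fourth gives the target run time in seconds; 1800 seconds corresponds to the 30-minute minimum for the official run.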

HPC Applications



Weather Research and Forecasting (WRF)

The Weather Research and Forecasting (WRF) Model is a mesoscale numerical weather prediction system designed for both atmospheric research and operational forecasting applications. The model serves a wide range of meteorological applications across scales from tens of meters to thousands of kilometers. The effort to develop WRF began in the late 1990s and was a collaborative partnership of the National Center for Atmospheric Research (NCAR), the National Oceanic and Atmospheric Administration (represented by the National Centers for Environmental Prediction (NCEP) and the Earth System Research Laboratory), the U.S. Air Force, the Naval Research Laboratory, the University of Oklahoma, and the Federal Aviation Administration (FAA).

Like most HPC applications, WRF consumes significant time and computational resources. On one hand, running WRF (or other weather applications) uniformly at high resolution over a large domain is expensive in both time and resources; on the other hand, running WRF at high resolution on a single small domain may not be accurate. For that reason, the capability to work with nested domains was developed for WRF. Using that capability, one can run WRF on a small domain at high resolution, nested within a larger domain at lower resolution, and so on. For the 2021 ISC SCC, the teams will be running WRF using three nested domains.
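Nesting is configured in the &domains section of WRF's namelist.input file. A minimal sketch with three domains (all grid sizes, resolutions, and nest positions below are hypothetical, not the competition dataset):

```
&domains
 max_dom             = 3,
 e_we                = 100, 112, 136,
 e_sn                = 100, 112, 136,
 dx                  = 27000, 9000, 3000,
 dy                  = 27000, 9000, 3000,
 parent_id           = 1, 1, 2,
 parent_grid_ratio   = 1, 3, 3,
 i_parent_start      = 1, 30, 35,
 j_parent_start      = 1, 30, 35,
/
```

Each column corresponds to one domain: domain 2 is nested in domain 1 and domain 3 in domain 2, with each nest refining its parent's grid spacing by a factor of 3 (27 km to 9 km to 3 km here).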

To get started: click here.


GPAW

GPAW is an open source program package for quantum mechanical atomistic simulations. It is based on density-functional theory (DFT) and the projector-augmented wave (PAW) method, and it also includes more advanced models such as time-dependent density-functional theory and the GW approximation. Physical quantities that can be studied include equilibrium geometries of molecules, crystals, surfaces, and various nanostructures, as well as magnetic properties, formation energies, and optical spectra, to name a few.

GPAW supports several different basis sets for discretizing the underlying equations: real-space grids, plane waves, and localized atom-centered functions. GPAW is implemented in the combination of Python and C programming languages, and it is parallelized with MPI and OpenMP. Depending on the input data set, GPAW can scale to thousands of CPUs.

Main Website:
To get started: click here.

MetaHipMer 2.0

MetaHipMer is a *de novo* metagenome assembler. It takes short fragmented DNA read sequences and assembles them into longer contiguous sequences. The latest version is MetaHipMer 2.0 (MHM2), which is written entirely in UPC++, a C++ library that supports Partitioned Global Address Space (PGAS) programming. UPC++ uses the GASNet-Ex communication layer to deliver low-overhead, fine-grained communication, including Remote Memory Access (RMA) and Remote Procedure Calls (RPCs) within an asynchronous framework. MHM2 scales up to thousands of compute nodes on supercomputers such as NERSC Cori and OLCF Summit.

MHM2 differs from most other scientific HPC applications in that it performs almost no floating-point computation, mostly carries out string manipulation, and is very memory intensive. It has multiple computation phases that utilize different communication and computation patterns. The most common patterns are related to distributed hash tables, with random point-to-point communication and no locality. Some phases send many small messages and are latency sensitive, whereas other phases are more bandwidth bound. Because of this latency sensitivity, MHM2 will perform poorly on systems with slow interconnects, such as Ethernet.

More information about how MetaHipMer works can be read in the following papers:

- E. Georganas et al., "Extreme Scale De Novo Metagenome Assembly," Supercomputing (2018). Download the paper here.
- S. Hofmeyr, R. Egan, E. Georganas, et al., "Terabase-scale metagenome coassembly with MetaHipMer," Sci Rep 10, 10689 (2020).

For MHM2 code and User Manual see here.

To get started click here.


LAMMPS

Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS) is a classical molecular dynamics code with a focus on materials modeling. LAMMPS has potentials for solid-state materials (metals, semiconductors) and soft matter (biomolecules, polymers) and coarse-grained or mesoscopic systems. It can be used to model atoms or, more generically, as a parallel particle simulator at the atomic, meso, or continuum scale. LAMMPS runs in parallel using MPI. Many of its models have versions that provide accelerated performance on CPUs and GPUs.
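LAMMPS is driven by a plain-text input script. A minimal sketch, based on the classic Lennard-Jones melt starter example (not the competition dataset):

```
# Hypothetical 3d Lennard-Jones melt input script
units        lj
atom_style   atomic

lattice      fcc 0.8442
region       box block 0 10 0 10 0 10
create_box   1 box
create_atoms 1 box
mass         1 1.0

velocity     all create 1.44 87287

pair_style   lj/cut 2.5
pair_coeff   1 1 1.0 1.0 2.5

fix          1 all nve
thermo       50
run          250
```

A script like this would typically be launched under MPI, e.g. `mpirun -np 4 lmp -in in.melt` (the executable name varies by build, e.g. `lmp`, `lmp_mpi`).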

To get started click here.

Coding Challenge

The main target of this year's coding challenge is to analyze MPI_Alltoallv patterns within the application. MPI_Alltoallv is an MPI collective in which each rank sends at most one message to every other rank, similar to MPI_Alltoall, but the size of each message may differ. In this task, you will use an alltoallv collective profiler to generate MPI traces while running applications. After the run, you will use a web UI to access the results and present them to the judges.

Details can be found here.

Here is our Coding Challenge overview by HPC architect Geoffroy Vallee: