About The Competition

The Remote Direct Memory Access (RDMA) over InfiniBand specification is defined and maintained by the IBTA organization (https://www.infinibandta.org) to enable the efficient transfer of data between servers, and between servers and storage, without the involvement of the host CPU in the data path. IBTA has also defined RDMA over Ethernet, commonly referred to as RoCE (RDMA over Converged Ethernet), so RDMA can be used over either InfiniBand or Ethernet.

Comparison Between RDMA and TCP/IP

RDMA is widely used in high-performance computing (HPC), artificial intelligence (AI), cloud and edge computing, telco, financial services, and more. It gives compute- and data-intensive applications high-performance, scalable data transfers with ultra-low latency and low CPU overhead.

The RDMA programming tutorial introduces RDMA verbs programming, covering the verbs and their inputs, outputs, descriptions, and functionality as exposed through the operating system programming interface. Programming with verbs allows for customizing and optimizing the RDMA-aware network.
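To give a feel for the verbs programming model, here is a minimal toy sketch in Python. It is not the libibverbs API: the names `ToyQueuePair`, `WorkRequest`, `post_write`, `nic_progress`, and `poll_completions` are hypothetical and exist only for illustration. The sketch assumes the usual pattern in which an application posts a work request to a queue, the network adapter moves the data, and the application later polls a completion queue. Real verbs code also involves steps such as device discovery, memory registration, and connection setup, all omitted here.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class WorkRequest:
    wr_id: int          # caller-chosen id, echoed back on completion
    local_offset: int   # where to read in the local buffer
    remote_offset: int  # where to write in the remote buffer
    length: int         # number of bytes to transfer


class ToyQueuePair:
    """Toy stand-in for a connected queue pair (hypothetical, not libibverbs)."""

    def __init__(self, local_mem: bytearray, remote_mem: bytearray):
        self.local_mem = local_mem
        self.remote_mem = remote_mem
        self.send_queue = deque()
        self.completion_queue = deque()

    def post_write(self, wr: WorkRequest) -> None:
        # Posting only enqueues a descriptor; no data moves yet.
        if wr.length < 0 or wr.local_offset < 0 or wr.remote_offset < 0:
            raise ValueError("offsets and length must be non-negative")
        if wr.local_offset + wr.length > len(self.local_mem):
            raise ValueError("local range out of bounds")
        if wr.remote_offset + wr.length > len(self.remote_mem):
            raise ValueError("remote range out of bounds")
        self.send_queue.append(wr)

    def nic_progress(self) -> None:
        # Stands in for the adapter: it copies the bytes itself, so no
        # application code on the remote side takes part in the transfer.
        while self.send_queue:
            wr = self.send_queue.popleft()
            src = self.local_mem[wr.local_offset:wr.local_offset + wr.length]
            self.remote_mem[wr.remote_offset:wr.remote_offset + wr.length] = src
            self.completion_queue.append(wr.wr_id)

    def poll_completions(self) -> list:
        # Drain and return the ids of all finished work requests.
        done = list(self.completion_queue)
        self.completion_queue.clear()
        return done


if __name__ == "__main__":
    local = bytearray(b"hello rdma")
    remote = bytearray(16)
    qp = ToyQueuePair(local, remote)
    qp.post_write(WorkRequest(wr_id=7, local_offset=0, remote_offset=4, length=5))
    qp.nic_progress()
    print(qp.poll_completions(), bytes(remote[4:9]))
```

The point of the split between posting and polling is that the application is free to do other work while the transfer is in flight, which is where the low CPU overhead of RDMA comes from.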

Unified Communication X (UCX) is an open-source, production-grade communication framework developed and maintained by the Unified Communication Framework (UCF) Consortium (https://www.openucx.org). UCX exposes a set of abstract communication primitives that make the best use of available hardware resources and offloads. These include RDMA (InfiniBand and RoCE), TCP, GPUs, shared memory, and network atomic operations. UCX facilitates rapid development by providing a high-level API that masks the low-level details while maintaining high performance and scalability.

UCX Diagram

Unified Collective Communications (UCC) is an open-source project to provide an API and library implementation of collective (group) communication operations for applications in various domains including High-Performance Computing, Artificial Intelligence, Data Center, and I/O (https://www.ucfconsortium.org/projects/ucc/). The goal of UCC is to provide highly performant and scalable collective operations leveraging scalable and topology-aware algorithms, software implementation techniques and In-Network Computing hardware acceleration engines. It collaborates with UCX and utilizes UCX’s highly performant point-to-point communication operations and library utilities.

In-Network Computing plays a critical role in improving application performance under the data-centric computing model. The ideas, design, and implementation of UCC are drawn from experience with multiple collective communication projects and In-Network Computing technologies.
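As an illustration of what a collective operation computes, the following sketch simulates a sum allreduce across several ranks in plain Python, using the well-known recursive-doubling exchange pattern. It does not use UCC or UCX, and it is not a claim about which algorithm UCC selects: the function name `allreduce_recursive_doubling` is hypothetical, the "ranks" are simply list positions, and the rank count is assumed to be a power of two.

```python
from typing import List


def allreduce_recursive_doubling(values: List[float]) -> List[float]:
    """Simulate a sum allreduce where values[r] is the contribution of rank r.

    Returns a list in which every rank holds the global sum.
    Assumes the number of ranks is a power of two.
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("number of ranks must be a power of two")

    current = list(values)
    distance = 1
    while distance < n:
        # In each round, rank r exchanges its partial sum with the partner
        # rank r XOR distance, and both add what they received.
        current = [current[r] + current[r ^ distance] for r in range(n)]
        distance *= 2
    return current


if __name__ == "__main__":
    print(allreduce_recursive_doubling([1, 2, 3, 4]))  # [10, 10, 10, 10]
```

With this pattern, every rank ends up with the full result after log2(n) exchange rounds rather than n - 1, which is one reason the choice of algorithm matters for scalability.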

In this tutorial, you will learn the foundations of RDMA, how to perform RDMA operations, how to use RDMA efficiently in compute- and data-intensive applications, storage applications, and AI applications, and how to use UCX to deploy RDMA.

OSU is one of the first universities to use RDMA in HPC and storage systems; Tsinghua has the distinction of hosting the champion team of the 2018 RDMA Programming Competition; USTC has the distinction of hosting the champion teams of the 2019 and 2020 RDMA Programming Competitions; ICT has used RDMA in a great number of HPC systems in China. All will share how they used RDMA to reach higher application performance and how they built champion RDMA programming teams.

The 2021 Advanced Data Center Networks and RDMA Programming workshop runs for 4.5 days and includes 29 hours of lectures, 6 hours of demo presentations and hands-on programming work, and a 3-day hackathon. Through competition, it aims to provide a comprehensive introduction to advanced network technologies and RDMA programming for efficient server-to-server and server-to-storage communication.