HPC Advisory Advanced Cloud Subgroup

Cloud Computing Will Usher in a New Era of Science Discovery
Computing “in a cloud” typically refers to a hosted computational environment, local or remote, that provides elastic compute and storage services to users on demand. The current usage model of cloud environments is aimed at computational science. Future clouds can serve as environments for distributed science, allowing researchers and engineers to share their data with peers around the globe and allowing results that were expensive to obtain to be reused in further research projects and scientific discoveries.

To allow the shift to the fourth mode of “science discovery”, those cloud environments will need to do more than share the data created by computational science and by various observations. They will also need to provide cost-effective high-performance computing capabilities, similar to those of today’s leading supercomputers, in order to analyze the data flood rapidly and effectively. Moreover, an important criterion for clouds is fast provisioning of cloud resources, both compute and storage, so that they can serve many users and many different analyses and can suspend tasks and resume them quickly. Reliability is another concern: clouds need to be “self-healing”, so that failing components can be replaced by spares or on-demand resources to guarantee constant access and resource availability.

Case: From Computational Science to Science Discovery: The Next Computing Landscape

HPC as a Service
One of the main advantages of HPC clusters is the flexibility and efficiency they bring to their users. With the increase in the number of applications being served by HPC systems, new systems need to serve multiple users and multiple applications. Traditional HPC systems typically served a single application at a given time, but to maintain high flexibility, a new concept of HPC as a Service (HPCaaS) has been developed. The HPC Advisory Council has been one of the first organizations to perform research activities and to provide guidelines for OEMs and end-users for developing HPCaaS clusters.

Smart scheduling strategies are essential for HPCaaS to host multiple applications simultaneously while maintaining or even increasing total system productivity.

Case: Scheduling Strategies for HPC as a Service (HPCaaS) for Bio-Science Applications - PDF

HPC in a Cloud
In the past, high-performance computing has not been a good candidate for cloud computing because it requires tight integration between server nodes via low-latency interconnects. The performance overhead associated with host virtualization, a prerequisite technology for migrating local applications to the cloud, quickly erodes application scalability and efficiency in an HPC context. Furthermore, HPC has been slow to adopt virtualization, not only because of the performance overhead, but also because HPC servers generally run fully utilized and therefore do not benefit from consolidation. The performance overhead inherent in virtualization has, in turn, slowed the adoption of low-latency interconnects by cloud providers as part of their service offerings. Instead, cloud providers have focused primarily on applications that are not mission-critical or performance-demanding.

Image Courtesy: 451 Group

The HPC Advisory Council performs studies to explore and assess the performance overhead of high-performance applications in cloud environments. In those studies, the HPC Advisory Council provides a deep analysis of the performance overhead associated with running high-performance applications over high-speed networks in a cloud environment, and addresses the need for virtualization in HPC clouds.

Case: HPC in a Cloud - PDF