The HPC Advisory Council supports a variety of end-user and OEM-based conferences throughout the year, regularly submits multi-vendor technical presentations for end-user audiences, and provides technical support for demonstrations.
Presenters: Gilad Shainer (HPC Advisory Council), Tong Liu (Mellanox Technologies), Jeffrey Layton (Dell), Joshua Mora (AMD)
Abstract: One of the main advantages of HPC clusters is the flexibility and efficiency they bring to their users. With the increasing number of applications being served by HPC systems, new systems need to serve multiple users and multiple applications. Traditional HPC systems typically served a single application at a given time, but to maintain maximum flexibility a new concept of "HPC as a Service" (HPCaaS) has been developed. HPCaaS includes the capability of using clustered servers and storage as resource pools, a web interface for users to submit their job requests, and a smart scheduling mechanism that can schedule multiple different applications simultaneously on a given cluster, taking the different application characteristics into consideration for maximum overall productivity. The paper reviews the concept of HPCaaS and explores a smart scheduling algorithm for a subset of bioscience applications. In this paper we will show that smart scheduling can accommodate multiple applications and multiple jobs simultaneously while increasing the overall system productivity and efficiency.
Presentations:
Presenters: Gilad Shainer (Mellanox Technologies), Dr. Thomas Lippert (Jülich), Axel Koehler (Sun Microsystems), David Scott (Intel), Hugo R. Falter (ParTec)
Abstract: The JuRoPa II, one of the leading European PetaScale supercomputer projects, is currently being constructed at the Forschungszentrum Jülich, one of the largest interdisciplinary research centers in Europe, in the German state of North Rhine-Westphalia. The new systems are being built through an innovative alliance between Mellanox, Bull, Intel, Sun Microsystems, ParTec, and the Jülich Supercomputing Centre, the first such collaboration in the world. This new 'best-of-breed' system, one of Europe's most powerful, will support advanced research in many areas such as health, information, environment, and energy. It consists of two closely coupled clusters, JuRoPA, with more than 200 Teraflop/s performance, and HPC-FF, with more than 100 Teraflop/s. The latter will be dedicated to the European fusion research community. The session will introduce the project and the initiative, how it will affect future supercomputer systems, and how it will contribute to Petascale scalable software development.
Presentations:
Presenters: Gilad Shainer (Mellanox Technologies), William Lu (Platform Computing), Joshua Mora (AMD), Peter Lillian (Dell)
Presentations:
Presenters: Gilad Shainer (Mellanox), Tong Liu (Mellanox), Jacob Liberman (Dell), Jeff Layton (Dell), Onur Celebioglu (Dell), Scot A. Schultz (AMD), Joshua Mora (AMD), David Cownie (AMD)
Abstract: From concept to engineering, and from design to test and manufacturing, the automotive industry relies on powerful virtual development solutions. CFD and crash simulations are performed in an effort to secure quality and speed up the development process. The recent trends in cluster environments, such as multi-core CPUs and interconnect consolidation, are changing the dynamics of cluster-based simulations. Software applications are being reshaped for higher parallelism, and hardware configurations are being reshaped to meet the new demands on scalability and efficiency. Both "productivity-aware" and "power-aware" have become metrics for new system designs and implementations. The system architecture needs to be capable of providing higher productivity for present and future simulations while maintaining low power consumption. In this paper we cover best practices for achieving maximum productivity while sustaining or reducing power consumption on LS-DYNA simulations.
Presentations:
Presenters: Gilad Shainer (HPC Advisory Council chairman), Jennifer Koerv (AMD), Donnie Bell (Dell), Sharan Kalwani (GM), Lynn Lewis (Microsoft), Stan Posey (Panasas), Lee Porter (ParTec), Arend Dittmer (Penguin Computing)
Abstract: The presentation introduces the HPC Advisory Council mission, activities and future plans.
Presentations:
Presenters: Gilad Shainer & Tong Liu (Mellanox Technologies), Joshua Mora (AMD), Jacob Liberman (Dell), Owen Brazell (Schlumberger)
Abstract: Schlumberger's ECLIPSE Reservoir Engineering software is a widely used oil and gas reservoir numerical simulation suite. Like many other High Performance Computing (HPC) applications, ECLIPSE runs in a complex ecosystem of hardware and software components. Maximizing ECLIPSE performance requires a deep understanding of how each component impacts the overall solution. However, as new hardware and software come to market, design decisions are often based on assumptions or projections rather than empirical testing. This presentation removes the guesswork from cluster design for ECLIPSE by providing best practices for increased performance and productivity. It includes scalability testing, interconnect performance comparisons, job placement strategies, and power efficiency considerations. It also introduces an ongoing collaboration between Dell, AMD, and Mellanox dedicated to publishing timely application-specific best practices and performance data.
Presentations:
Presenters: Gilad Shainer & Tong Liu (Mellanox Technologies), Joshua Mora (AMD), Jacob Liberman (Dell), John Michalakes (National Center for Atmospheric Research)
Abstract: The Weather Research and Forecasting (WRF) Model is a fully functioning modeling system for atmospheric research and operational weather prediction communities. With an emphasis on efficiency, portability, maintainability, scalability and productivity, WRF has been successfully deployed over the years on a wide variety of HPC clustered compute nodes connected with high-speed interconnects, currently the most used system architecture for high-performance computing. As such, understanding WRF's dependency on the various clustering elements, such as the CPU, interconnects and the software libraries, is crucial for enabling efficient predictions and high productivity. Our results identify WRF's communication-sensitive points and demonstrate WRF's dependency on high-speed networks and fast CPU-to-CPU communication. Both factors are critical to maintaining scalability and increasing productivity when adding cluster nodes. We conclude with specific recommendations for improving WRF performance, scalability, and productivity as measured in jobs per day. Because proprietary hardware and software can quickly erode cluster architecture's favorable economics, we will restrict our investigation to standards-based hardware and open source software readily available to typical research institutions.
Presentations:
Abstract: "Ranger" is the largest computing system in the world for open science research. Located at the Texas Advanced Computing Center, Ranger serves NSF TeraGrid researchers and academic institutions. It is the most powerful commodity-based system that does not utilize specialized accelerators. It uses only off-the-shelf CPUs and InfiniBand interconnect technology (Sun Data Center Switch 3456) to provide the 579 Teraflops of compute power, and therefore does not require new application development. The system consists of more than 15K sockets and centralized networking infrastructure that connects all the sockets in a full fat-tree configuration. As we enter the exascale computing era, the number of expected sockets will grow by an order of magnitude and the networking infrastructure could evolve into a mesh or hybrid one. Lessons learned from Ranger will provide a solid foundation for building future extreme scale computing infrastructures and will be the main focus of this session.
Presentations:
Abstract: Future HPC systems will span tens of thousands of nodes, all connected together via high-speed connectivity solutions to form multi-Petaflop clusters. With the growing size of clusters and the growing number of CPU cores per cluster node, not only do the traditional demands on the cluster interconnect increase dramatically, but new demands are also introduced. The interconnect needs to provide balanced throughput and latency to address the IO requirements of each CPU core, while maintaining high network utilization. Moreover, the overall number of communication links grows with the size of the cluster, and link data errors have become a growing concern for large-scale platforms, as they tend to have an adverse effect on performance. The session will drive a discussion on the needed communication capabilities, static and dynamic routing, congestion control and handling network errors. The session will also present models and simulation results on the new adaptive routing implementation for InfiniBand networks.
Presentations:
Abstract: Quad Data Rate (QDR) InfiniBand delivers a new level of performance that complements today's multi-core compute nodes. Appro has combined these technologies in the Appro Xtreme-X1 supercomputing clusters providing a reliable and scalable architecture that unites high performance capacity computing with superior fault-tolerant capability computing. We will discuss benchmarks and our experience in building the first Intel Cluster-Ready certified 40Gb/s InfiniBand cluster utilizing redundant Mellanox ConnectX IB 40Gb/s QDR running on Intel Xeon Quad-Core processors in a redundant and manageable framework. Representatives from Intel, Appro and Mellanox will be on hand for questions.
Presentations:
For questions or comments, please contact info@hpcadvisorycouncil.com