All posts by brian

Platform Computing and Instrumental Extend Strategic Partnership to Advance High Performance Cloud Computing for Government Sector

As U.S. government agencies and departments evaluate the potential cost savings, service level improvements and greater resource utilization offered by various cloud computing models, there is a recognized need for a technology-agnostic platform that can support and integrate legacy, heterogeneous HPC environments while also managing a wide range of hardware, operating systems and virtual machines. In order to maximize prior technology investments, government agencies must invest in technologies that prevent vendor lock-in and that work with multiple types of operating systems.

With that in mind, I wanted to draw your attention to a partnership between Platform and Instrumental to advance high-performance cloud computing for the government sector. The partnership enhances Platform’s global service capabilities and gives users an end-to-end, full-service solution that maximizes the value of Platform’s private cloud management and HPC cloud-enabling software solutions, Platform ISF and Platform ISF Adaptive Cluster. The full press release can be seen here.

Dave Ellis
Principal Technologist
Instrumental, Inc.

HPC in 2010: A look at technology, issues, and opportunities

insideHPC talked with Gilad Shainer, the Chairman of the HPC Advisory Council, about what the organization does, how it has grown, and how it is helping catalyze developments for users and HPC businesses. In addition to covering what the Council has already accomplished, Gilad talks about the new research and focus areas it is kicking off. The Council’s mission is to help everyone, so papers, information, best practices, and more are all available for download at its website.

Video is posted here: http://insidehpc.com/hpc-2010-technology-issues-opportunities/

SC09 – Building the Fastest Networking Demonstration on the Show Floor

Now that SC09 is done, we can look back on the council activities and achievements during the largest HPC show in the world. As we did during SC08 and ISC’09, the council members invested a huge amount of effort and brought together the fastest networking solution – a 120Gb/s InfiniBand network demonstration – as part of SCinet. You can see the networking diagram below, and I would like to thank the following organizations that helped to make this amazing demo a reality: AMD, Avago, Colfax Intl, Dell, HP, IBM, InfiniBand Trade Association, Koi Computers, LSI, Los Alamos National Laboratory, Luxtera, Mellanox Technologies, Microsoft, NVIDIA, Oak Ridge National Laboratory, RAID, Scalable Graphics, SGI, Sun Microsystems, Texas Advanced Computing Center, The University of Utah Center for High-Performance Computing, and Voltaire.

SCinet Network Diagram

One of our council members, Scalable Graphics, helped to create a highly visual demo of a Peugeot car rendered in 3D (see picture below). The demo included an interactive component in which viewers wore glasses that tracked their position and moved the image accordingly.

3D Visualization Demo

Also during the conference, we held our semi-annual meeting, where we reviewed our plans for 2010. To recap, we will build on our successful HPC Advisory Council Workshop in China by hosting four workshops next year: the first, in March, together with the Swiss Supercomputing Center in Switzerland, and the second, in May, as part of the International Supercomputing Conference. More information can be found on the council’s main web pages.

Best regards,

Gilad, HPC Advisory Council Chairman.

Interconnect analysis: InfiniBand and 10GigE in High-Performance Computing

InfiniBand and Ethernet are the leading interconnect solutions for connecting servers and storage systems in high-performance computing and in enterprise (virtualized or not) data centers. Recently, the HPC Advisory Council has put together the most comprehensive database for high-performance computing applications to help users understand the performance, productivity, efficiency and scalability differences between InfiniBand and 10 Gigabit Ethernet.

In summary, a large number of HPC applications need the lowest possible latency or the highest bandwidth for best performance (for example, oil-and-gas and weather-related applications). Some HPC applications are not latency sensitive; gene sequencing and some bioinformatics applications, for example, scale well over TCP-based networks, including GigE and 10GigE. For converged HPC networks, putting HPC message-passing traffic and storage traffic on a single TCP network may not provide enough data throughput for either. Finally, there are a number of examples showing that 10GigE has limited scalability for HPC applications and that InfiniBand proves to be a better performance, price/performance, and power solution than 10GigE.
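To make the latency discussion a bit more concrete, here is a minimal MPI ping-pong sketch in C of the kind commonly used to compare one-way latency across fabrics. It is only an illustration, not the benchmark code behind the council’s database, and the message size and iteration count are arbitrary choices.

    /* Minimal MPI ping-pong latency sketch (illustrative only; not the
     * benchmark used in the council's report). Ranks 0 and 1 exchange a
     * small message repeatedly; half the round-trip time approximates
     * the one-way latency that latency-sensitive HPC codes care about. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        const int iters = 10000;    /* arbitrary iteration count */
        char buf[8] = {0};          /* small message: exposes latency, not bandwidth */
        int rank;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        MPI_Barrier(MPI_COMM_WORLD);
        double t0 = MPI_Wtime();
        for (int i = 0; i < iters; i++) {
            if (rank == 0) {
                MPI_Send(buf, sizeof(buf), MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, sizeof(buf), MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            } else if (rank == 1) {
                MPI_Recv(buf, sizeof(buf), MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                MPI_Send(buf, sizeof(buf), MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        double t1 = MPI_Wtime();

        if (rank == 0)
            printf("approx. one-way latency: %.2f us\n",
                   (t1 - t0) / (2.0 * iters) * 1e6);

        MPI_Finalize();
        return 0;
    }

Running it with two ranks placed on two different nodes (for example, mpirun -np 2 with one rank per node), once over an InfiniBand fabric and once over a 10GigE/TCP path, gives a rough feel for the latency gap the report quantifies.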

The complete report can be found under the HPC Advisory Council case studies or by clicking here.

IEEE Cluster 2009

The HPC Advisory Council participated in the “Workshop on High Performance Interconnects for Distributed Computing (HPI-DC’09)”, held as part of the IEEE Cluster 2009 conference. Several members (Joshua More from AMD, Jeff Layton from Dell, and I) presented research results on “Scheduling Strategies for HPC as a Service (HPCaaS)”. You can find the presentation under the content/conference pages on the HPC Advisory Council main site.

The workshop was well organized by Ada Gavrilovska (Georgia Tech) and Pavan Balaji (Argonne National Lab) with help from Steve Poole (Oak Ridge National Lab). Other interesting sessions were given by Nagi Rao (ORNL) on wide-area InfiniBand, James Hofmann (Naval Research Lab) on the Large Data project, Hari Subramoni (The Ohio State University) on InfiniBand RDMA over Ethernet (LLE), and others.

The next council event is the HPC China workshop. More information on the workshop is posted at http://www.hpcadvisorycouncil.com/events/china_workshop/

Gilad Shainer, HPC Advisory Council Chairman
gilad@hpcadvisorycouncil.com

ROI through efficiency and utilization

High-performance computing plays an invaluable role in research, product development and education. It helps accelerate time to market and provides significant cost reductions in product development, along with tremendous flexibility. One strength of high-performance computing is the ability to achieve the best sustained performance by driving CPU performance toward its limits. Over the past decade, high-performance computing has migrated from supercomputers to commodity clusters; more than eighty percent of the world’s Top500 compute system installations in June 2009 were clusters. The driver for this move appears to be a combination of Moore’s Law (enabling higher-performance computers at lower cost) and the drive for the best cost/performance and power/performance. Productivity and flexibility are therefore the most important factors when choosing a cluster’s hardware and software configuration.

A deeper examination of the world’s Top500 systems based on commodity clusters shows two main interconnect solutions being used to connect the servers in those powerful compute systems – InfiniBand and Ethernet. If we divide the systems by interconnect family, we see that the same CPUs, memory speeds and other settings are common to both groups. The only difference between the two groups, besides the interconnect, is system efficiency: how many CPU cycles can be dedicated to application work, and how many are wasted. The graph below lists the systems by interconnect and their measured efficiency.

Top500 Systems by Interconnect and Measured Efficiency

As seen, systems connected with Ethernet achieve an average efficiency of 50%, which means that 50% of the CPU cycles are wasted on non-application work or sit idle waiting for data to arrive. Systems connected with InfiniBand achieve an average efficiency above 80%, which means that less than 20% of the CPU cycles are wasted. Moreover, the latest InfiniBand-based systems have demonstrated up to 94% efficiency (the best Ethernet-connected systems demonstrated 63%).
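For readers who want to reproduce these figures, Top500 efficiency is simply the measured Linpack performance (Rmax) divided by the theoretical peak (Rpeak). As a hypothetical example, a cluster with a theoretical peak of 100 teraflops that sustains 50 teraflops on Linpack is 50% efficient – the other 50 teraflops’ worth of cycles are lost to communication waits, idle time and other overhead – while the same hardware sustaining 94 teraflops would be 94% efficient.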

People might argue that Linpack is not the best benchmark for measuring parallel application efficiency, since it does not fully utilize the network. Even so, the results in the graph are a clear indication that the network makes a difference even for Linpack, and for more communication-intensive parallel applications the gap will be even larger.

When choosing the system configuration, with the goal of maximizing return on investment, one needs to make sure no artificial bottlenecks are created. Multi-core platforms, parallel applications, large databases and the like require fast data exchange, and lots of it. Ethernet can become the system bottleneck due to its latency/bandwidth limits and the CPU overhead of TCP/UDP processing (TOE solutions introduce other issues, sometimes more complicated ones, but that is a topic for another blog), reducing system efficiency to 50%. This means that half of the compute system is wasted and just consumes power and cooling. The same performance capability could have been achieved with roughly half the servers had they been connected with InfiniBand. More data on application performance, productivity and ROI can be found on the HPC Advisory Council web site, under content/best practices.
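To put rough, hypothetical numbers on that claim: a 128-node cluster at 50% efficiency delivers the sustained performance of about 64 nodes’ worth of peak, while at 94% efficiency roughly 68 nodes (64 / 0.94) would deliver the same sustained performance – a little over half the servers, with correspondingly less power, cooling and floor space.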

While InfiniBand demonstrates higher efficiency and productivity, there are several ways to increase Ethernet efficiency. One of them is optimizing the transport layer to provide zero-copy transfers and lower CPU overhead (not by using TOE solutions, as those introduce single points of failure in the system). This capability is known as LLE (low-latency Ethernet) and will be discussed in future blogs.

Gilad Shainer, HPC Advisory Council Chairman
gilad@hpcadvisorycouncil.com

HPC in a Cloud

During the ISC’09 conference, several of the Council members (AMD, Dell, Mellanox, and Platform) presented their HPC in a Cloud proof of concept, based on work the council performed inside the HPC Advisory Council’s High-Performance Center. We have posted this Birds-of-a-Feather presentation below. Let us know what you think of this concept.

Cloud computing for HPC?

One of the interesting projects we are working on is the feasibility of using cloud computing for high-performance computing. I remember a paper on using Amazon EC2 for HPC, and the conclusion was that a few GB/s of bandwidth were missing between the compute nodes… :) In the past, high-performance computing has not been a good candidate for cloud computing due to its requirement for tight integration between server nodes via low-latency interconnects. Moreover, the performance overhead associated with host virtualization, a prerequisite technology for migrating local applications to the cloud, quickly erodes application scalability and efficiency in an HPC context. Furthermore, HPC has been slow to adopt virtualization, not only because of the performance overhead, but also because HPC servers generally run fully utilized and therefore do not benefit from consolidation.
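A back-of-the-envelope example (with made-up numbers, purely to illustrate the scaling argument): if each iteration of a tightly coupled code spends 100 microseconds computing and 10 microseconds exchanging data over a low-latency fabric, about 91% of the time is useful work; push that exchange to 100 microseconds through virtualized or TCP-based networking and only about 50% of the time is useful work, and the ratio gets worse as the job spreads over more nodes and each node’s compute share shrinks.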

Not all clouds are the same, nor will they be. While virtualization is needed for enterprise applications, it is not a must for HPC clouds, where application provisioning can be done at physical-server granularity. Moreover, there are emerging virtualization solutions that reduce the overhead and enable native application performance.

The council presented some of the first findings from the HPC cloud project at ISC’09 (posted in the advanced topics section at http://www.hpcadvisorycouncil.com/advanced_topics.php). We have submitted a full paper for publication and hope to post it on the web site soon.

The next phase of the project will add the virtualization aspect, in particular Xen and KVM, and explore its effects on application performance as well as on system utilization and efficiency.

Gilad Shainer,
HPC Advisory Council Chairman