Tag Archives: InfiniBand

HPC Advisory Council Forms Worldwide Centers of Excellence

This week we announced the formation of the HPC Advisory Council Centers of Excellence. The Centers of Excellence will provide local support for the HPC Advisory Council’s programs, local workshops and conferences, and will host local computing centers that can be used to extend these activities.

“We are pleased to be named as one of the inaugural HPC Advisory Council Centers of Excellence, covering HPC research, outreach and educational activities within Europe,” said Hussein Nasser El-Harake of the Swiss National Supercomputing Centre, who serves as the Director of the HPC Advisory Council Center of Excellence in Switzerland. “As part of the HPC Advisory Council’s Center of Excellence, we look forward to advancing awareness of the beneficial capabilities of HPC to new users.”

Centers of Excellence

A new system has arrived at our HPC center!

Recently we have added new systems to our HPC center, and you can see the full list at http://www.hpcadvisorycouncil.com/cluster_center.php.

The newest system is the “Vesta” system (you can see Pak Lui, the HPC Advisory Council HPC Center Manager, standing next to it in the picture below). Vesta consists of six Dell™ PowerEdge™ R815 nodes, each with four AMD Opteron 6172 (Magny-Cours) processors, which means 48 cores per node and 288 cores for the entire system. The networking was provided by Mellanox; we have installed two Mellanox ConnectX®-2 40Gb/s InfiniBand adapters per node, and all nodes are connected via a Mellanox 36-port 40Gb/s InfiniBand switch. Furthermore, each node has 128 GB of 1333 MHz memory to make sure we can really get the highest performance from this system.
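For readers who want to double-check the totals, here is a minimal Python sketch of the arithmetic; the only assumption is the 12-core count of the Opteron 6172, which is what four processors per node and 48 cores per node imply:

```python
# Sanity check of the Vesta configuration described above.
nodes = 6                # Dell PowerEdge R815 servers
sockets_per_node = 4     # AMD Opteron 6172 (Magny-Cours) processors per node
cores_per_socket = 12    # implied by 48 cores per node

cores_per_node = sockets_per_node * cores_per_socket   # 48
total_cores = nodes * cores_per_node                   # 288

print(f"cores per node: {cores_per_node}")
print(f"total cores:    {total_cores}")
```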


Microsoft has provided us with a Windows HPC 2008 v3 preview, so we can check the performance gain versus v2, for example. The system is capable of dual boot – Windows and Linux – and is now available for testing. If you would like to get access, just fill in the form at the URL above.


 Vesta

In the picture – Pak Lui standing next to Vesta


I want to thank Dell, AMD and Mellanox for providing this system to the council!


Regards,

Gilad, HPC Advisory Council Chairman

HPC in 2010: A look at technology, issues, and opportunities

insideHPC talked with Gilad Shainer, the Chairman of the HPC Advisory Council, about what the organization does, how it has grown, and how it is helping catalyze developments for users and HPC businesses. In addition to talking about what the Council has already accomplished, Gilad also talks about the new research and focus areas that they are kicking off. The Council’s mission is to help everyone, so papers, information, best practices, and so on are all available for download at their website.

Video is posted here: http://insidehpc.com/hpc-2010-technology-issues-opportunities/

SC09 – Building the Fastest Networking Demonstration on the Show Floor

Now that SC09 is over, we can look back on the council activities and achievements during the largest HPC show in the world. As we did during SC08 and ISC’09, the council members invested a huge amount of effort and brought together the fastest networking solution – a 120Gb/s InfiniBand network demonstration – as part of SCinet. You can see the networking diagram below, and I would like to thank the following organizations that helped to make this amazing demo a reality: AMD, Avago, Colfax Intl, Dell, HP, IBM, InfiniBand Trade Association, Koi Computers, LSI, Los Alamos National Laboratory, Luxtera, Mellanox Technologies, Microsoft, NVIDIA, Oak Ridge National Laboratory, RAID, Scalable Graphics, SGI, Sun Microsystems, Texas Advanced Computing Center, The University of Utah Center for High-Performance Computing, and Voltaire.

SCinet Network Diagram

One of our council members, Scalable Graphics, helped to create a highly visual demo of a Peugeot car rendered in 3D (see picture below). The demo included an interactive component, where viewers wore glasses that tracked their position and moved the image accordingly.

3D Visualization Demo

Also during the conference, we held our semi-annual meeting, where we reviewed our plans for 2010. To recap, we will be building on our successful HPC Advisory Council Workshop in China, and we will be hosting four workshops next year; the first one, in March, will be hosted in Switzerland together with the Swiss Supercomputing Center, and the second, in May, as part of the International Supercomputing Conference. More info can be found on the council main web pages.

Best regards,

Gilad, HPC Advisory Council Chairman.

ROI through efficiency and utilization

High-performance computing plays an invaluable role in research, product development and education. It helps accelerate time to market, and provides significant cost reductions in product development as well as tremendous flexibility. One strength of high-performance computing is the ability to achieve the best sustained performance by driving CPU performance towards its limits. Over the past decade, high-performance computing has migrated from supercomputers to commodity clusters. More than eighty percent of the world’s Top500 compute system installations in June 2009 were clusters. The driver for this move appears to be a combination of Moore’s Law (enabling higher performance computers at lower costs) and the ultimate drive for the best cost/performance and power/performance. Cluster productivity and flexibility are the most important factors for a cluster’s hardware and software configuration.

A deeper examination of the world’s Top500 systems based on commodity clusters shows two main interconnect solutions being used to connect the servers of those powerful compute systems – InfiniBand and Ethernet. If we divide the systems according to interconnect family, we see that the same CPUs, memory speeds and other settings are common between the two groups. The only difference between the two groups, besides the interconnect, is the system efficiency – how many of the CPU cycles can be dedicated to application work, and how many are wasted. The graph below lists the systems according to their interconnect setting and their measured efficiency.

 top500

As seen, systems connected with Ethernet achieve an average of 50% efficiency, which means that 50% of the CPU cycles are wasted on non-application work or are idle, waiting for data to arrive. Systems connected with InfiniBand achieve above 80% efficiency on average, which means that less than 20% of the CPU cycles are wasted. Moreover, the latest InfiniBand-based systems have demonstrated up to 94% efficiency (the best Ethernet-connected systems demonstrated 63% efficiency).
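For reference, the efficiency numbers quoted here are simply sustained Linpack performance (Rmax) divided by theoretical peak (Rpeak), as reported for each Top500 entry. A minimal Python sketch of that calculation, using illustrative placeholder numbers rather than actual list data:

```python
# Top500-style efficiency: sustained Linpack (Rmax) divided by theoretical peak (Rpeak).
# The entries below are illustrative placeholders, not actual list data.
systems = [
    {"name": "cluster-A (InfiniBand)", "rmax_tflops": 94.0, "rpeak_tflops": 100.0},
    {"name": "cluster-B (Ethernet)",   "rmax_tflops": 50.0, "rpeak_tflops": 100.0},
]

for s in systems:
    efficiency = s["rmax_tflops"] / s["rpeak_tflops"]
    wasted = 1.0 - efficiency
    print(f'{s["name"]}: {efficiency:.0%} efficient, {wasted:.0%} of peak unused')
```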

People might argue that the Linpack benchmark is not the best benchmark for measuring parallel application efficiency, and that it does not fully utilize the network. Even so, the graph results are a clear indication that the network makes a difference even for the Linpack application, and for more communication-intensive parallel applications the gap will be much wider.

When choosing the system setting, with the notion of maximizing return on investment, one needs to make sure no artificial bottlenecks will be created. Multi-core platforms, parallel applications, large databases and so on require fast data exchange, and lots of it. Ethernet can become the system bottleneck due to latency/bandwidth limits and the CPU overhead of TCP/UDP processing (TOE solutions introduce other issues, sometimes more complicated, but this is a topic for another blog), reducing the system efficiency to 50%. This means that half of the compute system is wasted, and just consumes power and cooling. The same performance capability could have been achieved with half the servers if they were connected with InfiniBand. More data on different application performance, productivity and ROI can be found at the HPC Advisory Council web site, under content/best practices.
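To make the “half the servers” point concrete, here is a rough Python sketch; the target performance, per-node peak and efficiency values are illustrative assumptions, not measured data:

```python
import math

# How many nodes are needed to sustain a target performance level,
# given per-node peak performance and overall cluster efficiency?
def nodes_needed(target_sustained_tflops, peak_per_node_tflops, efficiency):
    sustained_per_node = peak_per_node_tflops * efficiency
    return math.ceil(target_sustained_tflops / sustained_per_node)

target = 50.0          # desired sustained TFLOPS (illustrative)
peak_per_node = 0.5    # peak TFLOPS per node (illustrative)

for label, eff in [("Ethernet-class (~50%)", 0.50), ("InfiniBand-class (~90%)", 0.90)]:
    print(f"{label}: {nodes_needed(target, peak_per_node, eff)} nodes")

# At ~50% efficiency you need roughly twice as many nodes (plus their power
# and cooling) as at ~90% efficiency to reach the same sustained performance.
```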

While InfiniBand will demonstrate higher efficiency and productivity, there are several ways to increase Ethernet efficiency. One of them is optimizing the transport layer to provide zero copy and lower CPU overhead (not by using TOE solutions, as those introduce single points of failure in the system). This capability is known as LLE (low latency Ethernet). More on LLE will be discussed in future blogs.

Gilad Shainer, HPC Advisory Council Chairman
gilad@hpcadvisorycouncil.com

Inauguration of 1st European Petaflop Computer in Jülich, Germany

On Tuesday, May 26, the Research Center Jülich reached a significant milestone of German and European supercomputing with the inauguration of two new supercomputers: the supercomputer JUROPA and the fusion machine HPC-FF. The symbolic start of the systems was triggered by the German Federal Minister for Education and Research, Prof. Dr. Annette Schavan, the Prime Minister of North Rhine-Westphalia, Dr. Jürgen Rüttgers, and Prof. Dr. Achim Bachem, Chairman of the Board of Directors at Research Center Jülich, in the presence of high-ranking international guests from academia, industry and politics.

JUROPA (which stands for Juelich Research on Petaflop Architectures) will be used by more than 200 research groups across Europe to run their data-intensive applications. JUROPA is based on a cluster configuration of Sun Blade servers, Intel Nehalem processors, Mellanox 40Gb/s InfiniBand and the ParaStation cluster operating software from ParTec Cluster Competence Center GmbH. The system was jointly developed by experts of the Jülich Supercomputing Centre and implemented with the partner companies Bull, Sun, Intel, Mellanox and ParTec. It consists of 2,208 compute nodes with a total computing power of 207 Teraflops and was sponsored by the Helmholtz Community. Prof. Dr. Dr. Thomas Lippert, Head of the Jülich Supercomputing Centre, explains the HPC installation in Jülich in the video below.

HPC-FF (High Performance Computing – for Fusion), drawn up by the team headed by Dr. Thomas Lippert, director of the Jülich Supercomputing Centre, was optimized and implemented together with the partner companies Bull, Sun, Intel, Mellanox and ParTec. This new best-of-breed system, one of Europe’s most powerful, will support advanced research in many areas such as health, information, environment and energy. It consists of 1,080 compute nodes, each equipped with two Nehalem EP quad-core processors from Intel. Their total computing power of 101 teraflop/s corresponds, at present, to 30th place in the list of the world’s fastest supercomputers. The combined cluster will achieve 300 teraflop/s of computing power and will be included in the Top500 list published this month at ISC’09 in Hamburg, Germany.

40Gb/s InfiniBand from Mellanox is used as the system interconnect. The administrative infrastructure is based on NovaScale R422-E2 servers from French supercomputer manufacturer Bull, who supplied the compute hardware and the Sun ZFS/Lustre filesystem. The cluster operating system “ParaStation V5” is supplied by the Munich software company ParTec. HPC-FF is being funded by the European Commission (EURATOM), the member institutes of EFDA, and Forschungszentrum Jülich.

Complete system facts: 3,288 compute nodes; 79 TB main memory; 26,304 cores; 308 Teraflops peak performance.
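These combined figures can be cross-checked against the two machines described above (JUROPA: 2,208 nodes / 207 Teraflops; HPC-FF: 1,080 nodes / 101 Teraflops), assuming 8 cores per node as implied by the dual quad-core configuration; a trivial Python sketch:

```python
# Cross-check of the combined JUROPA + HPC-FF figures quoted above.
juropa_nodes, juropa_tflops = 2208, 207
hpcff_nodes, hpcff_tflops = 1080, 101
cores_per_node = 8  # two quad-core Nehalem processors per node (assumption)

total_nodes = juropa_nodes + hpcff_nodes
print("compute nodes:", total_nodes)                    # 3288
print("cores:", total_nodes * cores_per_node)           # 26304
print("peak Teraflops:", juropa_tflops + hpcff_tflops)  # 308
```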

Gilad Shainer,
HPC Advisory Council Chairman
shainer@mellanox.com

The HPC Advisory Council Cluster Center – update

Recently we have completed a small refresh in the cluster center. The Cluster Center offers an environment for developing, testing, benchmarking and optimizing products free of charge. The center, located in Sunnyvale, California, provides on-site technical support and enables secure sessions onsite or remotely. The Cluster Center provides a unique ability to access the latest clustering technology, sometimes even before it reaches public availability.

In the last few weeks, we have completed the installation of a Windows HPC Server 2008 cluster, and now it is available for testing (via the Vulcan cluster). We have also received the Scyld ClusterWare™ HPC cluster management solution from Penguin Computing (a member company) and installed it on the Osiris cluster.

Scyld was designed to make the deployment and management of Linux clusters as easy as the deployment and management of a single system. A Scyld ClusterWare cluster consists of a master node and compute nodes. The master node is the central point of control for the entire cluster. Compute nodes appear as attached processor and memory resources. More information on Scyld can be found here.

Adding Scyld to Osiris helps the Council with the best practices research activities that provide guidelines to end-users on how to maximize productivity for various applications using 20 or 40Gb/s InfiniBand, or 10 Gigabit Ethernet. I would like to thank Matt Jacobs and Joshua Bernstein from Penguin Computing for their donation and support during the Scyld installation.

Best regards,
Gilad Shainer
Chairman of the HPC Advisory Council