This week we announced the formation of the HPC Advisory Council Centers of Excellence. The Centers of Excellence will provide local support for the HPC Advisory Council’s programs, local workshops and conferences, and will host local computing centers that can be used to extend these activities.
“We are pleased to be named as one of the inaugural HPC Advisory Council Centers of Excellence, covering HPC research, outreach and educational activities within Europe,” said Hussein Nasser El-Harake of the Swiss National Supercomputing Centre, who serves as the Director of the HPC Advisory Council Center of Excellence in Switzerland. “As part of the HPC Advisory Council’s Center of Excellence, we look forward to advancing awareness of the beneficial capabilities of HPC to new users.”
The new HPC|GPU subgroup has recently been working to create the first best practices around GPUDirect, the new technology from NVIDIA. Here is some background on GPUDirect: the system architecture of a GPU-CPU server requires the CPU to initiate and manage memory transfers between the GPU and the network. The new GPUDirect technology enables Tesla and Fermi GPUs to transfer data to pinned system memory that an RDMA-capable network can read and send without involving the CPU in the data path. The result is an increase in overall system performance and efficiency by reducing the GPU-to-GPU communication latency (by 30%, as published by some vendors). The HPC|GPU subgroup is the first to release benchmark results of an application using GPUDirect. The application chosen for the testing was Amber, a molecular dynamics software package. Testing on an 8-node cluster demonstrated up to a 33% performance increase using GPUDirect. If you want to read more, check out the HPC|GPU page – http://www.hpcadvisorycouncil.com/subgroups_hpc_gpu.php.
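As a rough illustration (not a measurement, and not the model used in the Amber testing), an Amdahl-style estimate shows how a cut in communication latency translates into overall application speedup, assuming some fraction of the runtime is spent in GPU-to-GPU communication:

```python
def overall_speedup(comm_fraction, reduction=0.30):
    """Amdahl-style estimate: overall speedup when only the
    communication portion of the runtime is accelerated.

    comm_fraction -- fraction of total runtime spent in GPU-to-GPU
                     communication (an assumed, illustrative number)
    reduction     -- fractional cut in communication time (the ~30%
                     latency reduction quoted by some vendors)
    """
    return 1.0 / (1.0 - reduction * comm_fraction)

# If half of the runtime were communication, the run would be about
# 17.6% faster overall:
print(round((overall_speedup(0.5) - 1.0) * 100, 1))  # 17.6
```

Note that the 33% gain measured with Amber exceeds what latency reduction alone explains in this simple model; GPUDirect also removes CPU involvement from the data path, which this sketch does not capture.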
We wanted to let you know that we have extended the high-performance applications best practices as follows:
1. We have extended the application performance, optimization and profiling guidelines to cover nearly 30 different applications, both commercial and open source – http://www.hpcadvisorycouncil.com/best_practices.php
2. We have added the first case using RoCE (RDMA over Converged Ethernet) to the performance, optimization and profiling guidelines page. It is under the same link as in item 1.
3. New – installation guides – for those who asked for a detailed description of where to get an application, what needs to be installed, how to install it on a cluster, and how to actually run it – these are now posted under the HPC|Works subgroup – http://www.hpcadvisorycouncil.com/subgroups_hpc_works.php. We will be focusing on open source applications, for which it is sometimes challenging to find this information. At the moment we have installation guides for BQCD, Espresso and NAMD, and more will come in the near future.
If you would like to propose new applications to be covered under the performance, optimization and profiling guidelines, or to be added to the installation guides, please let us know via email@example.com.
For those who missed the announcement, our 2nd Annual China High-Performance Computing Workshop will be held on October 27th, 2010 in Beijing, China, in conjunction with the HPC China National Annual Conference on High-Performance Computing. The call for presentations as well as workshop sponsorships are now open – http://www.hpcadvisorycouncil.com/events/2010/china_workshop/. The workshop will focus on efficient high-performance computing through best practices, future system capabilities through new hardware, software and computing environments, and the high-performance computing user experience.
The workshop will open with keynote presentations by Prof. Dhabaleswar K. (DK) Panda, who leads the Network-Based Computing Research Group at The Ohio State University (USA), and Dr. HUO Zhigang from the National Center for Intelligent Computing (China). The keynotes will be followed by distinguished speakers from academia and industry. The workshop will bring together system managers, researchers, developers, computational scientists and industry affiliates to discuss recent developments and future advancements in high-performance computing.
And again – the calls for presentations and sponsorships are now open, so if you are interested, let us know. For the preliminary agenda and schedule, please refer to the workshop website. The workshop is free to HPC China attendees and to HPC Advisory Council members. Registration is required and can be completed at the HPC Advisory Council China Workshop website.
Recently we have added new systems to our HPC center, and you can see the full list at http://www.hpcadvisorycouncil.com/cluster_center.php.
The newest system is the “Vesta” system (you can see Pak Lui, the HPC Advisory Council HPC Center Manager, standing next to it in the picture below). Vesta consists of six Dell™ PowerEdge™ R815 nodes, each with four AMD Opteron 6172 (Magny-Cours) processors, which means 48 cores per node and 288 cores for the entire system. The networking was provided by Mellanox, and we have installed two adapters per node (Mellanox ConnectX®-2 40Gb/s InfiniBand adapters). All nodes are connected via a Mellanox 36-port 40Gb/s InfiniBand switch. Furthermore, each node has 128 GB of 1333 MHz memory to make sure we can really get the highest performance from this system.
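For reference, the core and memory totals follow directly from the configuration (a quick arithmetic check, using the 12 cores per Magny-Cours socket):

```python
# Sanity check of the Vesta configuration described above.
nodes = 6
sockets_per_node = 4        # four AMD Opteron 6172 processors per node
cores_per_socket = 12       # "Magny-Cours" has 12 cores per socket
mem_per_node_gb = 128

cores_per_node = sockets_per_node * cores_per_socket
total_cores = nodes * cores_per_node
total_mem_gb = nodes * mem_per_node_gb

print(cores_per_node, total_cores, total_mem_gb)  # 48 288 768
```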
Microsoft has provided us with the Windows HPC 2008 v3 preview, so we can check the performance gain versus v2, for example. The system is capable of dual boot – Windows and Linux – and is now available for testing. If you would like to get access, just fill out the form at the URL above.
In the picture – Pak Lui standing next to Vesta
I want to thank Dell, AMD and Mellanox for providing this system to the council!
Gilad, HPC Advisory Council Chairman
Each year, scientists participating in the Scientific Discovery through Advanced Computing Program (SciDAC), along with other researchers from the computational science community, gather at the annual SciDAC conference to present scientific results, discuss new technologies and discover new approaches to collaboration. The SciDAC 2010 Conference is being held this week in Chattanooga, Tennessee. Thomas Zacharia, Oak Ridge National Laboratory, chairs the conference this year.
I participated in the first two days of SciDAC. While the weather there was not ideal, to say the least, I enjoyed the presentations and discussions. Under Secretary for Science Steven Koonin (Department of Energy) commented on DoE’s future goals – reducing oil consumption in transportation by 35%, reducing greenhouse gas emissions by 17% by 2020 and 83% by 2050, and maintaining the technical base while exploiting simulation capabilities.
Many of the discussions were on exascale computing – what needs to be done in order to get there (systems, applications, etc.) – with the goal of having the first system by 2018. Our HPC|Scale subgroup will try to help make the right steps toward this goal through exploration and experiments. I hope to report soon on the subgroup’s progress.
The second biggest HPC show will start next week, and I am now on a United flight from San Francisco to Germany on my way to the conference. On Sunday we will hold the HPC Advisory Council European workshop, the second workshop of the year (three more to come – TeraGrid, HPC China and MEW UK). For the workshop we have gathered some interesting HPC people from around the world – Richard Graham and Steve Poole from Oak Ridge National Laboratory (US), Norbert Eicker from the Jülich Supercomputing Centre (Germany), Tor Skeie from Simula Research Laboratory (Norway), and Abhishek Das from C-DAC (India). We will also host interesting talks on HPC in cloud computing, virtualization and HPC, application best practices and more. I will make sure to capture the highlights from the day and share them here.
ISC’10 is also the place where the Top500 list is published. Published twice a year (at ISC and SC), the Top500 list ranks the 500 fastest supercomputers in the world. Why is it important for us? Because it can give us indications of trends in the HPC world, capabilities and usage models. It is also a great tool for making predictions on when we will see the first exascale system, etc.
And last but not least, the HPC Advisory Council will give the 2009-2010 awards in 4 categories in the closing session of ISC – Thursday at 1pm Germany time. If you are planning to be in Germany feel free to join us at the workshop, or be there for the award ceremony. Or just drop by and say hello…
I work as a Thermal Application Engineer in 3M’s Electronic Markets Materials Division. For more than 50 years, my group has made fluorochemical heat transfer fluids that have been used for immersion cooling of high-value electronics. Some are familiar with the various Fluorinert™-cooled Cray supercomputers, but our fluids are also used in tens of thousands of immersion-cooled traction inverters and a variety of military platforms.
Evaporative immersion is arguably one of the most efficient ways to use fluids like ours for cooling electronics. Heat sources on a PCB immersed in the fluid cause the fluid to boil. This captures all the heat and allows it to be transferred efficiently to air or water via a secondary condenser. Historically, immersion systems of this type have used sealed pressure vessels with hermetic electrical connections and are evacuated and filled much like refrigeration systems. Because it can be costly to create such an enclosure for computational electronics with a lot of IO, engineers often dismiss the idea of immersion in the context of commodity datacenter equipment. The concept we are promoting (see attached) eliminates these complexities. Other advantages are summarized below.
- All server-level and most rack-level cooling hardware is eliminated
  – reduced environmental impact (landfill)
  – simplified server/rack design
  – reduced cooling equipment cost
  – no moving parts to fail or leak
- Essentially no thermal limit on server power density
  – 4 kW/liter (4 MW/m3) has been demonstrated (>100X typical air-cooled and >25X typical supercomputer)
  – possibility for reductions in raw material usage (PCB, etc.)
- Intrinsic fire protection
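The density figures above can be checked with a quick unit conversion (the implied air-cooled density is an inference from the >100X claim, not a measured number):

```python
# Unit-conversion check for the power density figures above.
kw_per_liter = 4.0
liters_per_m3 = 1000.0

mw_per_m3 = kw_per_liter * liters_per_m3 / 1000.0   # kW/m^3 -> MW/m^3
print(mw_per_m3)  # 4.0, matching the 4 MW/m^3 figure

# The ">100X typical air-cooled" claim implies a typical air-cooled
# density below roughly 40 W/liter:
implied_air_cooled_w_per_liter = kw_per_liter * 1000.0 / 100.0
print(implied_air_cooled_w_per_liter)  # 40.0
```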
We are demonstrating this concept with real computing hardware, but because we are using off-the-shelf air-cooled components, the power density merits of this technology cannot be fully realized and the demonstration will be lackluster.
We seek partners with challenging-to-cool hardware and an interest in exploring this technology. We believe it could be a transformative technology enabling the next-generation power density goals of the HPC industry.
Readers can learn more at:
or write to me for an Overview Presentation. Thank you and Best Regards,
Phil E. Tuma
3M Electronics Markets
HPC Advisory Council goes to Italy!
Well, before we go to Italy, we have a workshop in Germany as part of the International Supercomputing Conference (http://www.hpcadvisorycouncil.com/events/european_workshop/index.php). Registration for the Germany event is handled via the ISC’10 (http://www.supercomp.de/isc10/) registration website – for more information or any issues, please contact me.
So once we are back from Germany, the HPC Advisory Council will visit Italy and participate in the International Advanced Research Workshop on High Performance Computing, Grids and Clouds (http://www.hpcc.unical.it/hpc2010/). This is an open workshop, free of charge (yes, no registration fees are required for participants). The aim of the workshop is to discuss future developments in HPC technologies and to help assess the main aspects of grids and clouds, with special emphasis on solutions for grid and cloud computing deployment. The council will be there and will contribute to the interesting discussions. So if you are in the area, or want to visit Italy in June, join us for the workshop (June 21 – 25).
Recently we have been working on performance optimizations for Platform MPI for the Swedish Meteorological and Hydrological Institute (SMHI).
The application we were testing the MPI optimizations for is SMHI’s RCO application, which can use either Scali MPI or Platform MPI.
First, we tested the application performance with Scali MPI and Platform MPI and achieved the following results (using 144 ranks: 18 hosts with 8 ranks each on the “Helios” cluster). The original Scali MPI run took 474 seconds, and the original Platform MPI run took 550 seconds.
We then modified Platform MPI, and the optimized Platform MPI run took 450 seconds.
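From the runtimes above, the relative improvements work out as follows (simple arithmetic on the reported numbers):

```python
# Relative performance implied by the reported runtimes.
scali = 474.0       # original Scali MPI run, seconds
pmpi_orig = 550.0   # original Platform MPI run, seconds
pmpi_opt = 450.0    # optimized Platform MPI run, seconds

vs_orig = (pmpi_orig - pmpi_opt) / pmpi_orig * 100.0
vs_scali = (scali - pmpi_opt) / scali * 100.0

print(round(vs_orig, 1))   # 18.2 -> ~18% faster than unmodified Platform MPI
print(round(vs_scali, 1))  # 5.1  -> ~5% faster than the Scali MPI baseline
```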
We want to thank the HPC Advisory Council for providing the resources for us to evaluate our optimizations and provide a better solution for the customer.