2011 was a great year for the HPC Advisory Council. We created and published many best practices, established new special interest groups and conducted multiple HPC conferences around the world – in Switzerland, Germany, China, Ukraine and the USA. I would like to thank all who helped with the council activities throughout the year.
We are looking forward to an even busier 2012 – we will be holding at least six HPC conferences, starting in Israel in February, then Switzerland in March, Germany in June, Spain in September, China in October and the USA in December. We might add a couple more during 2012.
We continue to collect more training material to post on our web site – please send us anything that makes sense, as well as any suggestions or ideas, to info at hpcadvisorycouncil dot com. Below is a picture from our latest Stanford HPC conference.
Wishing you all happy holidays and a happy new year!
Gilad and the HPC Advisory Council board
Join us on October 25th, 2011 for an HPC Workshop “Enabling Discovery with Dell HPC Solutions” in Lexington, KY. The event will include presenters who will share HPC experiences applicable to all HPC disciplines, and will conclude with an HPC Panel of Experts offering opinions from industry leaders. Register here: https://www.etouches.com/HPCLexington
Christine Fronczak, Dell
Those of you who have been reading my blog or listening to my various conference talks since I arrived at VMware last year know I have been arguing there is an important convergence underway between HPC and Enterprise IT requirements. In large part this is being driven by fundamental changes within Enterprise computing, specifically the move towards massive, horizontally-scaled infrastructure to deliver scalable services externally in the cloud or internally for strategic business advantage.
As you’d expect with my background, I’ve been looking at this from an HPC perspective, noting that the HPC community has been heavily invested for decades in scale-out cluster provisioning, management, monitoring, etc. It has also built expertise around hardware architectures to support scale-out computing, most particularly in the design of high-performance interconnect fabrics. These fabrics support the link-level bandwidths and latencies, as well as the fabric bisection bandwidths, needed to enable both highly parallel computations and ubiquitous, high-performance access to shared resources like storage in a high-scale environment.
Imagine my delight when I came across a position paper espousing the same view, titled “It’s Time for Low Latency” by Stephen Rumble, Ryan Stutsman, Mendel Rosenblum, and John Ousterhout, that was presented at HotOS ’11 (http://www.scs.stanford.edu/~rumble/papers/latency_hotos11.pdf). And what a coincidence that the paper’s third author is none other than VMware’s well-known co-founder, Mendel Rosenblum.
The paper approaches the importance of low latency from an Enterprise perspective, citing numerous reasons why a high-performance interconnect will benefit the modern datacenter and its applications. It is styled as a call to action for OS and system designers to pay serious attention to making low-latency communication available to current and emerging Enterprise applications, including Big Data applications.
The paper is a quick read and well worth the time.
Josh Simons, VMware
The HPC Advisory Council includes five special interest subgroups:
- HPC|Scale Subgroup – exploring the usage of commodity HPC as a replacement for multi-million dollar mainframes and proprietary based supercomputers with networks and clusters of microcomputers acting in unison to deliver high-end computing services. Chair: Richard Graham
- HPC|Cloud Subgroup – exploring the usage of HPC components as part of the creation of external/public/internal/private cloud computing environments. Chair: William Lu
- HPC|Works Subgroup – providing best practices for building balanced and scalable HPC systems, performance tuning and application guidelines. Chair: Pak Lui
- HPC|Storage Subgroup – demonstrating how to build high-performance storage solutions and their effect on application performance and productivity. One of the main interests of the HPC|Storage subgroup is to explore Lustre-based solutions, and to expose more users to the potential of Lustre over high-speed networks. Chair: Hussein Harake
- HPC|GPU Subgroup – exploring usage models of GPU components as part of next generation compute environments and potential optimizations for GPU based computing. Chairs: Sadaf Alam and Gilad Shainer.
If anyone is interested in joining and contributing to the groups’ activities, please contact email@example.com. I do want to thank the group chairs for their contributions.
About two years ago, we initiated an investigation into new market opportunities for Xyratex. During this investigation we learned that the High Performance Computing (HPC) market was a dynamic market opportunity with a substantial need for better data storage design. We also discovered that the way data storage was being implemented at many of these supercomputing sites was unduly complicated in terms of initial installation, performance optimization and ongoing management. Users had to contend with days and possibly weeks of tweaking to get the system up and running stably. After this initial installation period was complete, the ongoing management of the system was also complicated by varied and disjointed system and management tools. Often administrators would have to contend with debug scenarios that consumed scarce resources and ultimately left their HPC system performing suboptimally.
We were surprised at these findings and saw a lot of opportunity for Xyratex to deliver innovation in terms of performance, availability and ease of management. Xyratex decided to make a significant investment in addressing these needs. This investment included the acquisition of ClusterStor, but didn’t stop there. We have nearly 150 engineers working on the program, and we developed a brand new high-density application platform that is optimized for performance and availability. Finally, we developed a new management framework that addresses the complexity issues we found in the management of HPC storage clusters.
Today, we announced the ClusterStor™ 3000. This release is the result of that significant investment over the last two years and provides our partners with an innovative new solution for the HPC marketplace. We leveraged our core capabilities in data storage subsystem design, together with the Lustre expertise we obtained in the ClusterStor acquisition last year, to develop an HPC solution that provides best-in-class performance, a scale-out architecture and unprecedented ease of management.
Ken Claffey, Xyratex
We are proud to announce the release of SPM.Python version 3.110505, and wish to acknowledge the generous support of the HPC Advisory Council in providing access to GPU servers to validate and stress test our solution.
SPM.Python is a scalable parallel version of the popular Python language. Showcased as a disruptive technology at Supercomputing 2010 and highlighted on StartUp Row at PyCon 2011, it enables users to exploit parallelism across servers, cores and GPUs in a fault-tolerant manner.
Using resources at the HPC Advisory Council High Performance Center, we were able to conduct around 20,000 different sets of experiments; most were designed to fail in order to validate the failure recovery and self-cleaning capabilities of SPM.Python.
With this release, users may launch any standalone application in parallel and in a manner that inherits fault tolerance from SPM.Python, thus freeing the developers to focus on their core application while maximizing utilization of resources and minimizing runtime costs.
Student teams are encouraged to submit proposals to build high performance computing clusters on the convention center floor in real time and push their applications to the limit to win bragging rights as the world’s best. Submissions are now open for the fifth annual Student Cluster Competition (SCC) at the SC11 conference, held in Seattle, WA on Nov. 12 – 18, 2011. Please note that the deadline for teams to enter the contest is April 15.
The competition pits six teams of undergraduates against one another to see who can build and configure a cluster in 46 hours that accomplishes the most work using “real world” computational codes in the least amount of time. In addition to time constraints, students must work within the parameters of the designated system and power configurations, and use open-source software to solve the applications provided to them.
In addition to showcasing the power of current-generation clusters, one of SCC’s primary goals is to demonstrate to companies and supercomputing labs that the best high performance computing (HPC) candidates might be as close as the university next door. Through the final selection process, the SCC committee focuses on recruiting the best high performance computing talent to compete each year.
Student teams may now submit their applications at the SC11 Submissions Site - https://submissions.supercomputing.org/ ; the SCC deadline is April 15, 2011. Teams will find more information and may refer to a sample Student Cluster Competition submission form.
For additional information, please contact firstname.lastname@example.org
Teams looking for hardware resources for the competition are encouraged to contact the council at email@example.com
Gilad, HPC Advisory Council Chairman
Next week we will be holding the 2nd HPCAC Switzerland workshop. Last year around 100 attendees enjoyed the three days of the workshop, the interesting presentations and the technical training. This time we expect an even higher number of attendees to participate in and contribute to the workshop. There will be many very interesting presentations and, of course, hands-on training at the end of each day.
The complete agenda can be found on the workshop page – http://www.hpcadvisorycouncil.com/events/2011/switzerland_workshop/index.php. If you would like to attend and have not registered yet, please do so as soon as possible. It will be an excellent opportunity to meet some of the people who lead various development efforts in the multiple fields of high performance computing.
Gilad, HPC Advisory Council Chairman
In the last few weeks the HPC|GPU group has made public several interesting test results. The latest publications can be found on the HPC|GPU Working Group page – http://www.hpcadvisorycouncil.com/subgroups_hpc_gpu.php.
The most recent publication covered the optimum GPU-to-node ratio, in particular for the NAMD application (a parallel molecular dynamics code that received the 2002 Gordon Bell Award and is designed for high-performance simulation of large biomolecular systems). The group was looking to identify how many GPUs should be placed in a single node (from 1 to 4) in order to achieve the highest performance. The results indicate that, performance-wise, using a single GPU per node and more nodes is a better configuration than packing more GPUs into a single node.
The testing effort covered other topics such as the performance gain versus the application dataset and more. You are encouraged to review the complete results on the group page. The group welcomes new testing ideas and comments – please send them to the group mailing list.
Gilad, HPC Advisory Council Chairman
Congratulations to Pak, the HPC Advisory Council Cluster Center Manager, and his beautiful bride Jessica, who got married yesterday!
Gilad and Brian