High Performance Computing (HPC)

First of all, why should we care? The big capability machines are designed not only to do science but also to address a set of problems termed Grand Challenge Problems. A Grand Challenge Problem is, simplistically, "one that cannot be solved in a reasonable amount of time with today's computers" (Wikipedia). So there's a race to build machines that can solve these problems, which by their very definition have significant economic or social impact. Examples include, or have included, climate modeling, human genome mapping, semiconductor modeling, and vision and cognition, to name a few. Some of the Grand Challenge problems will, at some point in the future, require machines with performance on the order of 1 zettaFLOP (10^21 FLOPS, equal to one BILLION ASCI Red machines!). Taking the compound annual growth rate of the large machines and extending that trend into the future, we wouldn't expect to see such a machine until roughly 2029. Just in time for my 70th birthday.
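That ~2029 estimate is easy to sanity-check with a few lines of Python. This is a rough sketch, not the author's calculation: the ~1.9x-per-year growth factor is my assumption (roughly the historical improvement rate of the Top500 leader), extrapolated from the ~280 TFLOPS machine at the top of the 2006 list.

```python
import math

# Assumed growth factor (illustrative): the Top500 #1 machine has
# historically improved by roughly 1.9x per year (~1000x per decade).
annual_growth = 1.9

start_year = 2006
start_flops = 280e12   # ~280 TFLOPS, the 2006 leader mentioned in the post
target_flops = 1e21    # 1 zettaFLOP

# Years needed so that start_flops * annual_growth**years >= target_flops
years_needed = math.log(target_flops / start_flops) / math.log(annual_growth)
print(f"1 ZFLOP expected around {int(start_year + years_needed)}")
```

With these assumed inputs the extrapolation lands in the late 2020s, consistent with the ballpark figure above; a slightly faster or slower growth factor shifts the answer by a couple of years in either direction.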

High Performance Computing (HPC) has a broad definition. Most of the time when HPC is mentioned, people think of the large capability machines. To find out about the highest performing machines in the world today, all one needs to do is visit the 'Top 500 Supercomputer Sites' website at: http://top500.org/lists.

The machines are ranked by their LINPACK benchmark number. As one would expect, number one on the list is the machine that achieved the highest measured performance.
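For intuition about what that benchmark number means, here is a minimal sketch of the two figures a Top500 entry reports: theoretical peak (Rpeak), which is just processors x clock x FLOPs per cycle, and the LINPACK score (Rmax), the rate actually sustained solving a large dense linear system. All of the inputs below are made-up illustrative assumptions, not data for any real machine on the list.

```python
# Illustrative relationship between Rpeak and Rmax; every number here
# is an assumption for the sake of the example, not a measurement.
processors = 100_000
clock_hz = 700e6          # 700 MHz per processor (assumed)
flops_per_cycle = 4       # e.g. two fused multiply-adds per cycle (assumed)

# Theoretical peak: no machine ever exceeds this.
rpeak = processors * clock_hz * flops_per_cycle

# LINPACK typically sustains a large fraction of peak on these systems;
# 75% is a plausible placeholder efficiency, not a quoted figure.
efficiency = 0.75
rmax = rpeak * efficiency

print(f"Rpeak = {rpeak / 1e12:.0f} TFLOPS, Rmax ~ {rmax / 1e12:.0f} TFLOPS")
```

The ranking is done on Rmax, which is why a machine with a lower theoretical peak can outrank one that sustains a smaller fraction of its peak.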

One note: the first TFLOP machine, ASCI Red, introduced in 1997, was based on the Intel Pentium Pro processor. ASCI Red was the top rated machine for an as yet unmatched seven consecutive releases of the list. It remained on the Top 500 list until 2005, eight years after debuting at number one.

Since November 2006, the top rated machine has had a performance of ~280 trillion floating point operations per second (10^12 floating point operations per second, i.e. TFLOPS) and consists of 131,072 processors. We expect that the first petaFLOP machine (10^15 floating point operations per second) will arrive in the 2008/2009 timeframe, assuming that the compound annual growth rate in computing remains consistent with the trend from 1994 to today.

There are some common attributes to these systems. First, they require lots of research dollars to allow companies to invest in new technologies, technologies that may be too risky to integrate immediately into high volume products. Government programs fund a significant portion of the R&D, which can make building such a system feasible for a company or group of companies. The business isn't exactly the most lucrative, and there are only a few players in this segment. Some do it because building these machines is their business model. Others do it for the same reason that car companies have racing teams: develop new technologies, test and perfect them in high performance, high stress environments, and, eventually, some of those technologies make it into mainstream products. One example is the game physics algorithms/logic used in high end gaming and game console systems. Also noteworthy are the pipelined vector processing units introduced more than three decades ago by Seymour Cray: today their descendants power the graphics units used in personal computers. Another example is the Unix operating system, which made its name in high-end servers and lives on today not just in those servers but even as the core of Apple's OS X operating system.

Greater computing demand requires higher per-processor capability (1-10 TFLOPS per processor), improved power efficiency (today's high end systems consume on the order of tens of megawatts), much higher memory capacity and bandwidth than today's memory subsystems can offer, greater I/O bandwidth, very fast and efficient lightweight kernel OSes, and applications that can scale up to millions of threads. Pretty challenging, but the benefits can be huge.

Not all systems on the Top 500 list are big, proprietary capability systems. Some of them use commercial off-the-shelf (COTS) components. Have a look at them when you get the chance; they will be the subject of the next blog.

10 Responses to High Performance Computing (HPC)

  1. Prune Wickart says:

    Do you have a pointer to the top500 list itself? The link in the text leads to a list of the conferences at which the list is released. Searching the individual conference pages gets me individual session descriptions, but not the top500 list. The cross-reference at the end of the article leads to a top500 “article” that’s nothing but header.

  2. Alastair Stell says:

    It should be obvious that the long term value of massive, multi-core processor blocks lies in eliminating the need for coded applications. By this I mean that rules define the performance of a program; therefore a machine should be given the rules, then within its available resources the machine is allowed to determine how the work is apportioned. In other words, self programming machines that exhibit unlimited scalability. The machine learns, improves and adapts.
    Now bring in another application, define the rules of engagement (between the data and rule sets) and (perhaps with some intervention training) you have inter-operability.
    Instead of optimized programs we now have learned (and remembered) experience. Programs get smarter (with both time and usage). Perhaps we can communicate or even trade the accumulated experience?
    Such an approach is possible today but not at all practical. However, as the number of cores grows, is there really any other direction we can take? Or are we going to wait for microsoft to do this for us? If so we may wait a long time.
    The proof, if you need one, lies in the exact examples you provide. To justify the enormous cost of the machines, research and programming we are talking about very focused applications with a specific economic, social or scientific yield. How about we provide a more generic solution that would enable a vast range of scalable applications?
    The limitation most of us on the hardware/software interface must overcome is our own preconceptions about what a processor complex can, or should, do. This is the leap we must take.

  3. Benny Eitan says:

    Is anybody tracking the volume and the power consumption (including air conditioning power) for these systems?
    Also, it would be interesting to get some idea of the usual IT kind of statistics (uptime, % usage, MTBF, etc.).
    I wonder how well Intel based CPUs (263 entries of the latest 500) are doing there.
    Maybe the top500.org guys should add these to their database?

  4. Zach says:

    Over the years I have seen changes in the use of motherboards, like the conversion from IDE to SATA. With careful observation I have noticed current motherboards have multiple connections that are unnecessary to the structure of a modern motherboard. Modern motherboards still emit large amounts of heat from circuits that don't necessarily need to be running. For example, I personally do not use the IDE connection anymore because it is simply obsolete. With SATA now common, it would be possible to eliminate IDE, USB, PS/2 mouse and keyboard, and many other obsolete plug-ins. My question for Intel is: what is your company doing to simplify modern motherboards, which contain a large amount of completely obsolete hardware?
    My suggestions include removing current IDE, PS/2, and other components for a basic computer user.
    Just imagine how simple it would be to have a computer with a simple motherboard, case, power supply, USB, SATA, LAN, DVI and more. Did you notice how a CD drive was not in that list? Also, think about the large amount of plastic used to make CDs and how it is wasted. If you could turn that plastic into one portable media player and go to a kiosk where you could plug your media player in and download your movies, music, and more right there, someone wouldn't need to go to a store to buy movies and waste money on plastic CDs.
    Once again, what is Intel doing to reduce the amount of energy and hardware waste used in modern computer motherboards, other than your recycling groups, which have small effects on pinpoint communities?

  5. Wine Lover says:

    Can you tell me how I should think about Itanium vs x86 for HPC? Which should I choose and why?
    Which is faster on a per socket basis today? What is the outlook for the future?

  6. Richard says:

    I think Zach has asked a very important question re simplified architecture to eliminate wasted power... will someone at Intel answer this question?

  7. tshirts says:

    Getting a computer to sort out economic or social problems is just plain stupid. Honestly, I hate to be so blunt about it, but it's just never going to work. It's not like you're going to sort out social problems through a computer which only understands numbers and stuff. I mean, I know what it's going to say already: it'll just give a numeric result that has no understanding of social problems, morality, justice, etc.
    I know that the whole idea is that it would understand these things, but I think things like morality and justice can't really be resolved through maths: that's taking maths a little too far.