As we speak, scientists are racing to apply Big Data computation techniques to some of the most fundamental questions about the origin, composition, and evolution of the universe. A large part of this effort is the quest to understand dark matter and dark energy. Together these are thought to make up as much as 96% of the universe, yet they cannot be detected through normal means, hence the term ‘dark’.
The answer may lie in Big Data: the ability to correlate the movements of vast numbers of galaxies simultaneously and find patterns that unlock these darkest secrets of the universe. The challenge is keeping up with the data. In recent decades, observed and simulated datasets have grown from a few dozen objects to billions, and with the advent of bigger and faster telescopes there is no end in sight to that growth. Running the necessary algorithm, the Two-Point Correlation Function (TPCF), on a billion galaxies today would take a single processor 50 years, and even today’s supercomputers are hard pressed to keep up. The computational requirements are expected to grow further, well into the domain of Exascale computing. This is the subject of an SC12 paper from Intel this week that has been nominated for the ACM Gordon Bell Prize.
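To give a feel for why the cost climbs so quickly, here is a minimal sketch of the pair-counting step that TPCF estimates are commonly built on. It is not the code from the paper: the function names, the half-open binning convention, and the sample points are illustrative assumptions. It simply visits every unordered pair of objects once and histograms the separations.

```python
import math
from bisect import bisect_right


def pair_count_histogram(points, bin_edges):
    """Count unordered pairs of points per separation bin.

    points    -- list of (x, y, z) tuples (hypothetical sample positions)
    bin_edges -- ascending distances; bin b covers [bin_edges[b], bin_edges[b + 1])
    Returns a list with len(bin_edges) - 1 counts.
    """
    counts = [0] * (len(bin_edges) - 1)
    for i in range(len(points)):
        # Start at i + 1 so each unordered pair is visited exactly once.
        for j in range(i + 1, len(points)):
            r = math.dist(points[i], points[j])
            b = bisect_right(bin_edges, r) - 1
            if 0 <= b < len(counts):  # ignore separations outside the bins
                counts[b] += 1
    return counts


def total_pairs(n):
    """Number of unordered pairs among n objects: n * (n - 1) / 2."""
    return n * (n - 1) // 2


sample = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (0, 0, 3)]
histogram = pair_count_histogram(sample, [0.0, 1.5, 2.5, 4.0])
pairs_for_a_billion = total_pairs(10**9)
```

With this brute-force formulation the work grows with the square of the number of objects: a billion galaxies means roughly 5 × 10^17 unordered pairs, which is why a single processor cannot keep up.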
Intel Labs, in collaboration with Lawrence Berkeley National Laboratory (LBNL) and the University of California, Berkeley (and in support of their ISAAC project), has demonstrated new techniques that significantly accelerate the computation of TPCF on these immense datasets and reduce both the cost and the energy of the quest for cosmic understanding. The approach has three components: 1) the ability to effectively distribute and manage the work across tens of thousands of Intel® Xeon® compute cores, 2) more efficient use of the SIMD (single instruction, multiple data) capabilities within each Intel Xeon core, and 3) more efficient communications among the compute nodes.
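The first and third components can be illustrated with a small sketch. This is an assumption-laden toy, not Intel's implementation: the block size, the round-robin task assignment, and all function names are hypothetical, and the "workers" are simulated serially in one process. The idea it shows is that the pair space can be cut into independent block-pair tasks, each worker can accumulate into its own private histogram, and only those small histograms need to be combined at the end.

```python
import math
from bisect import bisect_right


def block_pair_counts(block_a, block_b, bin_edges, same_block):
    """Histogram the separations between two blocks of points."""
    counts = [0] * (len(bin_edges) - 1)
    for i, p in enumerate(block_a):
        # Within a single block, skip pairs that were already visited.
        start = i + 1 if same_block else 0
        for q in block_b[start:]:
            b = bisect_right(bin_edges, math.dist(p, q)) - 1
            if 0 <= b < len(counts):
                counts[b] += 1
    return counts


def distributed_histogram(points, bin_edges, block_size, num_workers):
    """Simulate splitting the pair counting across num_workers workers."""
    blocks = [points[i:i + block_size] for i in range(0, len(points), block_size)]
    # Every unordered pair of blocks (a <= b) is one independent task.
    tasks = [(a, b) for a in range(len(blocks)) for b in range(a, len(blocks))]
    # One private histogram per worker, so no shared writes are needed.
    partials = [[0] * (len(bin_edges) - 1) for _ in range(num_workers)]
    for t, (a, b) in enumerate(tasks):
        w = t % num_workers  # round-robin assignment (an assumption)
        local = block_pair_counts(blocks[a], blocks[b], bin_edges, a == b)
        partials[w] = [x + y for x, y in zip(partials[w], local)]
    # Final reduction: only the small per-worker histograms are combined.
    return [sum(column) for column in zip(*partials)]


sample = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (0, 0, 3)]
merged = distributed_histogram(sample, [0.0, 1.5, 2.5, 4.0],
                               block_size=2, num_workers=3)
```

The second component, SIMD, is not shown here. In a vectorized implementation, the inner loop over `q` is the natural place to process several candidate points per instruction.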
This technique was tested on a 1.7 billion object dataset (provided via a collaboration between LBNL and the University of Sussex) using Lawrence Livermore National Laboratory’s Zin computer, a Petascale-class machine with 1,600 nodes, each containing two Intel Xeon processors. The calculation was completed in just over five hours, more than 35 times faster than previous approaches (see notices below). This means scientists will be able to use this technique to complete experiments in a single day rather than weeks. In addition, the experiment demonstrated an 11x improvement in cost efficiency (measured in flops/$), making these experiments more practical and affordable.
More recently, on the Texas Advanced Computing Center’s Stampede cluster, a Petascale computer using Intel® Xeon Phi™ Coprocessors (launched this week), Intel Labs achieved a further 3.2x speedup in run-time on each node compared with the results above (see notices below).
This technique provides a path to processing even larger datasets, into the Exascale domain, where new answers to many cosmological questions may be found within the next decade.
- Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance
- Intel’s compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice. Notice revision #20110804