HPC Speaker Series at Stanford Shows How Modern Code Helps Developers Build Code Designed to Take Advantage of Today’s Powerful
From the discovery of the Higgs boson particle at the Large Hadron Collider, to the development of hypersonic vehicles, to the mapping of the human genome, high performance computing (HPC) is changing the world.
And recent advancements in hardware (including multi- and many-core processors, high-bandwidth inter-processor communications fabric, lightning-fast memory, huge caches, and broad support for I/O capabilities) mean that today’s processors have the power to run increasingly demanding workloads, like big data analytics, visualization, machine learning, and more.
These are the kinds of workloads that deliver deep insights, fuel innovation, and expand human capabilities.
The first step: Get up to speed on modern code
By incorporating parallelism at multiple levels (vectorization, multithreading, and multi-node optimization), developers can take full advantage of the modern hardware capabilities that power these types of strategic breakthroughs. And by embracing a modern code approach, developers can also future-proof their code and deliver software that is scalable, portable, and built to last.
To help developers adopt a modern code approach that takes advantage of today’s powerful hardware, Intel offers tools, libraries, videos, webinars, and recorded and live trainings (many hands-on) as part of the Intel Modern Code Developer Community.
Stanford High Performance Computing Center Speaker Series
Our live trainings include practical, hands-on “lunch and learns” at the Stanford High Performance Computing Center (HPC Center), which provides high performance computing resources and services to enable computationally intensive research within the Stanford School of Engineering.
The HPC Center Lunch and Learn seminars are an opportunity for students and professional developers alike to meet face to face with HPC industry experts and learn about code modernization tools and best practices.
The most recent sessions covered:
Ways to increase Python* performance
“Intel® Distribution for Python: A Scalability Story in Production Environments,” presented by Sergey Maidanov, head of the Intel Distribution for Python* team, covered ways to develop and optimize technical computing programs in the Python language to achieve near-native code performance and to avoid the need to rewrite code.
The session covered a number of Intel high performance libraries and profilers, and described how Intel is extending support for multi-core and vector (Single Instruction, Multiple Data, or SIMD) parallelism in the Intel® Distribution for Python.
Case studies covered in the session showed speedups of 100x and more from highly optimized libraries such as NumPy/SciPy, Intel® Data Analytics Acceleration Library (Intel® DAAL), and Scikit-learn*, and illustrated how those libraries scale across multiple cores and multiple nodes. Sergey also covered how Intel® VTune™ Amplifier enables low-overhead profiling of Python and native code to identify performance hotspots.
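As a rough illustration of why delegating work to an optimized library helps, the sketch below computes the same sum of squares twice: once in an interpreted Python loop and once through NumPy's compiled dot product. The function names and the sample data are hypothetical and are not taken from the session, and the sketch makes no claim about the size of the speedup on any particular system.

```python
import numpy as np


def sum_of_squares_loop(values):
    """Pure-Python reference: one interpreted iteration per element."""
    total = 0.0
    for v in values:
        total += v * v
    return total


def sum_of_squares_numpy(values):
    """Same computation, delegated to NumPy's compiled dot-product routine."""
    arr = np.asarray(values, dtype=np.float64)
    return float(np.dot(arr, arr))


# Hypothetical sample data: 0.0, 0.5, 1.0, ... (1,000 values).
data = [0.5 * i for i in range(1000)]

loop_result = sum_of_squares_loop(data)
numpy_result = sum_of_squares_numpy(data)
```

Both functions return the same value; the difference lies in where the loop runs. Timing the two calls (for example with the standard `timeit` module) on a large array is a simple way to see the gap on a given machine.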
Performance tuning for HPC workloads
Intel senior HPC engineer Thanh Phung and Intel VTune HPC lead Dmitry Prohorov discussed Intel® VTune™ Amplifier and demonstrated how to use it to study HPC workload performance.
The session, “Deep-Dive Performance Characterization and Tuning for HPC Workloads Using Intel VTune Amplifier XE Tool,” described the iterative process needed to optimize workload performance. Using code samples, the presenters showed how to apply VTune to parallel performance tuning in order to increase CPU utilization, memory efficiency, and floating-point unit (FPU) utilization.
Putting vectorization to use
At the session “Guided Code Vectorization with Intel® Advisor XE,” Ryo Asai, a researcher at Colfax International, discussed how to use the Intel® Advisor optimization tool. He illustrated it with an example workload that computes the electric potential produced by a group of charged particles at a set of points in 3-D space; the workload achieved a 16x performance boost after undergoing optimization and vectorization.
In this example, Intel Advisor XE detected a vector dependence, a type conversion, and an inefficient memory access pattern. Ryo showed attendees how to interpret the data presented by Intel Advisor, and how to optimize the application to resolve the issues.
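To make the example workload concrete, here is a minimal sketch of the calculation it describes: the potential at each observation point is the sum of q / r over all charges, where r is the distance from the point to the charge. This is not the code shown in the session (which is not reproduced in this article); the function names are hypothetical, the Coulomb constant is assumed to be 1 for simplicity, and NumPy broadcasting stands in for compiler vectorization.

```python
import numpy as np


def potential_loop(points, charge_positions, charge_values):
    """Reference version: explicit nested loops over points and charges."""
    result = []
    for px, py, pz in points:
        phi = 0.0
        for (cx, cy, cz), q in zip(charge_positions, charge_values):
            r = ((px - cx) ** 2 + (py - cy) ** 2 + (pz - cz) ** 2) ** 0.5
            phi += q / r  # Coulomb constant assumed to be 1
        result.append(phi)
    return result


def potential_vectorized(points, charge_positions, charge_values):
    """Array version: all point-charge pairs are handled in bulk by NumPy."""
    points = np.asarray(points, dtype=np.float64)                      # (M, 3)
    charge_positions = np.asarray(charge_positions, dtype=np.float64)  # (N, 3)
    charge_values = np.asarray(charge_values, dtype=np.float64)        # (N,)
    # Broadcasting yields every point-to-charge displacement: shape (M, N, 3).
    diff = points[:, None, :] - charge_positions[None, :, :]
    dist = np.sqrt(np.sum(diff * diff, axis=2))                        # (M, N)
    # Sum each charge's contribution q / r for every observation point.
    return (charge_values[None, :] / dist).sum(axis=1)


# Hypothetical sample data: two charges and three observation points.
charges_xyz = [[0.0, 0.0, 0.0], [0.0, 0.0, 2.0]]
charges_q = [1.0, -1.0]
observation_points = [[3.0, 4.0, 0.0], [1.0, 1.0, 1.0], [-2.0, 0.5, 5.0]]

phi_loop = potential_loop(observation_points, charges_xyz, charges_q)
phi_vec = potential_vectorized(observation_points, charges_xyz, charges_q)
```

The two versions agree to floating-point tolerance. The array version keeps all data in a single floating-point type and in contiguous arrays, which loosely mirrors the type-conversion and memory-access concerns mentioned above, although the analogy to compiled code is only approximate.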
Faster machine learning applications with Intel® Performance Libraries
In “Building Faster Machine Learning Applications with Intel Performance Libraries,” Shaojuan Zhu, an Intel technical consulting engineer, and Sarah Knepper, an Intel software engineer, gave an overview of two performance libraries, the Intel® Math Kernel Library (Intel® MKL) and Intel Data Analytics Acceleration Library (Intel DAAL), which offer optimized building blocks for data analytics and machine learning algorithms. They also introduced the Intel Math Kernel Library for Deep Neural Networks (Intel MKL-DNN), which offers deep-learning framework optimization with DNN primitives.
Focusing on lower-level primitive functions, Intel MKL is a collection of routines for linear algebra, fast Fourier transform (FFT), vector math, and statistics that can be used to speed up math processing in almost every kind of technical computing application.
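For readers unfamiliar with these categories of routine, the sketch below shows a linear solve and an FFT as exposed through NumPy. It does not call Intel MKL directly; whether NumPy dispatches to Intel MKL or to another backend depends on how NumPy was built, so treat the link to MKL here as an assumption. The matrix and signal values are hypothetical.

```python
import numpy as np

# Linear algebra: solve the 2x2 system A x = b.
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
b = np.array([1.0, 2.0])
x = np.linalg.solve(A, b)

# FFT: find the dominant frequency of a sampled sine wave.
sample_rate = 64                          # samples per second, one second of data
t = np.arange(sample_rate) / sample_rate
signal = np.sin(2.0 * np.pi * 5.0 * t)    # 5 Hz tone
spectrum = np.abs(np.fft.rfft(signal))
# With one second of data, bin k of the spectrum corresponds to k Hz.
dominant_hz = int(np.argmax(spectrum))
```

Because application code only touches the NumPy interface, the same script can run unchanged on top of whichever optimized backend the installed distribution provides.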
Intel DAAL focuses on data applications and provides higher-level, ready-made solutions for supervised and unsupervised learning. This library of fundamental algorithms, optimized for Intel architecture, covers all machine learning stages, from data management and processing to modeling, and supports offline, streaming, and distributed analytics usages.
The session also covered the Intel® Distribution for Python, as well as an upcoming Intel optimized framework for Caffe*.
Next up: Simulation for IoT and wearable devices
Our next training session, “Simulation for IoT—A Multi-Physics Approach for a Wearable IoT Device,” is October 15. The discussion will center around how, in the race to develop the next blockbuster wearable device, ANSYS* Electronics tools can help ensure that Internet of Things (IoT) products deliver exceptional performance and user experiences. We’ll demonstrate ANSYS’s simulation offering and explore how it can also improve the manufacturing process and durability of smartwatch products.
Learn more and register for this hands-on, face-to-face seminar here.
And discover more about code modernization at the Intel® Software Developer Zone.
To learn more about the HPC Center, visit hpcc.stanford.edu.