Category Archives: Performance & Optimization

Understanding x86 vs ARM Memory Alignment on Android

With Google’s recent release of the NDK (r6), it is now possible to build Android applications for x86 processors in addition to ARM. In general, porting an application from ARM to x86 only involves rebuilding the native code. However, there are a few pitfalls to avoid. One difference between x86 and ARM is the memory alignment [...] Read more >

Register for Intel(R) Technical Presentation "Using Intel(R) Inspector XE 2011 with Fortran Applications" by Jackson Marusarz (Technical Consulting Engineer)

Jackson Marusarz, Technical Consulting Engineer, will be presenting on Aug 17th at 9am PDT on “Using Intel(R) Inspector XE 2011 with Fortran Applications”. Please register and attend. Read more >

Thread Safety Analysis

DreamWorks Animation seeks to thread complex rendering applications that were written before threading was commonplace.  This article shows a technique to find and fix thread safety issues by executing legacy code in a threaded test harness and monitoring execution with Intel developer tools. Our engineering engagement with DreamWorks Animation involved introducing thread parallelism in performance [...] Read more >

Parallelism as a First Class Citizen in C and C++, the time has come.

It is time to make Parallelism a full First Class Citizen in C and C++. Hardware is once again ahead of software, and we need to close the gap so that application development is better able to utilize the hardware without low-level programming. The time has come for high-level constructs for task and [...] Read more >

New Rules for Array Sections in Intel(R) Cilk(TM) Plus

Fans of Cilk Plus or language specifications may be interested in the revised specification of Intel® Cilk™ Plus posted at http://software.intel.com/file/37679/Intel_Cilk_plus_lang_spec_2.htm . Clark Nelson did most of the work of turning the previous specification into something closer to standardese and illuminating ambiguities in the previous specification. I’ll mention two important changes that the new specification to [...] Read more >

Register for Intel(R) Technical Presentation "Modeling parallelism with Intel(R) Parallel Advisor" by Dr. Paul Petersen (Architect)

Paul Petersen, Architect for the Intel(R) Parallel Studio product suite, will be presenting on July 21st at 9am PDT on “Modeling Parallelism with Intel® Parallel Advisor”. Please register and attend. Read more >

Dynamic Resolution Rendering Sample Now Live

The Dynamic Resolution Rendering sample, first shown at the Game Developers Conference 2011 in San Francisco, has now gone live in time for games developers attending my Develop UK 2011 talk to check it out. This sample demonstrates a technique for balancing rendering quality and performance by altering the resolution at runtime. Download the sample [...] Read more >

OpenMP 3.1 API Specification Available

I’m happy to share some news for all the OpenMP folks out there! Last week the OpenMP Architecture Review Board voted on the final release of the OpenMP 3.1 API Specification. After a successful vote, the Architecture Review Board has released the OpenMP 3.1 API Specification. It is available on the OpenMP webpage for [...] Read more >

What we’ve been doing to make performance analysis easier on Intel® Microarchitecture Codename Sandy Bridge

New Intel(R) Microarchitecture Codename Sandy Bridge support and tuning guide! We’ve been listening to your feedback on software tuning. Specifically, we’ve been working to make it even easier for developers to analyze software performance on Intel® Microarchitecture Codename Sandy Bridge. So now I’m really excited to tell you about the new Sandy Bridge: General Exploration [...] Read more >

Register for Mark Davis’ presentation "Intel® Parallel Advisor 2011 Shows Its Stuff on Duplo"

Mark Davis, Senior Principal Engineer for the Intel(R) Parallel Advisor 2011 product, will be presenting on June 22nd at 9am PDT on “Intel® Parallel Advisor 2011 Shows Its Stuff on Duplo”. Please register and attend. Read more >

Hamburg, Germany – ISC’11

I’m headed to Hamburg, Germany, for the International Supercomputing Conference next week. We will be talking a lot about our Many-Integrated Core (MIC) Architecture and how to realize amazing performance on highly parallel applications. We have some incredible demos and partners in our booth, so I hope you can drop by to visit us. [...] Read more >

Haswell New Instruction Descriptions Now Available!

Intel just released public details on the next generation of the x86 architecture. Arriving first in our 2013 Intel microarchitecture codename “Haswell”, the new instructions accelerate a broad category of applications and usage models. Download the full Intel® Advanced Vector Extensions Programming Reference (319433-011). These build upon the instructions coming in Intel® microarchitecture code name [...] Read more >