Transparent Huge Pages on Intel® Xeon Phi™ coprocessors

Transparent Huge Pages

As of Gold Update 1, the Intel® Many Integrated Core Platform Software Stack (MPSS) operating system has been updated to Linux kernel version 2.6.38, which adds support for transparent huge pages (THP). This feature is enabled by default and, in most cases, improves application performance without any code or environment changes.

Two factors account for the gain. First, with large memory working sets, THP takes a single page fault for a touched 2MB region instead of one page fault per 4KB region. If memory is touched sequentially, this is a 512x reduction in page faults for a 2MB region. Second, the translation lookaside buffer (TLB), which stores virtual-to-physical address mappings for individual memory pages, covers more memory, since a single TLB entry can map 2MB instead of 4KB. The Intel® Xeon Phi™ coprocessor has 64 TLB entries for 4KB pages, 8 level-1 TLB entries for 2MB pages, and 64 level-2 TLB entries for 2MB pages. Assuming just 64 entries apiece for 4KB and 2MB pages, the 2MB entries can map 128MB of memory, whereas the 4KB entries can map only 256KB.

There can be cases where the memory access pattern is so sparse that the benefit of large pages is lost, and performance may even drop because clearing and copying a page during a page fault takes longer. In these cases it can be beneficial to disable huge pages for the application. Note also that applications using the libhugetlbfs library do not need to modify their code in any way and will continue to perform as expected.
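The page-fault and TLB figures quoted above follow from simple arithmetic. The sketch below reproduces them; the helper names `tlb_reach_bytes` and `faults_for_region` are hypothetical and exist only for this illustration.

```c
#include <assert.h>

#define SMALL_PAGE_BYTES (4UL * 1024UL)          /* 4KB page */
#define HUGE_PAGE_BYTES  (2UL * 1024UL * 1024UL) /* 2MB page */

/* Memory covered by `entries` TLB entries, each mapping `page_bytes`. */
static unsigned long tlb_reach_bytes(unsigned long entries,
                                     unsigned long page_bytes)
{
    assert(page_bytes > 0);
    return entries * page_bytes;
}

/* Page faults taken when `region_bytes` is touched sequentially,
 * assuming one fault per page (rounded up to whole pages). */
static unsigned long faults_for_region(unsigned long region_bytes,
                                       unsigned long page_bytes)
{
    assert(page_bytes > 0);
    return (region_bytes + page_bytes - 1) / page_bytes;
}
```

Calling `faults_for_region(HUGE_PAGE_BYTES, SMALL_PAGE_BYTES)` gives 512 faults against 1 with a 2MB page, and `tlb_reach_bytes(64, ...)` gives 256KB for 4KB pages against 128MB for 2MB pages.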

Disabling THP support

Although THP improves performance for most applications, there are cases where it adversely affects performance. In such cases there are multiple options for disabling THP support.

The current setting, as well as all the possible setting options, can be queried using:

cat /sys/kernel/mm/transparent_hugepage/enabled

The current setting is shown in brackets. For example:

[always] madvise never
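A program that wants to check the active mode itself can pick out the bracketed token from that line. The following is a minimal sketch; the function name `thp_active_setting` is hypothetical, and it assumes the line has the format shown above.

```c
#include <assert.h>
#include <string.h>

/*
 * Copy the bracketed (active) token from a line such as
 * "[always] madvise never" into `out`.
 * Returns 0 on success, or -1 if no bracketed token is found
 * or the token does not fit in `out`.
 */
static int thp_active_setting(const char *line, char *out, size_t out_size)
{
    assert(line != NULL && out != NULL);

    const char *open = strchr(line, '[');
    if (open == NULL)
        return -1;

    const char *close = strchr(open + 1, ']');
    if (close == NULL)
        return -1;

    size_t len = (size_t)(close - (open + 1));
    if (len + 1 > out_size)   /* need room for the terminating NUL */
        return -1;

    memcpy(out, open + 1, len);
    out[len] = '\0';
    return 0;
}
```

In practice the input line would be read from /sys/kernel/mm/transparent_hugepage/enabled with `fopen` and `fgets` before being passed to this function.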

Global Disabling

THP can be disabled temporarily by setting a system variable:

echo never >/sys/kernel/mm/transparent_hugepage/enabled

In this case THP support will be disabled until a system reset (at which time it will revert to the default), or until the variable is set back using:

echo always >/sys/kernel/mm/transparent_hugepage/enabled

Note that this method requires root permissions and is a global change for any workloads running on the system during this time.

To make this permanent, add the following kernel parameter to the ExtraCommandLine option in /etc/sysconfig/mic/default.conf:

transparent_hugepage=never

Control THP per Allocation

There are two ways to control THP for individual allocations.

1. Leave THP enabled and use the madvise API to mark the memory regions that should use 4KB pages:

madvise(void *addr, size_t length, MADV_NOHUGEPAGE);

2. Set the system variable to madvise:

echo madvise >/sys/kernel/mm/transparent_hugepage/enabled

To make this permanent, add the following kernel parameter to the ExtraCommandLine option in /etc/sysconfig/mic/default.conf or in one of the per-card config files in the same directory:

transparent_hugepage=madvise

Now all allocations will use 4KB pages unless the memory region falls within a range passed to a call such as:

madvise(void *addr, size_t length, MADV_HUGEPAGE);

Note that writing to the sys variable or modifying the default.conf file requires root permissions.

Keep in mind that THP applies to stack-based memory allocations as well as heap-based allocations. If you are using the madvise APIs, you must call them for stack-based allocations as well.
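The per-allocation control described above can be sketched as follows. This is an illustration, not a definitive implementation: the helper name `map_with_thp_hint` is hypothetical, and it assumes a Linux system whose headers define MADV_HUGEPAGE and MADV_NOHUGEPAGE. The madvise result is returned separately because the call can fail, for example on a kernel built without THP support.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Map `length` bytes of anonymous memory and apply a THP hint to it.
 * `advice` is MADV_HUGEPAGE or MADV_NOHUGEPAGE.
 * Returns the mapping, or NULL if mmap() fails.
 * `*hint_rc` receives the madvise() return value (0 on success,
 * -1 on failure), so the caller can decide whether a rejected
 * hint matters.
 */
static void *map_with_thp_hint(size_t length, int advice, int *hint_rc)
{
    assert(length > 0 && hint_rc != NULL);
    *hint_rc = -1;

    /* mmap() returns a page-aligned address, as madvise() requires. */
    void *addr = mmap(NULL, length, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (addr == MAP_FAILED)
        return NULL;

    *hint_rc = madvise(addr, length, advice);
    return addr;
}
```

With the system variable set to always, passing MADV_NOHUGEPAGE opts the region out of huge pages; with it set to madvise, passing MADV_HUGEPAGE opts the region in. The mapping should be released with `munmap(addr, length)` when it is no longer needed.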
