
Hybrid architectures

Roman Storozhenko edited this page May 9, 2024 · 3 revisions

Summary

Heterogeneous (hybrid) systems, meaning Alder Lake and upcoming platforms that contain both P-cores and E-cores, may experience reduced performance for applications running on the efficient cores (E-cores) when running Linux with kernel support for Resource Control (CONFIG_X86_CPU_RESCTRL). The cause is that the L2 cache allocation is configured in a way that does not make full use of the total L2 cache available. The same applies to the current version of the pqos tool with both the OS and MSR interfaces.

Problem Statement

When Resource Control (CONFIG_X86_CPU_RESCTRL) is enabled in the kernel, the capacity bitmask (CBM) registers for each cache level that supports Intel Resource Director Technology (Intel RDT) Cache Allocation Technology (CAT) are reinitialized during the boot sequence. On heterogeneous Alder Lake systems, the capacity bitmask length for the L2 cache may vary between core types (P-core and E-core). Because the lengths differ, and because the discovered length depends on which core type was used for the initial discovery, programming a single length to all core types can produce undesirable outcomes. Linux and the pqos tool use CPUID to discover the capacity bitmask length; however, neither accounts for CBM lengths that differ by core type.
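As a rough illustration of the discovery step, the sketch below decodes the CBM length from a raw CPUID value. It assumes the L2 CAT enumeration layout from the Intel SDM, where CPUID leaf 0x10, sub-leaf 2 reports the CBM length minus one in EAX bits 4:0. The function name and the sample EAX values are hypothetical, chosen only to match the 10-bit and 16-bit masks discussed on this page.

```python
from typing import Tuple


def l2_cbm_from_cpuid_eax(eax: int) -> Tuple[int, int]:
    """Decode CPUID.(EAX=0x10, ECX=2):EAX into (cbm_length, full_mask).

    Assumption: bits 4:0 of EAX hold the CBM length minus one.
    """
    cbm_len = (eax & 0x1F) + 1
    full_mask = (1 << cbm_len) - 1
    return cbm_len, full_mask


# Hypothetical raw EAX values, as each core type might report them.
p_len, p_mask = l2_cbm_from_cpuid_eax(0x9)
e_len, e_mask = l2_cbm_from_cpuid_eax(0xF)

print(f"P-core: {p_len} ways, full mask {p_mask:#x}")
print(f"E-core: {e_len} ways, full mask {e_mask:#x}")
```

The point of the sketch is that the result depends on which core executes CPUID, so a single probe cannot describe both core types.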

For example, consider the 12th Generation Intel i9-12900E which consists of 8 P-cores and 8 E-cores. The following table represents the capacity bitmasks out of reset and after reinitialization by Resource Control.

Model Specific Register (MSR) 0xD10 – CBM for L2 Cache

Core Type | Initial value out of reset | Value after Resource Control reinitialization
P-Core    | 0x3FF                      | 0x3FF
E-Core    | 0xFFFF                     | 0x3FF

The above values assume that the initial CPUID probe to obtain the capacity bitmask length was issued from a P-core. The reduced performance results from changing the E-core L2 CBM from 16 ways (0xFFFF) to only 10 ways (0x3FF), which limits use of the L2 cache to 62.5% of its total capacity.
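The 62.5% figure follows from counting the set bits in each mask (10 of 16 ways). A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def usable_fraction(programmed_mask: int, full_mask: int) -> float:
    """Fraction of cache ways left usable by a programmed CBM."""
    usable_ways = bin(programmed_mask & full_mask).count("1")
    total_ways = bin(full_mask).count("1")
    return usable_ways / total_ways


# E-core L2: mask after reinitialization vs. mask out of reset.
fraction = usable_fraction(0x3FF, 0xFFFF)
print(f"{fraction:.1%}")  # prints 62.5%
```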

It is also possible that CPUID is issued from an E-core, resulting in a CBM length of 16 bits (full mask 0xFFFF). Since it is not possible to program a CBM value of 0xFFFF to a P-core (whose maximum value is 0x3FF), this results in an unchecked MSR access error being recorded in the kernel’s log messages.
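The two failure modes are asymmetric: a short mask written to an E-core is accepted but wastes capacity, while a long mask written to a P-core is rejected. The sketch below models only that check; `mask_fits` is a hypothetical helper, not a kernel or pqos function, and it assumes that any bit set beyond the core's maximum mask makes the write invalid.

```python
P_CORE_MAX = 0x3FF   # 10-way L2 CBM, from the table above
E_CORE_MAX = 0xFFFF  # 16-way L2 CBM, from the table above


def mask_fits(mask: int, max_mask: int) -> bool:
    """True if the mask is non-empty and sets no bit outside max_mask."""
    return mask != 0 and (mask & ~max_mask) == 0


# Probe on a P-core: 0x3FF is valid everywhere, but E-cores lose 6 ways.
print(mask_fits(0x3FF, P_CORE_MAX), mask_fits(0x3FF, E_CORE_MAX))

# Probe on an E-core: 0xFFFF cannot be programmed to a P-core.
print(mask_fits(0xFFFF, P_CORE_MAX), mask_fits(0xFFFF, E_CORE_MAX))
```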

Mitigations

There are multiple options for restoring full use of the L2 cache for the E-cores.

  1. Disable L2 CAT: Add rdt=!l2cat to the kernel boot parameters.
  2. Manually reinitialize L2 CBM: Write the full waymask to the IA32_L2_QOS_MASK_0 – IA32_L2_QOS_MASK_15 MSRs (0xD10 – 0xD1F) on at least one E-core for each cache_id.
  3. Remove Resource Control support from the kernel: Recompile the kernel and disable CONFIG_X86_CPU_RESCTRL.
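Option 2 could be scripted roughly as follows. This is a sketch under several assumptions, not a supported tool: it relies on the Linux msr driver exposing /dev/cpu/<n>/msr (module loaded, run as root), it takes 0xFFFF as the full E-core waymask from the example above, and the caller must supply one E-core CPU number per L2 cache_id. All function names are hypothetical.

```python
import os
import struct

L2_MASK_BASE = 0xD10   # IA32_L2_QOS_MASK_0
L2_MASK_COUNT = 16     # covers 0xD10 through 0xD1F


def l2_mask_writes(full_mask):
    """List the (msr_address, value) pairs needed to reset every L2 CBM."""
    return [(L2_MASK_BASE + i, full_mask) for i in range(L2_MASK_COUNT)]


def write_msr(cpu, msr, value):
    """Write one 64-bit MSR through the msr driver (assumed available)."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_WRONLY)
    try:
        # The driver uses the file offset as the MSR address.
        os.pwrite(fd, struct.pack("<Q", value), msr)
    finally:
        os.close(fd)


def restore_l2_masks(e_core_cpus, full_mask=0xFFFF):
    """Write the full waymask on one E-core per L2 cache_id."""
    for cpu in e_core_cpus:
        for msr, value in l2_mask_writes(full_mask):
            write_msr(cpu, msr, value)
```

A call might look like `restore_l2_masks([16, 20])`, where 16 and 20 are placeholder CPU numbers; the actual E-core numbering and cache_id grouping must be checked on the target system before writing any MSR.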