Fix Intel CPUID leaf 4 cache topology for SMT #1002
glitzflitz wants to merge 3 commits into oxidecomputer:master
oh nice, thanks! totally an oversight when I was in there. how visible is this under so, if this is legible in the guest, could you adjust that test to check (.. I also see that in retrospect that test assumes SMT siblings are adjacent in APIC ID, which is definitely wrong in general.)
This is what I get before the patch, which indeed shows each vCPU with its own private L1 and L2 cache.
After the patch I get
Let me add a test for this.
Also, I just noticed in propolis/phd-tests/tests/src/cpuid.rs (line 326 in dacb53d): shouldn't sibling_idx be idx/2 instead of idx/4? The thread_siblings documentation in Linux is vague, but I think it is a hex bitmask where bit N is set if CPU N is a sibling. With SMT, CPUs 0-1 set bits 0-1, which gives '3', CPUs 2-3 set bits 2-3, which gives 'c', and so on. Since each hex digit covers 4 CPUs and we are iterating over pairs of siblings, we go through 2 pairs before moving to the next hex digit, so idx/2 makes sense instead of idx/4? The current code would only advance sibling_idx every 8 CPUs. I just ran the test and it fails for me locally.
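To make the idx/2 arithmetic concrete, here is a minimal sketch. It is not the code from cpuid.rs; `pair_hex_digit` is a hypothetical helper, and it assumes 2-way SMT with siblings numbered as adjacent CPUs (0-1, 2-3, ...), as in the comment above.

```rust
/// For sibling pair `pair_idx` (CPUs 2*pair_idx and 2*pair_idx + 1), return
/// the position of the hex digit holding their bits, counted from the least
/// significant digit of the mask, and the value of that digit.
/// Hypothetical helper; assumes 2-way SMT with adjacent CPU numbering.
fn pair_hex_digit(pair_idx: usize) -> (usize, char) {
    let first_cpu = pair_idx * 2;
    // Each hex digit covers 4 CPUs, so this is the same as pair_idx / 2.
    let digit_pos = first_cpu / 4;
    // Within a digit, the pair occupies bits 0-1 or bits 2-3.
    let shift = (first_cpu % 4) as u32;
    let value = 0b11u32 << shift; // 0x3 or 0xc
    (digit_pos, char::from_digit(value, 16).unwrap())
}
```

Pairs 0 and 1 land in digit 0 (as '3' and 'c'), pairs 2 and 3 land in digit 1, so the digit index advances every 2 pairs rather than every 4.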
When SMT is enabled, L1/L2 caches should report being shared by 2 logical processors (the SMT siblings). Previously EAX[25:14] was always being set to 0, indicating no sharing, which contradicts the SMT topology reported in leaf 0xB. As per [1], EAX[25:14] indicates the maximum number of addressable IDs for logical processors sharing this cache.
This mismatch causes Linux guests to print "BUG: arch topology borken / the SMT domain not a subset of the CLS domain" during boot. Linux derives L2 cache sharing groups from leaf 4 and expects SMT siblings to share L2, but it was being informed that each vCPU has private L1/L2.
This brings the SMT handling logic in CPUID in line with what is being done for AMD in fix_amd_cache_topo(), which sets the sharing count to 2 when has_smt is true. This fixes oxidecomputer#1001.
[1]: Table 1-15. Reference for CPUID Leaf 04H https://cdrdv2-public.intel.com/775917/intel-64-architecture-processor-topology-enumeration.pdf
Signed-off-by: Amey Narkhede <ameynarkhede03@gmail.com>
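For reference, a rough sketch of the bit manipulation involved. This is not the propolis implementation; the function and constant names are made up for illustration. It relies on the leaf 4 encoding where EAX[25:14] holds the number of addressable IDs sharing the cache minus one, so a raw value of 0 means "not shared" and 1 means "shared by 2".

```rust
// Bits 25:14 of CPUID.04H:EAX (a 12-bit field).
const SHARE_SHIFT: u32 = 14;
const SHARE_MASK: u32 = 0xfff << SHARE_SHIFT;

/// Return `eax` with EAX[25:14] rewritten to advertise `logical_cpus`
/// sharers. The field stores the count minus one. Hypothetical helper.
fn set_sharing_count(eax: u32, logical_cpus: u32) -> u32 {
    assert!(logical_cpus >= 1 && logical_cpus <= 0x1000);
    (eax & !SHARE_MASK) | ((logical_cpus - 1) << SHARE_SHIFT)
}

/// Decode the number of addressable IDs sharing the cache from `eax`.
fn sharing_count(eax: u32) -> u32 {
    ((eax & SHARE_MASK) >> SHARE_SHIFT) + 1
}
```

With SMT enabled, the L1/L2 subleaves would go through something like `set_sharing_count(eax, 2)`; without SMT they keep a count of 1.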
The existing test assertion would fail on hosts with SMT enabled due to incorrect index calculations. Also add has_smt() helper to skip thread_siblings checks on non-SMT hosts and remove the unused itertools import. Signed-off-by: Amey Narkhede <ameynarkhede03@gmail.com>
Verify that the Linux guest observes the correct cache sharing topology from /sys/devices/system/cpu/cpu0/cache/. With SMT enabled, L1 and L2 caches should report sharing by SMT siblings, while L3 should be shared across all vCPUs. Signed-off-by: Amey Narkhede <ameynarkhede03@gmail.com>
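As an illustration of what such a check has to read, here is a small sketch that parses the Linux cpulist format used by shared_cpu_list (for example "0-1" or "0,2-3"). `parse_cpu_list` is a hypothetical helper, not the function used in the phd test.

```rust
/// Parse a Linux cpulist string such as "0-1" or "0,2-3" into CPU numbers.
/// Returns None if any component is not a number or a `lo-hi` range.
fn parse_cpu_list(s: &str) -> Option<Vec<usize>> {
    let mut cpus = Vec::new();
    for part in s.trim().split(',') {
        match part.split_once('-') {
            // A range like "2-3" expands to every CPU in it, inclusive.
            Some((lo, hi)) => {
                let lo: usize = lo.parse().ok()?;
                let hi: usize = hi.parse().ok()?;
                cpus.extend(lo..=hi);
            }
            // A bare number is a single CPU.
            None => cpus.push(part.parse().ok()?),
        }
    }
    Some(cpus)
}
```

A test along the lines of the commit message could then compare the parsed list for an L1/L2 index against the expected pair of SMT siblings, and the list for the L3 index against all vCPUs.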
I fixed the
Also noticed this late
Do you mean index3 😅 (L3), which should be shared across all cores? Index 0 and 1 are split between L1i and L1d. I added a test for L1 and L2 to be shared among SMT siblings and L3 to be shared among all CPUs in "Add cache topology verification to guest_cpu_topo_test".
iximeow left a comment
hello! apologies again for taking forever to get back to this :)
I have a few notes here for references (for myself and others, things I wished I'd thought to cite earlier), so I'll push another commit that adjusts some of the docs a bit, squash it all, and merge shortly. thanks again for the fix and adjusting the test to suit 🙏
separately I was taking a look at this locally with propolis-standalone and pretty confused by:
root@ubuntu:~# cat /sys/devices/system/cpu/cpu*/cache/index*/shared_cpu_list
0
0
0
0
1
1
1
1
2
2
2
2
3
3
3
3
which... comes down to propolis-standalone taking a different path to CPU profiles specifically if there is no CPUID profile set. totally different issue! in that case we take the bhyve default literally, and bhyve indeed does not indicate that L1/L2 caches are shared. on a quick skim that all pleads ignorance of SMT... except that CPUID_HTT is set in leaf 1, so at least I'm left wondering what all this looks like on a HT-capable processor with hyperthreading disabled. it's probably not terribly important. over in bhyve the default CPUID behavior should probably represent threads as siblings in the topology here too; that'd be an issue over at illumos.org.
ps, to myself from a few months ago:
(.. I also see that in retrospect that test assumes SMT siblings are adjacent in APIC ID, which is definitely wrong in general.)
a more careful reading of CACHE TOPOLOGY ENUMERATION USING CPUID LEAF 04H from the topology enumeration doc you linked, and this is said similarly in the AMD APM as well, actually does require cache-sharing cores to be colocated in APIC ID. quote Intel:
The CACHE_IDs for each cache level can be extracted from the x2APIC ID for processors that report 32-bit x2APIC ID, or from the initial APIC ID for processors that do not report x2APIC ID. ... The list of CACHE_Masks[n] of all types and levels when a bitwise AND is performed against its own x2APIC ID/APIC ID will provide the CACHE_ID[n] for that level and type that can be used to match other processors sharing the same cache ...
so cores sharing L1 or L2 cache must have APIC IDs that are the same outside bits corresponding to bits 25-14 in cpuid 4.eax. in a topology where three cores shared an L1 cache on a six-core processor, that yields APIC IDs 0, 1, 2, 4, 5, 6.
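a minimal sketch of that derivation, for the record. the function names are made up; the mask width follows the Intel doc's approach of rounding (EAX[25:14] + 1) up to a power of two and masking off that many low bits of the APIC ID.

```rust
/// Number of low APIC ID bits that distinguish sharers of one cache:
/// ceil(log2(share_field + 1)), where `share_field` is the raw EAX[25:14]
/// value from CPUID leaf 4. Hypothetical helper.
fn cache_mask_width(share_field: u32) -> u32 {
    (share_field + 1).next_power_of_two().trailing_zeros()
}

/// CACHE_ID for a logical processor: its APIC ID with the sharer bits
/// masked off. Processors with equal CACHE_IDs share the cache.
fn cache_id(apic_id: u32, share_field: u32) -> u32 {
    let mask = !((1u32 << cache_mask_width(share_field)) - 1);
    apic_id & mask
}
```

with three sharers the raw field is 2, the width is 2 bits, and APIC IDs 0, 1, 2 map to CACHE_ID 0 while 4, 5, 6 map to CACHE_ID 4. APIC IDs 3 and 7 are simply unused.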
AMD describes APIC IDs having similar relatedness in Volume 3, appendix E "Obtaining Processor Information Via the CPUID Instruction", in the section about Function 8000_001Dh:
ShareId = LocalApicId >> log2(NumSharingCache+1)
so same as Intel, you can slice up the APIC ID into bitmasks of who shares caches.
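the AMD formula as a sketch, with one assumption flagged: when NumSharingCache+1 is not a power of two, this rounds it up to the next power of two before taking log2, mirroring the Intel handling. `share_id` is a made-up name.

```rust
/// ShareId = LocalApicId >> log2(NumSharingCache + 1), using the raw
/// NumSharingCache field from CPUID Fn8000_001D. Assumption: a sharer
/// count that is not a power of two is rounded up before taking log2.
fn share_id(local_apic_id: u32, num_sharing_cache: u32) -> u32 {
    let shift = (num_sharing_cache + 1).next_power_of_two().trailing_zeros();
    local_apic_id >> shift
}
```

for 2-way SMT (NumSharingCache = 1), APIC IDs 0 and 1 get ShareId 0 and APIC IDs 2 and 3 get ShareId 1.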
When SMT is enabled, L1/L2 caches should report being shared by 2 logical processors (the SMT siblings). Previously EAX[25:14] was always being set to 0, indicating no sharing, which contradicts the SMT topology reported in leaf 0xB. As per [1], EAX[25:14] indicates the maximum number of addressable IDs for logical processors sharing this cache.
This mismatch causes Linux guests to print "BUG: arch topology borken / the SMT domain not a subset of the CLS domain" during boot. Linux derives L2 cache sharing groups from leaf 4 and expects SMT siblings to share L2, but it was being informed that each vCPU has private L1/L2.
This brings the SMT handling logic in CPUID in line with what's being done for AMD in fix_amd_cache_topo(), which sets the sharing count to 2 when has_smt is true. This fixes #1001.
[1]: Table 1-15. Reference for CPUID Leaf 04H
https://cdrdv2-public.intel.com/775917/intel-64-architecture-processor-topology-enumeration.pdf