Mirror of git://sourceware.org/git/glibc.git, synced 2025-04-12 14:21:18 +08:00
Modify the shareable cache size '__x86_shared_cache_size', which is a factor in computing the non-temporal threshold parameter '__x86_shared_non_temporal_threshold', to optimize memcpy for AMD Zen architectures. The existing implementation computes the shareable cache as 'L3 per thread, L2 per core'. Recomputing it as 'L3 per CCX (Core Complex)' brings performance gains. According to the large bench variant results, this patch also addresses the regression on AMD Zen architectures.

Backport of upstream commit 59803e81f96b479c17f583b31eac44b57591a1bf, with the fix from cb3a749a22a55645dc6a52659eea765300623f98 ("x86: Restore processing of cache size tunables in init_cacheinfo") applied.

Reviewed-by: Premachandra Mallappa <premachandra.mallappa@amd.com>
Co-Authored-by: Florian Weimer <fweimer@redhat.com>
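
To illustrate why the choice of shareable cache size matters, here is a minimal sketch of how the two schemes described above might feed into a non-temporal threshold. It is not the glibc code: the helper names are hypothetical, the 'L2 per core' term of the old scheme is left out for brevity, and the 3/4 scaling factor is an assumption made only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Old scheme (simplified): each thread is credited with its share of L3.
   Hypothetical helper; the 'L2 per core' term is omitted here.  */
static size_t
shared_l3_per_thread (size_t l3_bytes, unsigned int threads_sharing_l3)
{
  assert (threads_sharing_l3 > 0);
  return l3_bytes / threads_sharing_l3;
}

/* New scheme: the whole L3 of one CCX counts as the shareable cache.
   Hypothetical helper.  */
static size_t
shared_l3_per_ccx (size_t ccx_l3_bytes)
{
  return ccx_l3_bytes;
}

/* Assumed scaling: the threshold is 3/4 of the shareable cache size.
   Dividing first avoids overflow for large inputs.  */
static size_t
non_temporal_threshold (size_t shared_bytes)
{
  return shared_bytes / 4 * 3;
}
```

With a hypothetical CCX that has 16 MiB of L3 shared by 8 threads, the per-thread scheme yields a 2 MiB shareable cache and a 1.5 MiB threshold, while the per-CCX scheme yields 16 MiB and 12 MiB. Under these assumptions, memcpy would switch to non-temporal stores only for much larger copies.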