It was observed that AVX512 code paths can negatively affect overall
Skylake-X system performance. But we are talking specifically about
512-bit code, while AVX512VL, 256-bit variant of AVX512F instructions,
is supposed to fly as smooth as AVX2. Which is why it remains unmasked.
Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/4838)
Originally it was thought that it's possible to use AVX512VL+BW
instructions with XMM and YMM registers without kernel enabling
ZMM support, but it turned to be wrong assumption.
Reviewed-by: Rich Salz <rsalz@openssl.org>
"Optimize" is in quotes because it's rather a "salvage operation"
for now. Idea is to identify processor capability flags that
drive Knights Landing to suboptimial code paths and mask them.
Two flags were identified, XSAVE and ADCX/ADOX. Former affects
choice of AES-NI code path specific for Silvermont (Knights Landing
is of Silvermont "ancestry"). And 64-bit ADCX/ADOX instructions are
effectively mishandled at decode time. In both cases we are looking
at ~2x improvement.
AVX-512 results cover even Skylake-X :-)
Hardware used for benchmarking courtesy of Atos, experiments run by
Romain Dolbeau <romain.dolbeau@atos.net>. Kudos!
Reviewed-by: Rich Salz <rsalz@openssl.org>
Exteneded feature flags were not pulled on AMD processors, as result
a number of extensions were effectively masked on Ryzen. Original fix
for x86_64cpuid.pl addressed this problem, but messed up processor
vendor detection. This fix moves extended feature detection past
basic feature detection where it belongs. 32-bit counterpart is
harmonized too.
Reviewed-by: Rich Salz <rsalz@openssl.org>
Reviewed-by: Richard Levitte <levitte@openssl.org>
Exteneded feature flags were not pulled on AMD processors, as result a
number of extensions were effectively masked on Ryzen. It should have
been reported for Excavator since it implements AVX2 extension, but
apparently nobody noticed or cared...
Reviewed-by: Rich Salz <rsalz@openssl.org>
Add copyright to most .pl files
This does NOT cover any .pl file that has other copyright in it.
Most of those are Andy's but some are public domain.
Fix typo's in some existing files.
Reviewed-by: Richard Levitte <levitte@openssl.org>