openssl/crypto
Andrey Matyukov 63b996e752 AES-GCM enabled with AVX512 vAES and vPCLMULQDQ.
Vectorized 'stitched' encrypt + ghash implementation of AES-GCM, enabled
with the AVX512 vAES and vPCLMULQDQ instructions (available starting with
Intel's IceLake micro-architecture).

The performance details for representative IceLake Server and Client
platforms are shown below.

Performance data:
OpenSSL Speed KBs/Sec

Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (1Core/1Thread)
Payload in Bytes          16          64         256        1024        8192       16384
AES-128-GCM
  Baseline         478708.27  1118296.96  2428092.52   3518199.4  4172355.99  4235762.07
  Patched          534613.95  2009345.55  3775588.15  5059517.64  8476794.88  8941541.79
  Speedup               1.12        1.80        1.55        1.44        2.03        2.11
AES-256-GCM
  Baseline         399237.27    961699.9  2136377.65  2979889.15  3554823.37   3617757.5
  Patched          475948.13  1720128.51  3462407.12   4696832.2  7532013.16  7924953.91
  Speedup               1.19        1.79        1.62        1.58        2.12        2.19

Intel(R) Core(TM) i7-1065G7 CPU @ 1.30GHz (1Core/1Thread)
Payload in Bytes          16          64         256        1024        8192       16384
AES-128-GCM
  Baseline         259128.54   570756.43  1362554.16  1990654.57  2359128.88  2401671.58
  Patched          292139.47  1079320.95  2001974.63  2829007.46  4510318.59  4705314.41
  Speedup               1.13        1.89        1.47        1.42        1.91        1.96
AES-256-GCM
  Baseline         236000.34   550506.76  1234638.08  1716734.57   2011255.6  2028099.99
  Patched          247256.32   919731.34  1773270.43  2553239.55  3953115.14  4111227.29
  Speedup               1.05        1.67        1.44        1.49        1.97        2.03
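
The Speedup rows above appear to be the Patched throughput divided by the
Baseline throughput, rounded to two decimals. The commit message does not
state this, so it is an assumption, although it matches the published
figures. A minimal sketch that recomputes the Xeon Platinum 8380 AES-128-GCM
row under that assumption (the function and variable names are illustrative
and not part of OpenSSL):

```python
# Throughput figures (KBs/Sec) copied from the Xeon Platinum 8380
# AES-128-GCM rows of the table above.
payload_bytes = [16, 64, 256, 1024, 8192, 16384]
baseline = [478708.27, 1118296.96, 2428092.52, 3518199.4, 4172355.99, 4235762.07]
patched = [534613.95, 2009345.55, 3775588.15, 5059517.64, 8476794.88, 8941541.79]


def speedups(baseline, patched):
    """Return patched/baseline per payload size, rounded to two decimals.

    Assumption: this is how the Speedup row was derived.
    """
    return [round(p / b, 2) for b, p in zip(baseline, patched)]


for size, ratio in zip(payload_bytes, speedups(baseline, patched)):
    print(f"{size:>6} bytes: {ratio:.2f}x")
```

The same function can be applied to the other three Baseline/Patched pairs
to cross-check their Speedup rows.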

Reviewed-by: TJ O'Dwyer, Marcel Cornu, Pablo de Lara
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17239)
2022-02-10 15:10:12 +01:00
..
aes aes: make the no-asm constant time code path not the default 2022-01-31 11:39:00 +11:00
aria fix some code with obvious wrong coding style 2021-10-28 13:10:46 +10:00
asn1 Move e_os.h to include/internal 2022-02-05 05:31:09 +01:00
async Use USE_SWAPCONTEXT on IA64. 2022-01-04 12:14:19 +01:00
bf
bio Move e_os.h to include/internal 2022-02-05 05:31:09 +01:00
bn Fix typos 2022-02-07 11:23:28 +11:00
buffer
camellia Update copyright year 2021-07-29 15:41:35 +01:00
cast
chacha aarch64: support BTI and pointer authentication in assembly 2021-10-01 09:35:38 +02:00
cmac EVP_Cipher: fix the incomplete return check 2021-11-16 17:28:23 +01:00
cmp Move e_os.h to include/internal 2022-02-05 05:31:09 +01:00
cms Fix malloc failure handling of X509_ALGOR_set0() 2022-01-14 18:47:20 +01:00
comp Fix coverity 1493364 & 1493375: unchecked return value 2021-11-08 08:55:32 +10:00
conf Move e_os.h to include/internal 2022-02-05 05:31:09 +01:00
crmf Fix the return check of OBJ_obj2txt 2021-11-22 11:17:48 +01:00
ct
des Move e_os.h to include/internal 2022-02-05 05:31:09 +01:00
dh Replace size check with more meaningful pubkey check 2022-02-07 16:32:40 +01:00
dsa Fix EVP todata and fromdata when used with selection of EVP_PKEY_PUBLIC_KEY. 2022-02-03 13:48:42 +01:00
dso Move e_os.h to include/internal 2022-02-05 05:31:09 +01:00
ec Move e_os.h to include/internal 2022-02-05 05:31:09 +01:00
encode_decode Move e_os.h to include/internal 2022-02-05 05:31:09 +01:00
engine Check for presence of 1.1.x openssl runtime 2022-02-08 13:26:13 +01:00
err Move e_os.h to include/internal 2022-02-05 05:31:09 +01:00
ess
evp evp enc: cache cipher key length 2022-02-07 09:46:16 +11:00
ffc Move e_os.h to include/internal 2022-02-05 05:31:09 +01:00
hmac Adapt other parts of the source to the changed EVP_Q_digest() and EVP_Q_mac() 2021-06-23 23:00:36 +02:00
http Move e_os.h to include/internal 2022-02-05 05:31:09 +01:00
idea
kdf
lhash lhash: Avoid 32 bit right shift of a 32 bit value 2022-01-27 10:36:57 +01:00
md2
md4
md5 Update copyright year 2021-07-29 15:41:35 +01:00
mdc2
modes AES-GCM enabled with AVX512 vAES and vPCLMULQDQ. 2022-02-10 15:10:12 +01:00
objects Move e_os.h to include/internal 2022-02-05 05:31:09 +01:00
ocsp add OSSL_STACK_OF_X509_free() for commonly used pattern 2021-12-21 12:11:49 +01:00
pem Allow empty passphrase in PEM_write_bio_PKCS8PrivateKey_nid() 2022-01-26 17:15:52 +01:00
perlasm perlasm/ppc-xlate.pl: Fix build on OS X 2021-11-18 13:24:17 +01:00
pkcs7 Fix malloc failure handling of X509_ALGOR_set0() 2022-01-14 18:47:20 +01:00
pkcs12 add OSSL_STACK_OF_X509_free() for commonly used pattern 2021-12-21 12:11:49 +01:00
poly1305 Don't use __ARMEL__/__ARMEB__ in aarch64 assembly 2022-01-09 07:40:44 +01:00
property Move e_os.h to include/internal 2022-02-05 05:31:09 +01:00
rand Move e_os.h to include/internal 2022-02-05 05:31:09 +01:00
rc2
rc4
rc5
ripemd
rsa rsa: add check after calling BN_BLINDING_lock 2022-02-08 15:22:35 +01:00
seed
sha Fix outdated comments 2022-02-10 13:52:17 +01:00
siphash
sm2 Add missing check according to SM2 Digital Signature generation algorithm 2021-11-02 12:02:56 +01:00
sm3 Fix sm3ss1 translation issue in sm3-armv8.pl 2022-01-20 12:50:20 +11:00
sm4 SM4 optimization for ARM by HW instruction 2022-01-18 11:52:14 +01:00
srp fix some code with obvious wrong coding style 2021-10-28 13:10:46 +10:00
stack Fix Coverity 1493746: constant expression result 2021-11-17 08:15:35 +10:00
store Move e_os.h to include/internal 2022-02-05 05:31:09 +01:00
ts Move e_os.h to include/internal 2022-02-05 05:31:09 +01:00
txt_db
ui Move e_os.h to include/internal 2022-02-05 05:31:09 +01:00
whrlpool
x509 Move e_os.h to include/internal 2022-02-05 05:31:09 +01:00
alphacpuid.pl
arm64cpuid.pl aarch64: fix branch target indications in arm64cpuid.pl and keccak1600 2022-02-09 13:24:31 +11:00
arm_arch.h Optimize AES-GCM for uarchs with unroll and new instructions 2022-01-25 14:30:00 +11:00
armcap.c Optimize AES-GCM for uarchs with unroll and new instructions 2022-01-25 14:30:00 +11:00
armv4cpuid.pl
asn1_dsa.c
bsearch.c
build.info Statically link the legacy provider to endecode_test 2022-01-11 11:00:21 +00:00
c64xpluscpuid.pl
context.c Add missing CRYPTO_THREAD_cleanup_local of default_context_thread_local 2022-02-04 08:59:08 +01:00
core_algorithm.c CORE: add a provider argument to ossl_method_construct() 2021-10-27 12:41:10 +02:00
core_fetch.c CORE: Ensure that cached fetches can be done per provider 2021-10-27 12:41:15 +02:00
core_namemap.c Move e_os.h to include/internal 2022-02-05 05:31:09 +01:00
cpt_err.c err: add additional errors 2022-01-12 20:10:21 +11:00
cpuid.c Move e_os.h to include/internal 2022-02-05 05:31:09 +01:00
cryptlib.c Move e_os.h to include/internal 2022-02-05 05:31:09 +01:00
ctype.c
cversion.c
der_writer.c
dllmain.c Move e_os.h to include/internal 2022-02-05 05:31:09 +01:00
ebcdic.c
ex_data.c
getenv.c Move e_os.h to include/internal 2022-02-05 05:31:09 +01:00
ia64cpuid.S
info.c Move e_os.h to include/internal 2022-02-05 05:31:09 +01:00
init.c Move e_os.h to include/internal 2022-02-05 05:31:09 +01:00
initthread.c Avoid a race in init_thread_stop() 2021-11-12 17:16:14 +00:00
LPdir_nyi.c
LPdir_unix.c fix some code with obvious wrong coding style 2021-10-28 13:10:46 +10:00
LPdir_vms.c
LPdir_win32.c
LPdir_win.c
LPdir_wince.c
mem_clr.c
mem_sec.c Move e_os.h to include/internal 2022-02-05 05:31:09 +01:00
mem.c Move e_os.h to include/internal 2022-02-05 05:31:09 +01:00
mips_arch.h
o_dir.c Move e_os.h to include/internal 2022-02-05 05:31:09 +01:00
o_fopen.c Move e_os.h to include/internal 2022-02-05 05:31:09 +01:00
o_init.c Move e_os.h to include/internal 2022-02-05 05:31:09 +01:00
o_str.c Move e_os.h to include/internal 2022-02-05 05:31:09 +01:00
o_time.c
packet.c
param_build_set.c param build set: add errors to failure returns 2022-01-12 20:10:21 +11:00
param_build.c Add support for signed BIGNUMs in the OSSL_PARAM_BLD API 2022-01-26 21:35:39 +01:00
params_dup.c Move e_os.h to include/internal 2022-02-05 05:31:09 +01:00
params_from_text.c Allow sign extension in OSSL_PARAM_allocate_from_text() 2021-11-24 19:18:19 +01:00
params.c Add support for signed BIGNUMs in the OSSL_PARAM API 2022-01-26 21:35:39 +01:00
pariscid.pl
passphrase.c Fix invalid malloc failures in PEM_write_bio_PKCS8PrivateKey() 2022-01-26 17:15:52 +01:00
ppccap.c Add support for BSD-ppc, BSD-ppc64 and BSD-ppc64le configurations 2021-12-09 16:07:14 +11:00
ppccpuid.pl
provider_child.c Stop receiving child callbacks in a child libctx when appropriate 2021-11-12 17:16:14 +00:00
provider_conf.c Refactor: a separate func for provider activation from config 2021-12-01 15:49:38 +01:00
provider_core.c ossl_provider_add_to_store: Avoid use-after-free 2021-12-17 17:33:49 +01:00
provider_local.h make struct provider_info_st a full type 2021-06-24 14:48:15 +01:00
provider_predefined.c make struct provider_info_st a full type 2021-06-24 14:48:15 +01:00
provider.c Correctly activate the provider in OSSL_PROVIDER_try_load 2021-11-12 17:16:14 +00:00
punycode.c Move more general parts of internal/cryptlib.h to new internal/common.h 2021-11-17 15:48:37 +01:00
README-sparse_array.md
s390x_arch.h Add default provider support for Keccak 224, 256, 384 and 512 2021-09-23 12:07:57 +10:00
s390xcap.c
s390xcpuid.pl
self_test_core.c
sparccpuid.S
sparcv9cap.c Split bignum code out of the sparcv9cap.c 2021-07-15 09:33:04 +02:00
sparse_array.c
threads_lib.c
threads_none.c
threads_pthread.c Defined out MUTEX attributes not available on NonStop SPT Threads. 2021-07-02 12:33:45 +10:00
threads_win.c Explicitly #include <synchapi.h> is unnecessary 2021-09-23 14:07:18 +02:00
trace.c Move e_os.h to include/internal 2022-02-05 05:31:09 +01:00
uid.c Openssl fails to compile on Debian with kfreebsd kernels 2021-09-02 10:02:32 +10:00
vms_rms.h
x86_64cpuid.pl
x86cpuid.pl