Commit Graph

419 Commits

Author SHA1 Message Date
Dimitri Papadopoulos
164a541b93 Fix new typos found by codespell
Reviewed-by: Paul Yang <kaishen.yy@antfin.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/23133)
2023-12-29 10:12:05 +01:00
Phoebe Chen
ebecf322e5 Provide additional AES-GCM test patterns to enhance test coverage.
To enhance test coverage for AES-GCM mode, we provided longer additional
testing patterns for AES-GCM testing.

Signed-off-by: Phoebe Chen <phoebe.chen@sifive.com>
Signed-off-by: Jerry Shih <jerry.shih@sifive.com>

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21923)
2023-10-26 15:55:50 +01:00
Jerry Shih
d056e90ee5 riscv: Provide vector crypto implementation of AES-GCM mode.
To accelerate the performance of the AES-GCM mode, in this patch, we
have the specialized multi-block implementations for AES-128-GCM,
AES-192-GCM and AES-256-GCM.

Signed-off-by: Phoebe Chen <phoebe.chen@sifive.com>
Signed-off-by: Jerry Shih <jerry.shih@sifive.com>

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21923)
2023-10-26 15:55:50 +01:00
Jerry Shih
3645eb0be2 Update for Zvkb extension.
c8ddeb7e64/doc/vector/riscv-crypto-vector-zvkb.adoc
Create `RISCV_HAS_ZVKB()` macro.
Use zvkb for SM4 instead of zvbb.
Use zvkb for ghash instead of zvbb.
We could just use the zvbb's subset `zvkb` for flexibility.

Signed-off-by: Jerry Shih <jerry.shih@sifive.com>
Signed-off-by: Phoebe Chen <phoebe.chen@sifive.com>

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21923)
2023-10-26 15:55:50 +01:00
Phoebe Chen
33469d0370 Fix typo in ghash-riscv64*.pl
Changed "mutiple" to "multiple" for improved clarity and correctness.

Signed-off-by: Phoebe Chen <phoebe.chen@sifive.com>

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21923)
2023-10-26 15:55:49 +01:00
Christoph Müllner
5191bcc816 riscv: GCM: Provide a Zvkg-based implementation
The upcoming RISC-V vector crypto extensions feature
a Zvkg extension, that provides a vghmac.vv instruction.
This patch provides an implementation that utilizes this
extension if available.

Tested on QEMU and no regressions observed.

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21923)
2023-10-26 15:55:49 +01:00
Christoph Müllner
003f569814 riscv: GCM: Provide a Zvbb/Zvbc-based implementation
The RISC-V vector crypto extensions features a Zvbc extension
that provides a carryless multiplication ('vclmul.vv') instruction.
This patch provides an implementation that utilizes this
extension if available.

Tested on QEMU and no regressions observed.

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21923)
2023-10-26 15:55:49 +01:00
trigpolynom
0fbc50ef0c aes-gcm-avx512.pl: fix non-reproducibility issue
Replace the random suffix with a counter, to make the
build reproducible.

Fixes #20954

Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Matthias St. Pierre <Matthias.St.Pierre@ncp-e.com>
Reviewed-by: Tom Cosgrove <tom.cosgrove@arm.com>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/22415)
2023-10-26 15:27:19 +01:00
Evgeny Karpov
636ee1d0b8 * Enable extra Arm64 optimization on Windows for GHASH, RAND and AES
Reviewed-by: Tom Cosgrove <tom.cosgrove@arm.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21673)
2023-10-10 15:37:41 +02:00
Matt Caswell
da1c088f59 Copyright year updates
Reviewed-by: Richard Levitte <levitte@openssl.org>
Release: yes
2023-09-07 09:59:15 +01:00
Heiko Stuebner
3e76b38852 riscv: Clarify dual-licensing wording for GCM and AES
The original text for the Apache + BSD dual licensing for riscv GCM and AES
perlasm was taken from other openSSL users like crypto/crypto/LPdir_unix.c .

Though Eric pointed out that the dual-licensing text could be read in a
way negating the second license [0] and suggested to clarify the text
even more.

So do this here for all of the GCM, AES and shared riscv.pm .

We already had the agreement of all involved developers for the actual
dual licensing in [0] and [1], so this is only a better clarification
for this.

[0] https://github.com/openssl/openssl/pull/20649#issuecomment-1589558790
[1] https://github.com/openssl/openssl/pull/21018

Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>

Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21357)
2023-07-06 12:53:27 +10:00
Tomas Mraz
44957a4932 Do not use stitched AES-GCM implementation on PPC32
The implementation is not usable there at all.
Fixes #21301

Reviewed-by: Hugo Landau <hlandau@openssl.org>
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21312)

(cherry picked from commit b256d32915)
2023-06-30 08:31:49 +10:00
fisher.yu
6c0ecc2bce Fix function signatures in aes-gcm-armv8 comments.
Reviewed-by: Tom Cosgrove <tom.cosgrove@arm.com>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21208)
2023-06-16 20:15:24 +10:00
Dimitri Papadopoulos
eb4129e12c Fix typos found by codespell
Typos in doc/man* will be fixed in a different commit.

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/20910)
2023-06-15 10:11:46 +10:00
Heiko Stuebner
33523d6d66 riscv: GCM: dual-license under Apache + 2-clause BSD
To allow re-use of the already reviewed openSSL crypto code for RISC-V in
other projects - like the Linux kernel, add a second license (2-clause BSD)
to the recently added GCM ghash functions.

Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Todd Short <todd.short@me.com>
(Merged from https://github.com/openssl/openssl/pull/20649)
2023-06-11 01:26:45 -04:00
Xiaokang Qian
09bd0d05a6 Fix arm64 asm code back compatible issue with gcc 4.9.4
Fix: #20963

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/20967)
2023-05-31 10:50:28 +10:00
JerryDevis
507356598b aes-gcm-armv8_64 asm support bigdian
Reviewed-by: Tom Cosgrove <tom.cosgrove@arm.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/20489)

(cherry picked from commit 32344a74b7)
2023-05-09 16:21:04 +02:00
Evan Miller
175645a1a6 Do not build P10-specific AES-GCM assembler on macOS
Reviewed-by: Tom Cosgrove <tom.cosgrove@arm.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/20543)
2023-03-22 14:26:26 +01:00
Tomas Mraz
2dbddfab08 aes-gcm-avx512.pl: Fix the clang version detection on Apple Oses
Reviewed-by: Tom Cosgrove <tom.cosgrove@arm.com>
Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/20519)

(cherry picked from commit 110dac5783)
2023-03-17 11:25:25 +01:00
Christoph Müllner
f3fed0d5fc riscv: GCM: Implement GHASH()
RISC-V currently only offers a GMULT() callback for accelerated
processing. Let's implement the missing piece to have GHASH()
available as well. Like GMULT(), we provide a variant for
systems with the Zbkb extension (including brev8).

The integration follows the existing pattern for GMULT()
in RISC-V. We keep the C implementation as we need to decide
if we can call an optimized routine at run-time.
The C implementation is the fall-back in case we don't have
any extensions available that can be used to accelerate
the calculation.

Tested with all combinations of possible extensions
on QEMU (limiting the available instructions accordingly).
No regressions observed.

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/20078)
2023-03-16 13:12:19 +11:00
Christoph Müllner
b24684369b riscv: GCM: Simplify GCM calculation
The existing GCM calculation provides some potential
for further optimizations. Let's use the demo code
from the RISC-V cryptography extension groups
(https://github.com/riscv/riscv-crypto), which represents
the extension architect's intended use of the clmul instruction.

The GCM calculation depends on bit and byte reversal.
Therefore, we use the corresponding instructions to do that
(if available at run-time).

The resulting computation becomes quite compact and passes
all tests.

Note, that a side-effect of this change is a reduced register
usage in .gmult(), which opens the door for an efficient .ghash()
implementation.

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/20078)
2023-03-16 13:12:19 +11:00
Christoph Müllner
75623ed8d0 riscv: GCM: Use riscv.pm
A recent commit introduced a Perl module for common code.
This patch changes the GCM code to use this module, removes duplicated code,
and moves the instruction encoding functions into the module.

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/20078)
2023-03-16 13:12:19 +11:00
Christoph Müllner
86c69fe841 riscv: Clean up extension test macros
In RISC-V we have multiple extensions, that can be
used to accelerate processing.
The known extensions are defined in riscv_arch.def.
From that file test functions of the following
form are generated: RISCV_HAS_$ext().

In recent commits new ways to define the availability
of these test macros have been defined. E.g.:
  #define RV32I_ZKND_ZKNE_CAPABLE   \
          (RISCV_HAS_ZKND() && RISCV_HAS_ZKNE())
  [...]
  #define RV64I_ZKND_ZKNE_CAPABLE   \
          (RISCV_HAS_ZKND() && RISCV_HAS_ZKNE())

This leaves us with two different APIs to test capabilities.
Further, creating the same macros for RV32 and RV64 results
in duplicated code (see example above).

This inconsistent situation makes it hard to integrate
further code. So let's clean this up with the following steps:
* Replace RV32I_* and RV64I_* macros by RICSV_HAS_* macros
* Move all test macros into riscv_arch.h
* Use "AND" and "OR" to combine tests with more than one extension
* Rename include files for accelerated processing (remove extension
  postfix).

We end up with compile time tests for RV32/RV64 and run-time tests
for available extensions. Adding new routines (e.g. for vector crypto
instructions) should be straightforward.

Testing showed no regressions.

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/20078)
2023-03-16 13:12:19 +11:00
Tom Cosgrove
4596c20b86 Fix the return values of the aarch64 unroll8_eor_aes_gcm_*_*_kernel functions
These aren't currently checked when they are called in cipher_aes_gcm_hw_armv8.inc,
but they are declared as returning as size_t the number of bytes they have processed,
and the aes_gcm_*_*_kernel (unroll by 4) versions of these do return the correct
values.

Change-Id: Ic3eaf139e36e29e8779b5bd8b867c08fde37a337

Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/20191)
2023-02-08 17:15:14 +01:00
Tomas Mraz
50d9b2b5f1 Do not build P10-specific AES-GCM assembler on AIX
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/19865)

(cherry picked from commit 5c92ac52c2)
2022-12-14 12:53:05 +01:00
Xu Yizhou
2788b56f0c providers: Add SM4 XTS implementation
Signed-off-by: Xu Yizhou <xuyizhou1@huawei.com>

Reviewed-by: Hugo Landau <hlandau@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/19619)
2022-11-29 16:17:30 +01:00
Tomas Mraz
be0161ff10 gcm_get_funcs(): Add missing fallback for ghash on x86_64
Fixes #19673

Reviewed-by: Hugo Landau <hlandau@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/19674)
2022-11-15 22:35:41 +01:00
Richard Levitte
e077455e9e Stop raising ERR_R_MALLOC_FAILURE in most places
Since OPENSSL_malloc() and friends report ERR_R_MALLOC_FAILURE, and
at least handle the file name and line number they are called from,
there's no need to report ERR_R_MALLOC_FAILURE where they are called
directly, or when SSLfatal() and RLAYERfatal() is used, the reason
`ERR_R_MALLOC_FAILURE` is changed to `ERR_R_CRYPTO_LIB`.

There were a number of places where `ERR_R_MALLOC_FAILURE` was reported
even though it was a function from a different sub-system that was
called.  Those places are changed to report ERR_R_{lib}_LIB, where
{lib} is the name of that sub-system.
Some of them are tricky to get right, as we have a lot of functions
that belong in the ASN1 sub-system, and all the `sk_` calls or from
the CRYPTO sub-system.

Some extra adaptation was necessary where there were custom OPENSSL_malloc()
wrappers, and some bugs are fixed alongside these changes.

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/19301)
2022-10-05 14:02:03 +02:00
Juergen Christ
cd854f225b Fix GHASH-ASM implementation on s390x
s390x GHASH assembler implementation assumed it was called from a
gcm128_context structure where the Xi paramter to the ghash function was
embedded in that structure.  Since the structure layout resembles the paramter
block required for kimd-GHASH, the assembler code simply assumed the 128 bytes
after Xi are the hash subkey.

This assumption was broken with the introduction of AES-GCM-SIV which uses the
GHASH implementation without a gcm128_context structure.  Furthermore, the
bytes following the Xi input parameter to the GHASH function do not contain
the hash subkey.  To fix this, we remove the assumption about the calling
context and build the parameter block on the stack.  This requires some
copying of data to and from the stack.  While this introduces a performance
degradation, new systems anyway use kma for GHASH/AES-GCM.

Finally fixes #18693 for s390x.

Signed-off-by: Juergen Christ <jchrist@linux.ibm.com>

Reviewed-by: Todd Short <todd.short@me.com>
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/18939)
2022-08-09 10:52:08 +01:00
Todd Short
0113ec8460 Implement AES-GCM-SIV (RFC8452)
Fixes #16721

This uses AES-ECB to create a counter mode AES-CTR32 (32bit counter, I could
not get AES-CTR to work as-is), and GHASH to implement POLYVAL. Optimally,
there would be separate polyval assembly implementation(s), but the only one
I could find (and it was SSE2 x86_64 code) was not Apache 2.0 licensed.

This implementation lives only in the default provider; there is no legacy
implementation.

The code offered in #16721 is not used; that implementation sits on top of
OpenSSL, this one is embedded inside OpenSSL.

Full test vectors from RFC8452 are included, except the 0 length plaintext;
that is not supported; and I'm not sure it's worthwhile to do so.

Reviewed-by: Hugo Landau <hlandau@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/18693)
2022-07-29 08:32:16 -04:00
Tomas Mraz
186be8ed26 Fix regression from GCM mode refactoring
Fixes #18896

Reviewed-by: Todd Short <todd.short@me.com>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/18903)
2022-07-29 10:32:49 +10:00
Juergen Christ
48e35b99bd s390x: Fix GCM setup
Rework of GCM code did not include s390x causing NULL pointer dereferences on
GCM operations other than AES-GCM on platforms that support kma.  Fix this by
a proper setup of the function pointers.

Fixes: 92c9086e5c ("Use separate function to get GCM functions")

Signed-off-by: Juergen Christ <jchrist@linux.ibm.com>

Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/18862)
2022-07-26 12:27:59 +01:00
Todd Short
d50e0934e5 Clean up GCM_MUL and remove GCM_FUNCREF_4BIT
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/18835)
2022-07-22 08:34:13 -04:00
Todd Short
95201ef457 Clean up use of GHASH macro
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/18835)
2022-07-22 08:34:13 -04:00
Todd Short
92c9086e5c Use separate function to get GCM functions
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/18835)
2022-07-22 08:34:13 -04:00
Todd Short
7da952bcc5 Remove some unused 4bit GCM code
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/18835)
2022-07-22 08:34:13 -04:00
Todd Short
7b6e19fc4e Remove unused 1bit GCM implementation
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/18835)
2022-07-22 08:34:12 -04:00
Todd Short
a8b5128fd7 Remove unused 8bit GCM implementation
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/18835)
2022-07-22 08:34:12 -04:00
Daniel Fiala
36c269c302 Change loops conditions to make zero loop risk more obvious.
Fixes openssl#18073.

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Todd Short <todd.short@me.com>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/18327)
2022-05-24 14:11:20 +10:00
Sebastian Andrzej Siewior
9968c77539 Rename x86-32 assembly files from .s to .S.
Rename x86-32 assembly files from .s to .S. While processing the .S file
gcc will use the pre-processor whic will evaluate macros and ifdef. This
is turn will be used to enable the endbr32 opcode based on the __CET__
define.

Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/18353)
2022-05-24 13:16:06 +10:00
Henry Brausen
999376dcf3 Add clmul-based gmult for riscv64 with Zbb, Zbc
ghash-riscv64.pl implements 128-bit galois field multiplication for
use in the GCM mode using RISC-V carryless multiplication primitives.

The clmul-accelerated routine can be selected by setting the Zbb and
Zbc bits of the OPENSSL_riscvcap environment variable at runtime.

Reviewed-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Signed-off-by: Henry Brausen <henry.brausen@vrull.eu>

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17640)
2022-05-19 16:32:49 +10:00
Matt Caswell
fecb3aae22 Update copyright year
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Release: yes
2022-05-03 13:34:51 +01:00
XiaokangQian
3b5b91992c Fix incorrect comments in aes-gcm-armv8-unroll8_64.pl
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17918)
2022-03-22 21:07:12 +11:00
Andrey Matyukov
224ea84b40 aes-gcm-avx512.pl: Fixed mingw64 build
Decoration prefix for some assembler labels in aes-gcm-avx512.pl was
fixed for mingw64 build.

Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17868)
2022-03-14 17:08:27 +01:00
XiaokangQian
2507903eb7 Fix build issue with aes-gcm-armv8-unroll8_64.S on older aarch64 assemblers
The EOR3 instruction is implemented with .inst, and the code here is enabled
using run-time detection of the CPU capabilities, so no need to explicitly
ask for the sha3 extension.

Fixes #17773

Reviewed-by: Kurt Roeckx <kurt@roeckx.be>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17795)
2022-03-04 11:11:51 +01:00
Andrey Matyukov
63b996e752 AES-GCM enabled with AVX512 vAES and vPCLMULQDQ.
Vectorized 'stitched' encrypt + ghash implementation of AES-GCM enabled
with AVX512 vAES and vPCLMULQDQ instructions (available starting Intel's
IceLake micro-architecture).

The performance details for representative IceLake Server and Client
platforms are shown below

Performance data:
OpenSSL Speed KBs/Sec
Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (1Core/1Thread)
Payload in Bytes       16          64        256         1024        8192      16384
AES-128-GCM
  Baseline      478708.27   1118296.96  2428092.52  3518199.4   4172355.99  4235762.07
  Patched       534613.95   2009345.55  3775588.15  5059517.64  8476794.88  8941541.79
  Speedup            1.12         1.80        1.55        1.44        2.03        2.11

AES-256-GCM
  Baseline      399237.27   961699.9    2136377.65  2979889.15  3554823.37  3617757.5
  Patched       475948.13   1720128.51  3462407.12  4696832.2   7532013.16  7924953.91
  Speedup            1.19        1.79         1.62        1.58        2.12        2.19
Intel(R) Core(TM) i7-1065G7 CPU @ 1.30GHz (1Core/1Thread)
Payload in Bytes       16          64        256         1024        8192      16384
AES-128-GCM
  Baseline      259128.54   570756.43   1362554.16  1990654.57  2359128.88  2401671.58
  Patched       292139.47   1079320.95  2001974.63  2829007.46  4510318.59  4705314.41
  Speedup            1.13        1.89         1.47        1.42        1.91        1.96
AES-256-GCM
  Baseline      236000.34   550506.76   1234638.08  1716734.57  2011255.6   2028099.99
  Patched       247256.32   919731.34   1773270.43  2553239.55  3953115.14  4111227.29
  Speedup            1.05        1.67         1.44        1.49        1.97        2.03

Reviewed-by: TJ O'Dwyer, Marcel Cornu, Pablo de Lara
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17239)
2022-02-10 15:10:12 +01:00
Danny Tsen
345c99b665 Fixed counter overflow
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17607)
2022-02-07 11:29:18 +11:00
Dimitris Apostolou
07c5465e98 Fix typos
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17634)
2022-02-07 11:23:28 +11:00
XiaokangQian
954f45ba4c Optimize AES-GCM for uarchs with unroll and new instructions
Increase the block numbers to 8 for every iteration.  Increase the hash
table capacity.  Make use of EOR3 instruction to improve the performance.

This can improve performance 25-40% on out-of-order microarchitectures
with a large number of fast execution units, such as Neoverse V1.  We also
see 20-30% performance improvements on other architectures such as the M1.

Assembly code reviewd by Tom Cosgrove (ARM).

Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/15916)
2022-01-25 14:30:00 +11:00
Danny Tsen
44a563dde1 AES-GCM performance optimzation with stitched method for p9+ ppc64le
Assembly code reviewed by Shricharan Srivatsan <ssrivat@us.ibm.com>

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/16854)
2022-01-24 11:25:53 +11:00