This patch implements the SM4 optimization for ARM processor,
using SM4 HW instruction, which is an optional feature of
crypto extension for aarch64 V8.
Tested on some modern ARM micro-architectures with SM4 support, the
performance uplift can be observed around 8X~40X over existing
C implementation in openssl. Algorithms that can be parallelized
(like CTR, ECB, CBC decryption) are on higher end, with algorithm
like CBC encryption on lower end (due to inter-block dependency)
Perf data on Yitian-710 2.75GHz hardware, before and after optimization:
Before:
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
SM4-CTR 105787.80k 107837.87k 108380.84k 108462.08k 108549.46k 108554.92k
SM4-ECB 111924.58k 118173.76k 119776.00k 120093.70k 120264.02k 120274.94k
SM4-CBC 106428.09k 109190.98k 109674.33k 109774.51k 109827.41k 109827.41k
After (7.4x - 36.6x faster):
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
SM4-CTR 781979.02k 2432994.28k 3437753.86k 3834177.88k 3963715.58k 3974556.33k
SM4-ECB 937590.69k 2941689.02k 3945751.81k 4328655.87k 4459181.40k 4468692.31k
SM4-CBC 890639.88k 1027746.58k 1050621.78k 1056696.66k 1058613.93k 1058701.31k
Signed-off-by: Daniel Hu <Daniel.Hu@arm.com>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17455)
Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17528)
Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17528)
Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17528)
Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17528)
Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17528)
The `func` parameter was incorrect. It was documented as `const char *func`
instead of `const char **func`.
CLA: trivial
Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17522)
These compilers define _ARCH_PPC64 for 32 bit builds
so we cannot depend solely on this define to identify
32 bit build.
Fixes#17087
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17497)
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17499)
During counting of the unprocessed records, return code is treated in a
wrong way. This forces kTLS RX path to be skipped in case of presence
of unprocessed records.
CLA: trivial
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17492)
This takes out the lock step stacks that allow a fast property to name
resolution. Follow on from #17325.
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17388)
Also update and slightly extend the respective documentation and simplify some code.
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/16251)
SM3 hardware instruction is optional feature of crypto extension for
aarch64. This implementation accelerates SM3 via SM3 instructions. For
the platform not supporting SM3 instruction, the original C
implementation still works. Thanks to AliBaba for testing and reporting
the following perf numbers for Yitian710:
Benchmark on T-Head Yitian-710 2.75GHz:
Before:
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
sm3 49297.82k 121062.63k 223106.05k 283371.52k 307574.10k 309400.92k
After (33% - 74% faster):
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
sm3 65640.01k 179121.79k 359854.59k 481448.96k 534055.59k 538274.47k
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17454)
EVP_MD_CTX_FLAG_NON_FIPS_ALLOW macro is obsolete and unused from
openssl-3.0 onwards
CLA: trivial
Signed-off-by: Shreenidhi Shedi <sshedi@vmware.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17484)
Add null checks to avoid dereferencing a pointer that could be null.
Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: David von Oheimb <david.von.oheimb@siemens.com>
(Merged from https://github.com/openssl/openssl/pull/17488)
PR #17255 fixed a bug in EVP_DigestInit_ex(). While backporting the PR
to 1.1.1 (see #17472) I spotted an error in the original patch. This fixes
it.
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17473)
Not all platforms support tsan operations, those that don't need to have an
alternative locking path.
Fixes#17447
Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
(Merged from https://github.com/openssl/openssl/pull/17479)
Most of the DRGB code is run under lock from the EVP layer. This is relied
on to make the majority of TSAN operations safe. However, it is still necessary
to enable locking for all DRBGs created.
Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
(Merged from https://github.com/openssl/openssl/pull/17479)
Doing the tsan operations under lock would be difficult to arrange here (locks
require memory allocation).
Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
(Merged from https://github.com/openssl/openssl/pull/17479)
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17471)
Specifically:
* out of range
* unsigned negatives
* inexact reals
* bad param types
* buffers that are too small
* null function arguments
* unknown sizes of real
Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17440)
Although we had a test for fetching an encoder/decoder/store loader it
did not use a query string. The issue highlighted by #17456 only occurs
if a query string is used.
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17459)
Attempting to fetch one of the above and providing a query string was
failing with an internal assertion error. We must ensure that we give the
provider when calling ossl_method_store_cache_set()
Fixes#17456
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17459)
OSSL_PARAMs that are of type OSSL_PARAM_INTEGER or
OSSL_PARAM_UNSIGNED_INTEGER can be obtained using any of the functions
EVP_PKEY_get_int_param(), EVP_PKEY_get_size_t_param() or
EVP_PKEY_get_bn_param(). The former two will fail if the parameter is too
large to fit into the C variable. We clarify this in the documentation.
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17445)
We already statically link libcrypto to endecode_test even in a "shared"
build. This can cause problems on some platforms with tests that load the
legacy provider which is dynamically linked to libcrypto. Two versions of
libcrypto are then linked to the same executable which can lead to crashes.
Fixes#17059
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17345)
We check that the init and cleanup functions for the custom method are
called as expected.
Based on an original reproducer by Dmitry Belyavsky from issue #17149.
Reviewed-by: Dmitry Belyavskiy <beldmit@gmail.com>
(Merged from https://github.com/openssl/openssl/pull/17255)