This is likely the leftover of a previous hack,
and thus should be removed now.
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17259)
bn_sqr_comba8 does for instance compute a wrong result for the value:
a=0x4aaac919 62056c84 fba7334e 1a6be678 022181ba fd3aa878 899b2346 ee210f45
The correct result is:
r=0x15c72e32 605a3061 d11b1012 3c187483 6df96999 bd0c22ba d3e7d437 4724a82f
912c5e61 6a187efe 8f7c47fc f6945fe5 75be8e3d 97ed17d4 7950b465 3cb32899
but the actual result was:
r=0x15c72e32 605a3061 d11b1012 3c187483 6df96999 bd0c22ba d3e7d437 4724a82f
912c5e61 6a187efe 8f7c47fc f6945fe5 75be8e3c 97ed17d4 7950b465 3cb32899
so the forth word of the result was 0x75be8e3c but should have been
0x75be8e3d instead.
Likewise bn_sqr_comba4 has an identical bug for the same value as well:
a=0x022181ba fd3aa878 899b2346 ee210f45
correct result:
r=0x00048a69 9fe82f8b 62bd2ed1 88781335 75be8e3d 97ed17d4 7950b465 3cb32899
wrong result:
r=0x00048a69 9fe82f8b 62bd2ed1 88781335 75be8e3c 97ed17d4 7950b465 3cb32899
Fortunately the bn_mul_comba4/8 code paths are not affected.
Also the mips64 target does in fact not handle the carry propagation
correctly.
Example:
a=0x4aaac91900000000 62056c8400000000 fba7334e00000000 1a6be67800000000
022181ba00000000 fd3aa87800000000 899b234635dad283 ee210f4500000001
correct result:
r=0x15c72e32272c4471 392debf018c679c8 b85496496bf8254c d0204f36611e2be1
0cdb3db8f3c081d8 c94ba0e1bacc5061 191b83d47ff929f6 5be0aebfc13ae68d
3eea7a7fdf2f5758 42f7ec656cab3cb5 6a28095be34756f2 64f24687bf37de06
2822309cd1d292f9 6fa698c972372f09 771e97d3a868cda0 dc421e8a00000001
wrong result:
r=0x15c72e32272c4471 392debf018c679c8 b85496496bf8254c d0204f36611e2be1
0cdb3db8f3c081d8 c94ba0e1bacc5061 191b83d47ff929f6 5be0aebfc13ae68d
3eea7a7fdf2f5758 42f7ec656cab3cb5 6a28095be34756f2 64f24687bf37de06
2822309cd1d292f8 6fa698c972372f09 771e97d3a868cda0 dc421e8a00000001
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17258)
Apparently using OPENSSL_cleanse() confuses the fuzzer so it
makes the buffer to appear uninitialized. And memset can be
safely used here and it is also potentially faster.
Fixes#17237
Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
(Merged from https://github.com/openssl/openssl/pull/17240)
It uses AVX512_IFMA + AVX512_VL (with 256-bit wide registers) ISA to
keep lower power license.
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/14908)
Negative return value indicates an error so we bail out.
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Kurt Roeckx <kurt@roeckx.be>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/16975)
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/16918)
This change adds optional support for
- Armv8.3-A Pointer Authentication (PAuth) and
- Armv8.5-A Branch Target Identification (BTI)
features to the perl scripts.
Both features can be enabled with additional compiler flags.
Unless any of these are enabled explicitly there is no code change at
all.
The extensions are briefly described below. Please read the appropriate
chapters of the Arm Architecture Reference Manual for the complete
specification.
Scope
-----
This change only affects generated assembly code.
Armv8.3-A Pointer Authentication
--------------------------------
Pointer Authentication extension supports the authentication of the
contents of registers before they are used for indirect branching
or load.
PAuth provides a probabilistic method to detect corruption of register
values. PAuth signing instructions generate a Pointer Authentication
Code (PAC) based on the value of a register, a seed and a key.
The generated PAC is inserted into the original value in the register.
A PAuth authentication instruction recomputes the PAC, and if it matches
the PAC in the register, restores its original value. In case of a
mismatch, an architecturally unmapped address is generated instead.
With PAuth, mitigation against ROP (Return-oriented Programming) attacks
can be implemented. This is achieved by signing the contents of the
link-register (LR) before it is pushed to stack. Once LR is popped,
it is authenticated. This way a stack corruption which overwrites the
LR on the stack is detectable.
The PAuth extension adds several new instructions, some of which are not
recognized by older hardware. To support a single codebase for both pre
Armv8.3-A targets and newer ones, only NOP-space instructions are added
by this patch. These instructions are treated as NOPs on hardware
which does not support Armv8.3-A. Furthermore, this patch only considers
cases where LR is saved to the stack and then restored before branching
to its content. There are cases in the code where LR is pushed to stack
but it is not used later. We do not address these cases as they are not
affected by PAuth.
There are two keys available to sign an instruction address: A and B.
PACIASP and PACIBSP only differ in the used keys: A and B, respectively.
The keys are typically managed by the operating system.
To enable generating code for PAuth compile with
-mbranch-protection=<mode>:
- standard or pac-ret: add PACIASP and AUTIASP, also enables BTI
(read below)
- pac-ret+b-key: add PACIBSP and AUTIBSP
Armv8.5-A Branch Target Identification
--------------------------------------
Branch Target Identification features some new instructions which
protect the execution of instructions on guarded pages which are not
intended branch targets.
If Armv8.5-A is supported by the hardware, execution of an instruction
changes the value of PSTATE.BTYPE field. If an indirect branch
lands on a guarded page the target instruction must be one of the
BTI <jc> flavors, or in case of a direct call or jump it can be any
other instruction. If the target instruction is not compatible with the
value of PSTATE.BTYPE a Branch Target Exception is generated.
In short, indirect jumps are compatible with BTI <j> and <jc> while
indirect calls are compatible with BTI <c> and <jc>. Please refer to the
specification for the details.
Armv8.3-A PACIASP and PACIBSP are implicit branch target
identification instructions which are equivalent with BTI c or BTI jc
depending on system register configuration.
BTI is used to mitigate JOP (Jump-oriented Programming) attacks by
limiting the set of instructions which can be jumped to.
BTI requires active linker support to mark the pages with BTI-enabled
code as guarded. For ELF64 files BTI compatibility is recorded in the
.note.gnu.property section. For a shared object or static binary it is
required that all linked units support BTI. This means that even a
single assembly file without the required note section turns-off BTI
for the whole binary or shared object.
The new BTI instructions are treated as NOPs on hardware which does
not support Armv8.5-A or on pages which are not guarded.
To insert this new and optional instruction compile with
-mbranch-protection=standard (also enables PAuth) or +bti.
When targeting a guarded page from a non-guarded page, weaker
compatibility restrictions apply to maintain compatibility between
legacy and new code. For detailed rules please refer to the Arm ARM.
Compiler support
----------------
Compiler support requires understanding '-mbranch-protection=<mode>'
and emitting the appropriate feature macros (__ARM_FEATURE_BTI_DEFAULT
and __ARM_FEATURE_PAC_DEFAULT). The current state is the following:
-------------------------------------------------------
| Compiler | -mbranch-protection | Feature macros |
+----------+---------------------+--------------------+
| clang | 9.0.0 | 11.0.0 |
+----------+---------------------+--------------------+
| gcc | 9 | expected in 10.1+ |
-------------------------------------------------------
Available Platforms
------------------
Arm Fast Model and QEMU support both extensions.
https://developer.arm.com/tools-and-software/simulation-models/fast-modelshttps://www.qemu.org/
Implementation Notes
--------------------
This change adds BTI landing pads even to assembly functions which are
likely to be directly called only. In these cases, landing pads might
be superfluous depending on what code the linker generates.
Code size and performance impact for these cases would be negligible.
Interaction with C code
-----------------------
Pointer Authentication is a per-frame protection while Branch Target
Identification can be turned on and off only for all code pages of a
whole shared object or static binary. Because of these properties if
C/C++ code is compiled without any of the above features but assembly
files support any of them unconditionally there is no incompatibility
between the two.
Useful Links
------------
To fully understand the details of both PAuth and BTI it is advised to
read the related chapters of the Arm Architecture Reference Manual
(Arm ARM):
https://developer.arm.com/documentation/ddi0487/latest/
Additional materials:
"Providing protection for complex software"
https://developer.arm.com/architectures/learn-the-architecture/providing-protection-for-complex-software
Arm Compiler Reference Guide Version 6.14: -mbranch-protection
https://developer.arm.com/documentation/101754/0614/armclang-Reference/armclang-Command-line-Options/-mbranch-protection?lang=en
Arm C Language Extensions (ACLE)
https://developer.arm.com/docs/101028/latest
Addional Notes
--------------
This patch is a copy of the work done by Tamas Petz in boringssl. It
contains the changes from the following commits:
aarch64: support BTI and pointer authentication in assembly
Change-Id: I4335f92e2ccc8e209c7d68a0a79f1acdf3aeb791
URL: https://boringssl-review.googlesource.com/c/boringssl/+/42084
aarch64: Improve conditional compilation
Change-Id: I14902a64e5f403c2b6a117bc9f5fb1a4f4611ebf
URL: https://boringssl-review.googlesource.com/c/boringssl/+/43524
aarch64: Fix name of gnu property note section
Change-Id: I6c432d1c852129e9c273f6469a8b60e3983671ec
URL: https://boringssl-review.googlesource.com/c/boringssl/+/44024
Change-Id: I2d95ebc5e4aeb5610d3b226f9754ee80cf74a9af
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/16674)
There is a problem that appears when calling BN_div(a, c, a, b) with negative b.
In this case, the sign of the remainder c is incorrect. The problem only
occurs if the dividend and the quotient are the same BIGNUM.
Fixes#15982
Reviewed-by: Nicola Tuveri <nic.tuv@gmail.com>
(Merged from https://github.com/openssl/openssl/pull/15991)
This code is currently unconditional even though build.info has:
$BNASM_ppc64=$BNASM_ppc32 ppc64-mont-fixed.s
This causes a build failure on 32-bit systems.
Fixes#15923
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/15971)
This requires the text address.
Fixes#15923
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/15971)
Ancient toolchains fail the build because they don't like the hints,
newer ISAs recommend not using the hints and relying on dynamic branch
prediction.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/15971)
Fixes#15963
INSTALL.md uses these exact options as an example so it should work.
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/15965)
mtvsrd/mfvsrd are ISA >= 2.07 only, so this won't work for older
CPUs.
It would be possible to use this scheme only in the ISA >= 3.0
implementation. However, in the future it may be possible for newer
ISAs to allow CPU implementations without a vector unit, so don't
bother. The performance improvement versus using the stack was small
anyway.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/15798)
No need to save/restore because it is volatile.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/15798)
This is done in other versions due to the possibility of an early
return. However, there is no early return here.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/15798)
64-bit alignment at the beginning of functions, 32-bit alignment for
loop targets.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/15798)
This works on Linux but breaks the build on AIX.
Fixes#15748
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/15798)
Remove unused -DCONF_DEBUG and -DBN_CTX_DEBUG.
Rename REF_PRINT to REF_DEBUG for consistency, and add a new
tracing category and use it for printing reference counts.
Rename -DDEBUG_UNUSED to -DUNUSED_RESULT_DEBUG
Fix BN_DEBUG_RAND so it compiles and, when set, force DEBUG_RAND to
be set also.
Rename engine_debug_ref to be ENGINE_REF_PRINT also for consistency.
Fixes#15357
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/15353)
The symbols renamed are:
RSAZ_amm52x20_x1_256
RSAZ_amm52x20_x2_256
rsaz_avx512ifma_eligible
RSAZ_mod_exp_avx512_x2
Additionally, RSAZ_exp52x20_x2_256 was made static
Reviewed-by: Shane Lontis <shane.lontis@oracle.com>
(Merged from https://github.com/openssl/openssl/pull/15445)
The new names are ossl_err_load_xxx_strings.
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/15446)
The non-destructive substitution syntax (s///r), was introduced in perl
5.14. We need to support 5.10 and above.
Fixes#15378
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/15379)
Overall improvement for p384 of ~18% on Power 9, compared to existing
Power assembling code. See comment in code for more details.
Multiple unrolled versions could be generated for values other than
6. However, for TLS 1.3 the only other ECC algorithms that might use
Montgomery Multiplication are p256 and p521, but these have custom
algorithms that don't use Montgomery Multiplication. Non-ECC
algorithms are likely to use larger key lengths that won't fit into
the n <= 10 length limitation of this code.
Signed-off-by: Amitay Isaacs <amitay@ozlabs.org>
Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/15175)
libimplementations.a was a nice idea, but had a few flaws:
1. The idea to have common code in libimplementations.a and FIPS
sensitive helper functions in libfips.a / libnonfips.a didn't
catch on, and we saw full implementation ending up in them instead
and not appearing in libimplementations.a at all.
2. Because more or less ALL algorithm implementations were included
in libimplementations.a (the idea being that the appropriate
objects from it would be selected automatically by the linker when
building the shared libraries), it's very hard to find only the
implementation source that should go into the FIPS module, with
the result that the FIPS checksum mechanism include source files
that it shouldn't
To mitigate, we drop libimplementations.a, but retain the idea of
collecting implementations in static libraries. With that, we not
have:
libfips.a
Includes all implementations that should become part of the FIPS
provider.
liblegacy.a
Includes all implementations that should become part of the legacy
provider.
libdefault.a
Includes all implementations that should become part of the
default and base providers.
With this, libnonfips.a becomes irrelevant and is dropped.
libcommon.a is retained to include common provider code that can be
used uniformly by all providers.
Fixes#15157
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/15171)
clean a few style nits.
Reviewed-by: Shane Lontis <shane.lontis@oracle.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/14806)
Signed-off-by: Amitay Isaacs <amitay@ozlabs.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/14784)
It turns out that some CPUID code requires the presence of some BN
assembler code, so we make sure it's included in the same manner as
the CPUID code itself.
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/14755)
This is used for generating a more-correct copyright statement
for the "build_generated" targets.
Fixes: #13765
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/13791)
The reason is that clang-6 does not enable proper -march flags by
default for assembly modules (rsaz-avx512.pl requires avx512ifma, avx512dq,
avx512vl, avx512f). This is not true for newer clang versions - clang-7 and
further work ok.
For older clang versions users who want to get optimization from this
file, we have a note in the OPENSSL_ia32cap.pod with the workaround that
proposes having a wrapper that forces using external assembler.
Fixes#14668: clang-6.0.0 build broken
Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/14671)
Fixes#14660: Windows linking error
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/14665)
with AVX512_IFMA + AVX512_VL instructions, primarily for RSA CRT private key
operations. It uses 256-bit registers to avoid CPU frequency scaling issues.
The performance speedup for RSA2k signature on ICL is ~2x.
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/13750)