On newer machines the SHA3/SHAKE performance of CPACF instructions KIMD and KLMD
can be enhanced by using additional modifier bits. This allows the application
to omit initializing the ICV, but also affects the internal processing of the
instructions. Performance is mostly gained when processing short messages.
The new CPACF feature is backwards compatible with older machines, i.e. the new
modifier bits are ignored on older machines. However, to save the ICV
initialization, the application must detect the MSA level and omit the ICV
initialization only if this feature is supported.
Signed-off-by: Joerg Schmidbauer <jschmidb@de.ibm.com>
Reviewed-by: Paul Dale <ppzgs1@gmail.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/25235)
In OpenSSL, it's actually OSSL_NELEM() in "internal/nelem.h".
Found by running the checkpatch.pl Linux script to enforce coding style.
Reviewed-by: Neil Horman <nhorman@openssl.org>
Reviewed-by: David von Oheimb <david.von.oheimb@siemens.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/22097)
Found by running the checkpatch.pl Linux script to enforce coding style.
Reviewed-by: Neil Horman <nhorman@openssl.org>
Reviewed-by: David von Oheimb <david.von.oheimb@siemens.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/22097)
It will work only if OSSL_DIGEST_PARAM_XOFLEN is set.
Reviewed-by: Paul Dale <ppzgs1@gmail.com>
Reviewed-by: Shane Lontis <shane.lontis@oracle.com>
(Merged from https://github.com/openssl/openssl/pull/24105)
Added commas for sentence openers in Implementation Notes. Fixed
spelling of "reasons" section of the notes.
CLA: trivial
Co-authored-by: Tom Cosgrove <tom.cosgrove@arm.com>
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/24241)
In order to get asm code running on OpenBSD we must place
all constants into .rodata sections.
davidben@ also pointed out we need to adjust `x86_64-xlate.pl` perlasm
script to adjust read-olny sections for various flavors (OSes). Those
changes were cherry-picked from boringssl.
closes#23312
Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/23997)
The following files referred to ../liblegacy.a when they should have
referred to ../../liblegacy.a. This cause the creation of a mysterious
directory 'crypto/providers', and because of an increased strictness
with regards to where directories are created, configuration failure
on some platforms.
Fixes#23436
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Tom Cosgrove <tom.cosgrove@arm.com>
(Merged from https://github.com/openssl/openssl/pull/23452)
(cherry picked from commit 667b45454a)
Amend the assembler so it uses only 32bit value.
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Shane Lontis <shane.lontis@oracle.com>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/22750)
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Shane Lontis <shane.lontis@oracle.com>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/22750)
For armv4 - Only the first 4 parameters can be passed via registers
(r0..r3).
As all of the general registers are already used,
r11 was used to store the 'next' param.
R11 is now pushed/poped on entry/exit.
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/22719)
Fix the conditional on the 'next' parameter passed into SHA3_squeeze.
Reported-by: David Benjamin <davidben@davidben.net>
Signed-off-by: Rohan McLure <rmclure@linux.ibm.com>
Reviewed-by: Shane Lontis <shane.lontis@oracle.com>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/22722)
The low level SHA3_Squeeze() function needed to change slightly so
that it can handle multiple squeezes. Support this on s390x
architecture as well.
Signed-off-by: Holger Dengler <dengler@linux.ibm.com>
Reviewed-by: Shane Lontis <shane.lontis@oracle.com>
Reviewed-by: Todd Short <todd.short@me.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/22221)
Fixes#7894
This allows SHAKE to squeeze multiple times with different output sizes.
The existing EVP_DigestFinalXOF() API has been left as a one shot
operation. A similar interface is used by another toolkit.
The low level SHA3_Squeeze() function needed to change slightly so
that it can handle multiple squeezes. This involves changing the
assembler code so that it passes a boolean to indicate whether
the Keccak function should be called on entry.
At the provider level, the squeeze is buffered, so that it only requests
a multiple of the blocksize when SHA3_Squeeze() is called. On the first
call the value is zero, on subsequent calls the value passed is 1.
This PR is derived from the excellent work done by @nmathewson in
https://github.com/openssl/openssl/pull/7921
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21511)
This patch supports SHA-512, SHA-512/224, SHA-512/256 on platforms with
vlen greater than 128,
Signed-off-by: Phoebe Chen <phoebe.chen@sifive.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21923)
Keep SHA-256 constant values in registers to save the loading time.
Move the constant loading for sha256 into a separate subroutine.
By creating a dedicated sub routine for loading sha256 constants, the
code can be made more modular and easier to modify in the future.
Relaxing the SHA256 constraint, zvknhb also supports SHA256.
Simplify the H and mask initialization flows.
Signed-off-by: Phoebe Chen <phoebe.chen@sifive.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21923)
The upcoming RISC-V vector crypto extensions feature
a Zvknhb extension, that provides sha512-specific istructions.
This patch provides an implementation that utilizes this
extension if available.
Tested on QEMU and no regressions observed.
Signed-off-by: Charalampos Mitrodimas <charalampos.mitrodimas@vrull.eu>
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21923)
Currently, architectures have to decide if they want the C code or an
arch-specific implementation. Let's add a macro, that allows to keep the C
code even if SHA512_ASM is defined (but rename it from sha512_block_data_order
to sha512_block_data_order_c). The macro INCLUDE_C_SHA512 can be used by
architectures, that want the C code as fallback code.
Signed-off-by: Charalampos Mitrodimas <charalampos.mitrodimas@vrull.eu>
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21923)
The upcoming RISC-V vector crypto extensions feature
a Zvknha extension, that provides sha256-specific instructions.
This patch provides an implementation that utilizes this
extension if available.
Tested on QEMU and no regressions observed.
Signed-off-by: Charalampos Mitrodimas <charalampos.mitrodimas@vrull.eu>
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21923)
Currently, architectures have to decide if they want the C code or an
arch-specific implementation. Let's add a macro, that allows to keep the C
code even if SHA256_ASM is defined (but rename it from sha256_block_data_order
to sha256_block_data_order_c). The macro INCLUDE_C_SHA256 can be used by
architectures, that want the C code as fallback code.
Signed-off-by: Charalampos Mitrodimas <charalampos.mitrodimas@vrull.eu>
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21923)
void f() should probably be void f(void)
Found by running the checkpatch.pl Linux script to enforce coding style.
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21468)
rhotates tables are placed to .text section which confuses tools such as BOLT.
Move them to rodata to unbreak and avoid polluting icache/iTLB with data.
CLA: trivial
Reviewed-by: Hugo Landau <hlandau@openssl.org>
Reviewed-by: Paul Yang <kaishen.yy@antfin.com>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21440)
This is defined in NIST SP 800-208 as the truncation to 192 bits of
SHA256. Unlike other truncated hashes in the SHA2 suite, this variant
doesn't have a different initial state, it is just a pure truncation
of the output.
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21180)
Change-Id: Ia94e528a2d55934435de6a2949784c52eb38d82f
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/20621)
The use case is that uppercase .ASM extension may be used on some platforms,
and we were only testing for the lowercase extension.
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Dmitry Belyavskiy <beldmit@gmail.com>
(Merged from https://github.com/openssl/openssl/pull/19604)
Flags for ASM implementations of SHA, SHAKE, and KECCAK were only passed to
the FIPS provider and not to the default or legacy provider. This left some
potential for optimization. Pass the correct flags also to these providers.
Signed-off-by: Juergen Christ <jchrist@linux.ibm.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Patrick Steuer <patrick.steuer@de.ibm.com>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/18747)
Rename x86-32 assembly files from .s to .S. While processing the .S file
gcc will use the pre-processor whic will evaluate macros and ifdef. This
is turn will be used to enable the endbr32 opcode based on the __CET__
define.
Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/18353)
Reviewed-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Signed-off-by: Henry Brausen <henry.brausen@vrull.eu>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/18275)
Update the comment "../md32_common.h" to "crypto/md32_common.h".
CLA: trivial
Signed-off-by: Weiguo Li <liwg06@foxmail.com>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Matthias St. Pierre <Matthias.St.Pierre@ncp-e.com>
(Merged from https://github.com/openssl/openssl/pull/17670)
Add missing AARCH64_VALID_CALL_TARGET to armv8_rng_probe(). Also add
these to the functions defined by gen_random(), and note that this Perl
sub prints the assembler out directly, not going via the $code xlate
mechanism (and therefore coming before the include of arm_arch.h). So
fix this too.
In KeccakF1600_int, AARCH64_SIGN_LINK_REGISTER functions as
AARCH64_VALID_CALL_TARGET on BTI-only builds, so it needs to come before
the 'adr' line.
Change-Id: If241efe71591c88253a3e36647ced00300c3c1a3
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17653)
We currently load data byte by byte in order to byteswap it on big
endian. On little endian we can just do 8 byte loads.
A SHAKE128 benchmark runs 10% faster on POWER9 with this patch applied.
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/8455)
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/16918)
This change adds optional support for
- Armv8.3-A Pointer Authentication (PAuth) and
- Armv8.5-A Branch Target Identification (BTI)
features to the perl scripts.
Both features can be enabled with additional compiler flags.
Unless any of these are enabled explicitly there is no code change at
all.
The extensions are briefly described below. Please read the appropriate
chapters of the Arm Architecture Reference Manual for the complete
specification.
Scope
-----
This change only affects generated assembly code.
Armv8.3-A Pointer Authentication
--------------------------------
Pointer Authentication extension supports the authentication of the
contents of registers before they are used for indirect branching
or load.
PAuth provides a probabilistic method to detect corruption of register
values. PAuth signing instructions generate a Pointer Authentication
Code (PAC) based on the value of a register, a seed and a key.
The generated PAC is inserted into the original value in the register.
A PAuth authentication instruction recomputes the PAC, and if it matches
the PAC in the register, restores its original value. In case of a
mismatch, an architecturally unmapped address is generated instead.
With PAuth, mitigation against ROP (Return-oriented Programming) attacks
can be implemented. This is achieved by signing the contents of the
link-register (LR) before it is pushed to stack. Once LR is popped,
it is authenticated. This way a stack corruption which overwrites the
LR on the stack is detectable.
The PAuth extension adds several new instructions, some of which are not
recognized by older hardware. To support a single codebase for both pre
Armv8.3-A targets and newer ones, only NOP-space instructions are added
by this patch. These instructions are treated as NOPs on hardware
which does not support Armv8.3-A. Furthermore, this patch only considers
cases where LR is saved to the stack and then restored before branching
to its content. There are cases in the code where LR is pushed to stack
but it is not used later. We do not address these cases as they are not
affected by PAuth.
There are two keys available to sign an instruction address: A and B.
PACIASP and PACIBSP only differ in the used keys: A and B, respectively.
The keys are typically managed by the operating system.
To enable generating code for PAuth compile with
-mbranch-protection=<mode>:
- standard or pac-ret: add PACIASP and AUTIASP, also enables BTI
(read below)
- pac-ret+b-key: add PACIBSP and AUTIBSP
Armv8.5-A Branch Target Identification
--------------------------------------
Branch Target Identification features some new instructions which
protect the execution of instructions on guarded pages which are not
intended branch targets.
If Armv8.5-A is supported by the hardware, execution of an instruction
changes the value of PSTATE.BTYPE field. If an indirect branch
lands on a guarded page the target instruction must be one of the
BTI <jc> flavors, or in case of a direct call or jump it can be any
other instruction. If the target instruction is not compatible with the
value of PSTATE.BTYPE a Branch Target Exception is generated.
In short, indirect jumps are compatible with BTI <j> and <jc> while
indirect calls are compatible with BTI <c> and <jc>. Please refer to the
specification for the details.
Armv8.3-A PACIASP and PACIBSP are implicit branch target
identification instructions which are equivalent with BTI c or BTI jc
depending on system register configuration.
BTI is used to mitigate JOP (Jump-oriented Programming) attacks by
limiting the set of instructions which can be jumped to.
BTI requires active linker support to mark the pages with BTI-enabled
code as guarded. For ELF64 files BTI compatibility is recorded in the
.note.gnu.property section. For a shared object or static binary it is
required that all linked units support BTI. This means that even a
single assembly file without the required note section turns-off BTI
for the whole binary or shared object.
The new BTI instructions are treated as NOPs on hardware which does
not support Armv8.5-A or on pages which are not guarded.
To insert this new and optional instruction compile with
-mbranch-protection=standard (also enables PAuth) or +bti.
When targeting a guarded page from a non-guarded page, weaker
compatibility restrictions apply to maintain compatibility between
legacy and new code. For detailed rules please refer to the Arm ARM.
Compiler support
----------------
Compiler support requires understanding '-mbranch-protection=<mode>'
and emitting the appropriate feature macros (__ARM_FEATURE_BTI_DEFAULT
and __ARM_FEATURE_PAC_DEFAULT). The current state is the following:
-------------------------------------------------------
| Compiler | -mbranch-protection | Feature macros |
+----------+---------------------+--------------------+
| clang | 9.0.0 | 11.0.0 |
+----------+---------------------+--------------------+
| gcc | 9 | expected in 10.1+ |
-------------------------------------------------------
Available Platforms
------------------
Arm Fast Model and QEMU support both extensions.
https://developer.arm.com/tools-and-software/simulation-models/fast-modelshttps://www.qemu.org/
Implementation Notes
--------------------
This change adds BTI landing pads even to assembly functions which are
likely to be directly called only. In these cases, landing pads might
be superfluous depending on what code the linker generates.
Code size and performance impact for these cases would be negligible.
Interaction with C code
-----------------------
Pointer Authentication is a per-frame protection while Branch Target
Identification can be turned on and off only for all code pages of a
whole shared object or static binary. Because of these properties if
C/C++ code is compiled without any of the above features but assembly
files support any of them unconditionally there is no incompatibility
between the two.
Useful Links
------------
To fully understand the details of both PAuth and BTI it is advised to
read the related chapters of the Arm Architecture Reference Manual
(Arm ARM):
https://developer.arm.com/documentation/ddi0487/latest/
Additional materials:
"Providing protection for complex software"
https://developer.arm.com/architectures/learn-the-architecture/providing-protection-for-complex-software
Arm Compiler Reference Guide Version 6.14: -mbranch-protection
https://developer.arm.com/documentation/101754/0614/armclang-Reference/armclang-Command-line-Options/-mbranch-protection?lang=en
Arm C Language Extensions (ACLE)
https://developer.arm.com/docs/101028/latest
Addional Notes
--------------
This patch is a copy of the work done by Tamas Petz in boringssl. It
contains the changes from the following commits:
aarch64: support BTI and pointer authentication in assembly
Change-Id: I4335f92e2ccc8e209c7d68a0a79f1acdf3aeb791
URL: https://boringssl-review.googlesource.com/c/boringssl/+/42084
aarch64: Improve conditional compilation
Change-Id: I14902a64e5f403c2b6a117bc9f5fb1a4f4611ebf
URL: https://boringssl-review.googlesource.com/c/boringssl/+/43524
aarch64: Fix name of gnu property note section
Change-Id: I6c432d1c852129e9c273f6469a8b60e3983671ec
URL: https://boringssl-review.googlesource.com/c/boringssl/+/44024
Change-Id: I2d95ebc5e4aeb5610d3b226f9754ee80cf74a9af
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/16674)
Also add hints to SHA256_Init.pod and CHANGES.md how to replace SHA256() etc.
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/14741)
libimplementations.a was a nice idea, but had a few flaws:
1. The idea to have common code in libimplementations.a and FIPS
sensitive helper functions in libfips.a / libnonfips.a didn't
catch on, and we saw full implementation ending up in them instead
and not appearing in libimplementations.a at all.
2. Because more or less ALL algorithm implementations were included
in libimplementations.a (the idea being that the appropriate
objects from it would be selected automatically by the linker when
building the shared libraries), it's very hard to find only the
implementation source that should go into the FIPS module, with
the result that the FIPS checksum mechanism include source files
that it shouldn't
To mitigate, we drop libimplementations.a, but retain the idea of
collecting implementations in static libraries. With that, we not
have:
libfips.a
Includes all implementations that should become part of the FIPS
provider.
liblegacy.a
Includes all implementations that should become part of the legacy
provider.
libdefault.a
Includes all implementations that should become part of the
default and base providers.
With this, libnonfips.a becomes irrelevant and is dropped.
libcommon.a is retained to include common provider code that can be
used uniformly by all providers.
Fixes#15157
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/15171)
These are: keccak_kmac_init(), sha3_final(), sha3_init(), sha3_reset() and
sha3_update().
Reviewed-by: Tim Hudson <tjh@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/13417)