The .rodata section with precomputed constant `ecp_nistz256_precomputed` needs to be
terminated by .text, because the ecp_nistz256_precomputed' happens to be the
first section in the file. The lack of .text makes code to arrive into the same
.rodata section where ecp_nistz256_precomputed is found. The exception is raised
as soon as CPU attempts to execute the code from read only section.
Fixes#24184
Reviewed-by: Paul Dale <ppzgs1@gmail.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/24192)
Although we have a Zvkb version of Chacha20, the Zvkb from the RISC-V
Vector Cryptography Bit-manipulation extension was ratified in late 2023
and does not come to the RVA23 Profile. Many CPUs in 2024 currently do not
support Zvkb but may have Vector and Bit-manipulation, which are already in
the RVA22 Profile. This commit provides a vector-only implementation that
replaced the vror with vsll+vsrl+vor and can provide enough speed for
Chacha20 for new CPUs this year.
Signed-off-by: Yangyu Chen <cyy@cyyself.name>
Reviewed-by: Paul Dale <ppzgs1@gmail.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/24069)
Fixes#24070
Use scalar ALU for 1 chacha block with rvv ALU simultaneously.
The tail elements(non-multiple of block length) will be handled by
the scalar logic.
Use rvv path if the input length > chacha_block_size.
And we have about 1.2x improvement comparing with the original code.
Reviewed-by: Hongren Zheng <i@zenithal.me>
Reviewed-by: Paul Dale <ppzgs1@gmail.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/24097)
To accelerate the performance of the AES-XTS mode, in this patch, we
have the specialized multi-block implementation for AES-128-XTS and
AES-256-XTS.
Signed-off-by: Jerry Shih <jerry.shih@sifive.com>
Signed-off-by: Phoebe Chen <phoebe.chen@sifive.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21923)
c8ddeb7e64/doc/vector/riscv-crypto-vector-zvkb.adoc
Create `RISCV_HAS_ZVKB()` macro.
Use zvkb for SM4 instead of zvbb.
Use zvkb for ghash instead of zvbb.
We could just use the zvbb's subset `zvkb` for flexibility.
Signed-off-by: Jerry Shih <jerry.shih@sifive.com>
Signed-off-by: Phoebe Chen <phoebe.chen@sifive.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21923)
Added helper functions and opcode encoding functions
in riscv.pm perl module to avoid pointless code duplication.
Signed-off-by: Phoebe Chen <phoebe.chen@sifive.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21923)
The upcoming RISC-V vector crypto extensions feature
a Zvksh extension, that provides SM3-specific istructions.
This patch provides an implementation that utilizes this
extension if available.
Tested on QEMU and no regressions observed.
Signed-off-by: Charalampos Mitrodimas <charalampos.mitrodimas@vrull.eu>
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21923)
The upcoming RISC-V vector crypto extensions feature
a Zvksed extension, that provides SM4-specific instructions.
This patch provides an implementation that utilizes this
extension if available.
Tested on QEMU and no regressions observed.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21923)
The upcoming RISC-V vector crypto extensions feature
a Zvknhb extension, that provides sha512-specific istructions.
This patch provides an implementation that utilizes this
extension if available.
Tested on QEMU and no regressions observed.
Signed-off-by: Charalampos Mitrodimas <charalampos.mitrodimas@vrull.eu>
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21923)
The upcoming RISC-V vector crypto extensions feature
a Zvknha extension, that provides sha256-specific instructions.
This patch provides an implementation that utilizes this
extension if available.
Tested on QEMU and no regressions observed.
Signed-off-by: Charalampos Mitrodimas <charalampos.mitrodimas@vrull.eu>
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21923)
The upcoming RISC-V vector crypto extensions provide
the Zvkned extension, that provides a AES-specific instructions.
This patch provides an implementation that utilizes this
extension if available.
Tested on QEMU and no regressions observed.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21923)
The upcoming RISC-V vector crypto extensions feature
a Zvkg extension, that provides a vghmac.vv instruction.
This patch provides an implementation that utilizes this
extension if available.
Tested on QEMU and no regressions observed.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21923)
The RISC-V vector crypto extensions features a Zvbc extension
that provides a carryless multiplication ('vclmul.vv') instruction.
This patch provides an implementation that utilizes this
extension if available.
Tested on QEMU and no regressions observed.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21923)
When $label == "0", $label is not truthy, so `if ($label)` thinks there isn't
a label. Correct this by looking at the result of the s/// command.
Verified that there are no changes in the .S files created during a normal
build, and that the "0:" labels appear in the translation given in the error
report (and they are the only difference in the before and after output).
Fixes#21647
Change-Id: I5f2440100c62360bf4bdb7c7ece8dddd32553c79
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Shane Lontis <shane.lontis@oracle.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21653)
The original text for the Apache + BSD dual licensing for riscv GCM and AES
perlasm was taken from other openSSL users like crypto/crypto/LPdir_unix.c .
Though Eric pointed out that the dual-licensing text could be read in a
way negating the second license [0] and suggested to clarify the text
even more.
So do this here for all of the GCM, AES and shared riscv.pm .
We already had the agreement of all involved developers for the actual
dual licensing in [0] and [1], so this is only a better clarification
for this.
[0] https://github.com/openssl/openssl/pull/20649#issuecomment-1589558790
[1] https://github.com/openssl/openssl/pull/21018
Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21357)
To allow re-use of the already reviewed openSSL crypto code for RISC-V in
other projects - like the Linux kernel, add a second license (2-clause BSD)
to the recently added GCM ghash functions.
Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Todd Short <todd.short@me.com>
(Merged from https://github.com/openssl/openssl/pull/20649)
The existing GCM calculation provides some potential
for further optimizations. Let's use the demo code
from the RISC-V cryptography extension groups
(https://github.com/riscv/riscv-crypto), which represents
the extension architect's intended use of the clmul instruction.
The GCM calculation depends on bit and byte reversal.
Therefore, we use the corresponding instructions to do that
(if available at run-time).
The resulting computation becomes quite compact and passes
all tests.
Note, that a side-effect of this change is a reduced register
usage in .gmult(), which opens the door for an efficient .ghash()
implementation.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/20078)
A recent commit introduced a Perl module for common code.
This patch changes the GCM code to use this module, removes duplicated code,
and moves the instruction encoding functions into the module.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/20078)
On systems where Devel::StackTrace is available, we can use this module
to create more usable error messages. Further, don't print error
messages in case of official register aliases, but simply accept them.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/20078)
Move helper functions and instruction encoding functions
into a riscv.pm Perl module to avoid pointless code duplication.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/20078)
Unlike gcc, the clang assembler has issues with the maximum value of the literal
in the `ldr REG, #VALUE` pseudo-instruction (where the assembler places the
value into a literal pool and generates a PC-relative load from that pool) when
used with Neon registers.
Specifically, while dN refers to 64-bit Neon registers, and qN refers to 128-bit
Neon registers, clang assembly only supports a maximum of 32-bit loads to
either with this instruction.
Therefore restrict accordingly to avoid breakage when building with clang.
clang appears to support the correct maximums with the scalar registers xN etc.
This will prevent the kind of breakage we saw when #19914 was merged (which has
since been fixed by #20202) - assembly authors will need to manually apply the
literal load, as is done in #20202.
None of the Arm assembler code uses this pseudo-instruction anyway, as it
doesn't seem to avoid duplication of constants.
Change-Id: If52f6ce22c10feb1cc334d996ff71b1efed3218e
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Todd Short <todd.short@me.com>
(Merged from https://github.com/openssl/openssl/pull/20222)
an assembler for Windows on Arm builds and also clang-cl as the compiler
as well. Make appropriate changes to armcap source and peralsm scripts.
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/19523)
The VIA Nehemiah CPU is a x86-32 CPU without SSE2 support. It does not
support multi byte nops and considers the endb32 opcode as an invalid
instruction.
Add an ifdef around the endbr32 opcode on x86-32.
Fixes: #18334
Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/18353)
Decoration prefix for some assembler labels in aes-gcm-avx512.pl was
fixed for mingw64 build.
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17868)
vsr2vr1() fails on OS X because the main loop doesn't strip the
non-numeric register prefixes for OS X.
Strip any non-numeric prefix (likely just "v") from registers before
doing numeric calculation, then put the prefix back on the result.
Fixes: #16995
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17026)
This commit adds an architecture named aix64-gcc-as which can generate
assembler source code compatible with AIX assembler (as) instead of the
GNU Assembler (gas). This architecture name is then used in a callback
for the .p2align directive which is not available in AIX as.
The motivation for this addition came out of an issue we ran into when
working on upgrading OpenSSL in Node.js. We ran into the following
compilation error on one of the CI machines that uses AIX:
05:39:05 Assembler:
05:39:05 crypto/bn/ppc64-mont-fixed.s: line 4: Error In Syntax
This machine is using AIX Version 7.2 and does not have gas installed
and the .p2align directive is causing this error. After asking around if
it would be possible to install GAS on this machine I learned that AIX
GNU utils are not maintained as well as the native AIX ones and we
(Red Hat/IBM) have run into issues with the GNU utils in the past and if
possible it would be preferable to be able to use the AIX native
assembler.
Refs: https://github.com/nodejs/node/pull/38512
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/15638)
Power has 2 numbering systems for vector registers:
* VR: Vector Registers are numbered from 0 to 31
* VSR: Vector-Scalar registers are numbers from 32 to 63
These refer to the same registers. Some instructions use VR numbering
for their operands, while others use VSR numbering.
When using Perl to provide a meaningful name for a register it makes
sense to use the same variable for both VR and VSR instructions. This
makes the code more readable.
However, providing a VSR number (i.e. >=32) to an instruction that
expects a VR number will cause an assembler error.
So, for instructions that require VR numbering, map VSR numbers
(i.e. >=32) to VR numbers. This also allows existing code that uses
VR numbering to remain unchanged.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/15401)
CLA: trivial
Reviewed-by: Matthias St. Pierre <Matthias.St.Pierre@ncp-e.com>
Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/12161)
A small number of files contain references to the "OpenSSL license"
which has been deprecated and replaced by the "Apache License 2.0".
Amend the occurences.
Fixes#11649
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/11663)
It turns out that GNU as and Solaris as don't have compatible ideas on
the .section syntax, so we need to check if we're using GNU as or
another assembler and adapt this .section syntax accordingly.
Fixes#11132
Reviewed-by: Tomas Mraz <tmraz@fedoraproject.org>
(Merged from https://github.com/openssl/openssl/pull/11191)
Since pointer in x32 is 4 bytes, add x86_64-support.pl to define
pointer_size and pointer_register based on flavour to support
stuctures like:
struct { void *ptr; int blocks; }
This fixes 90-test_sslapi.t on x32. Verified with
$ ./Configure shared linux-x86_64
$ make
$ make test
and
$ ./Configure shared linux-x32
$ make
$ make test
Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Tomas Mraz <tmraz@fedoraproject.org>
(Merged from https://github.com/openssl/openssl/pull/10988)
In https://github.com/openssl/openssl/pull/10883, I'd meant to exclude
the perlasm drivers since they aren't opening pipes and do not
particularly need it, but I only noticed x86_64-xlate.pl, so
arm-xlate.pl and ppc-xlate.pl got the change.
That seems to have been fine, so be consistent and also apply the change
to x86_64-xlate.pl. Checking for errors is generally a good idea.
Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: David Benjamin <davidben@google.com>
(Merged from https://github.com/openssl/openssl/pull/10930)
We should always generate .note.gnu.property section in x86_64 assembly
codes for ELF outputs to mark Intel CET support since all input files
must be marked with Intel CET support in order for linker to mark output
with Intel CET support. Also .note.gnu.property section in x32 should
be aligned to 4 bytes, not 8 bytes and .p2align should be used
consistently.
Verified with
$ CC="gcc -Wl,-z,cet-report=error" ./Configure shared linux-x86_64 -fcf-protection
$ make
$ make test
and
$ CC="gcc -mx32 -Wl,-z,cet-report=error" ./Configure shared linux-x32 -fcf-protection
$ make
$ make test # <<< 90-test_sslapi.t failed because 8-byte pointer size.
Fix#10896
Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Paul Dale <paul.dale@oracle.com>
Reviewed-by: Tomas Mraz <tmraz@fedoraproject.org>
(Merged from https://github.com/openssl/openssl/pull/10985)
We should always generate .note.gnu.property section in x86 assembly
codes for ELF outputs to mark Intel CET support since all input files
must be marked with Intel CET support in order for linker to mark output
with Intel CET support.
Verified with
$ CC="gcc -Wl,-z,cet-report=error" ./Configure shared linux-x86 -fcf-protection
$ make
$ make test
Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Paul Dale <paul.dale@oracle.com>
(Merged from https://github.com/openssl/openssl/pull/11044)
To support Intel CET, all indirect branch targets must start with
endbranch. Here is a patch to add endbranch to all function entries
in x86 assembly codes which are indirect branch targets as discovered
by running openssl testsuite on Intel CET machine and visual inspection.
Since x86 cbc.pl uses indirect branch with a jump table, we also need
to add endbranch to all jump targets.
Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Paul Dale <paul.dale@oracle.com>
(Merged from https://github.com/openssl/openssl/pull/10984)
If one of the perlasm xlate drivers crashes, OpenSSL's build will
currently swallow the error and silently truncate the output to however
far the driver got. This will hopefully fail to build, but better to
check such things.
Handle this by checking for errors when closing STDOUT (which is a pipe
to the xlate driver).
Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Tomas Mraz <tmraz@fedoraproject.org>
(Merged from https://github.com/openssl/openssl/pull/10883)
This appears to be emitted with gcc and clang with -fcf-protection
selected, so we should do the same.
We're trying to be smart, and only emit this when the 'endbranch'
pseudo-mnemonic has been used at least once.
This is inspired by and owes to work done by @hjl-tools (github)
Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
(Merged from https://github.com/openssl/openssl/pull/10875)