Use rvv and zvbb extensions for CHACHA20 cipher.
Signed-off-by: Jerry Shih <jerry.shih@sifive.com>
Signed-off-by: Phoebe Chen <phoebe.chen@sifive.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21923)
This assembly implementation for ChaCha20 includes three code paths:
scalar path, 128-bit LSX path and 256-bit LASX path. We prefer the
LASX path or LSX path if the hardware and system support these
extensions.
There are 32 vector registers avaialable in the LSX and LASX
extensions. So, we can load the 16 initial states and the 16
intermediate states of ChaCha into the 32 vector registers for
calculating in the implementation. The test results on the 3A5000
and 3A6000 show that this assembly implementation significantly
improves the performance of ChaCha20 on LoongArch based machines.
The detailed test results are as following.
Test with:
$ openssl speed -evp chacha20
3A5000
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
C code 178484.53k 282789.93k 311793.70k 322234.99k 324405.93k 324659.88k
assembly code 223152.28k 407863.65k 989520.55k 2049192.96k 2127248.70k 2131749.55k
+25% +44% +217% +536% +556% +557%
3A6000
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
C code 214945.33k 310041.75k 340724.22k 349949.27k 352925.01k 353140.74k
assembly code 299151.34k 492766.34k 2070166.02k 4300909.91k 4473978.88k 4499084.63k
+39% +59% +508% +1129% +1168% +1174%
Signed-off-by: Min Zhou <zhoumin@loongson.cn>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21998)
This reverts commit 6009997abd.
The .s extension is incorrect as the assembler files contain
preprocessor directives.
Fixes#18259
Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/18263)
Rename x86-32 assembly files from .s to .S. While processing the .S file
gcc will use the pre-processor whic will evaluate macros and ifdef. This
is turn will be used to enable the endbr32 opcode based on the __CET__
define.
Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/18353)
This patch accelerates chacha20 on aarch64 when Scalable Vector Extension
(SVE) is supported by CPU. Tested on modern micro-architecture with
256-bit SVE, it has the potential to improve performance up to 20%
The solution takes a hybrid approach. SVE will handle multi-blocks that fit
the SVE vector length, with Neon/Scalar to process any tail data
Test result:
With SVE
type 1024 bytes 8192 bytes 16384 bytes
ChaCha20 1596208.13k 1650010.79k 1653151.06k
Without SVE (by Neon/Scalar)
type 1024 bytes 8192 bytes 16384 bytes
chacha20 1355487.91k 1372678.83k 1372662.44k
The assembly code has been reviewed internally by
ARM engineer Fangming.Fang@arm.com
Signed-off-by: Daniel Hu <Daniel.Hu@arm.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17916)
Since the arguments are now generated in the build file templates,
they should be removed from the build.info files.
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/9884)
It was an ugly hack to avoid certain problems that are no more.
Also added GENERATE lines for perlasm scripts that didn't have that
explicitly.
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/8125)
Signed-off-by: Patrick Steuer <patrick.steuer@de.ibm.com>
Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/6919)
The make variables LIB_CFLAGS, DSO_CFLAGS and so on were used in
addition to CFLAGS and so on. This works without problem on Unix and
Windows, where options with different purposes (such as -D and -I) can
appear anywhere on the command line and get accumulated as they come.
This is not necessarely so on VMS. For example, macros must all be
collected and given through one /DEFINE, and the same goes for
inclusion directories (/INCLUDE).
So, to harmonize all platforms, we repurpose make variables starting
with LIB_, DSO_ and BIN_ to be all encompassing variables that
collects the corresponding values from CFLAGS, CPPFLAGS, DEFINES,
INCLUDES and so on together with possible config target values
specific for libraries DSOs and programs, and use them instead of the
general ones everywhere.
This will, for example, allow VMS to use the exact same generators for
generated files that go through cpp as all other platforms, something
that has been impossible to do safely before now.
Reviewed-by: Andy Polyakov <appro@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/5357)
C preprocessor flags get separated from C flags, which has the
advantage that we don't get loads of macro definitions and inclusion
directory specs when linking shared libraries, DSOs and programs.
This is a step to add support for "make variables" when configuring.
Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/5177)
The reason to do so is that some of the generators detect PIC flags
like -fPIC and -KPIC, and those are normally delivered in LD_CFLAGS.
Reviewed-by: Rich Salz <rsalz@openssl.org>
This gets rid of the BEGINRAW..ENDRAW sections in crypto/chacha/build.info.
This also moves the assembler generating perl scripts to take the
output file name as last command line argument, where necessary.
Reviewed-by: Andy Polyakov <appro@openssl.org>
It seems that on some platforms, the perlasm scripts call the C
compiler for certain checks. These scripts need the environment
variable CC to have the C compiler command.
Reviewed-by: Rich Salz <rsalz@openssl.org>
Now that we have the foundation for the "unified" build scheme in
place, we add build.info files. They have been generated from the
Makefiles in the same directories. Things that are platform specific
will appear in later commits.
Reviewed-by: Andy Polyakov <appro@openssl.org>