Commit Graph

26 Commits

Author SHA1 Message Date
Jerry Shih
fcf68127e2 riscv: Provide a vector implementation of CHACHA20 cipher.
Use rvv and zvbb extensions for CHACHA20 cipher.

Signed-off-by: Jerry Shih <jerry.shih@sifive.com>
Signed-off-by: Phoebe Chen <phoebe.chen@sifive.com>

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21923)
2023-10-26 15:55:50 +01:00
Min Zhou
9a41a3c6a4 LoongArch64 assembly pack: add ChaCha20 modules
This assembly implementation for ChaCha20 includes three code paths:
scalar path, 128-bit LSX path and 256-bit LASX path. We prefer the
LASX path or LSX path if the hardware and system support these
extensions.

There are 32 vector registers avaialable in the LSX and LASX
extensions. So, we can load the 16 initial states and the 16
intermediate states of ChaCha into the 32 vector registers for
calculating in the implementation. The test results on the 3A5000
and 3A6000 show that this assembly implementation significantly
improves the performance of ChaCha20 on LoongArch based machines.
The detailed test results are as following.

Test with:
$ openssl speed -evp chacha20

3A5000
type               16 bytes     64 bytes    256 bytes    1024 bytes    8192 bytes   16384 bytes
C code           178484.53k   282789.93k   311793.70k    322234.99k    324405.93k    324659.88k
assembly code    223152.28k   407863.65k   989520.55k   2049192.96k   2127248.70k   2131749.55k
                   +25%         +44%         +217%        +536%         +556%         +557%

3A6000
type               16 bytes     64 bytes     256 bytes    1024 bytes    8192 bytes   16384 bytes
C code           214945.33k   310041.75k    340724.22k    349949.27k    352925.01k    353140.74k
assembly code    299151.34k   492766.34k   2070166.02k   4300909.91k   4473978.88k   4499084.63k
                   +39%         +59%         +508%         +1129%        +1168%        +1174%

Signed-off-by: Min Zhou <zhoumin@loongson.cn>

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/21998)
2023-09-11 08:49:09 +10:00
Evan Miller
175645a1a6 Do not build P10-specific AES-GCM assembler on macOS
Reviewed-by: Tom Cosgrove <tom.cosgrove@arm.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/20543)
2023-03-22 14:26:26 +01:00
Tomas Mraz
abfc152126 Do not build P10-specific Chacha20 assembler on AIX
Fixes #18145

Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/19865)

(cherry picked from commit cdcc439aa0)
2022-12-14 12:53:00 +01:00
Tomas Mraz
db24ed5430 Generate the preprocessed .s files for chacha and poly 1305 on ia64
Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/18263)
2022-05-27 08:10:49 +02:00
Tomas Mraz
debd921006 Revert "Use .s extension for ia64 assembler"
This reverts commit 6009997abd.

The .s extension is incorrect as the assembler files contain
preprocessor directives.

Fixes #18259

Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/18263)
2022-05-27 08:10:49 +02:00
Sebastian Andrzej Siewior
9968c77539 Rename x86-32 assembly files from .s to .S.
Rename x86-32 assembly files from .s to .S. While processing the .S file
gcc will use the pre-processor whic will evaluate macros and ifdef. This
is turn will be used to enable the endbr32 opcode based on the __CET__
define.

Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/18353)
2022-05-24 13:16:06 +10:00
Daniel Hu
b1b2146ded Acceleration of chacha20 on aarch64 by SVE
This patch accelerates chacha20 on aarch64 when Scalable Vector Extension
(SVE) is supported by CPU. Tested on modern micro-architecture with
256-bit SVE, it has the potential to improve performance up to 20%

The solution takes a hybrid approach. SVE will handle multi-blocks that fit
the SVE vector length, with Neon/Scalar to process any tail data

Test result:
With SVE
type            1024 bytes   8192 bytes  16384 bytes
ChaCha20        1596208.13k  1650010.79k  1653151.06k

Without SVE (by Neon/Scalar)
type            1024 bytes   8192 bytes  16384 bytes
chacha20        1355487.91k  1372678.83k  1372662.44k

The assembly code has been reviewed internally by
ARM engineer Fangming.Fang@arm.com

Signed-off-by: Daniel Hu <Daniel.Hu@arm.com>

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17916)
2022-05-03 14:37:46 +10:00
Jon Spillett
6009997abd Use .s extension for ia64 assembler
Reviewed-by: Shane Lontis <shane.lontis@oracle.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/18136)
2022-04-25 14:04:57 +02:00
Deepankar Bhattacharjee
f596bbe4da chacha20 performance optimizations for ppc64le with 8x lanes,
Performance increase around 50%.

Co-authored-by: Madhusudhanan Duraisamy <madurais@in.ibm.com>

Co-authored-by: Nilamjyoti Goswami <nilamgoswami@in.ibm.com>

Co-authored-by: Siva Sundar Anbareeswaran <srisivasundar@in.ibm.com>

Reviewed-by: Danny Tsen <dtsen@us.ibm.com>
Tested-by: Danny Tsen <dtsen@us.ibm.com>
Signed-off-by: Danny <dtsen@us.ibm.com>

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/16637)
2022-02-22 16:58:55 +11:00
Tomas Mraz
3d178db73b ppccap.c: Split out algorithm-specific functions
Fixes #13336

Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/15828)
2021-06-25 08:49:45 +01:00
Richard Levitte
a1c8befd66 build.info: For all assembler generators, remove all arguments
Since the arguments are now generated in the build file templates,
they should be removed from the build.info files.

Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/9884)
2019-09-16 16:29:57 +02:00
Richard Levitte
bcb7afe18a Move chacha_asm_src file information to build.info files
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/9166)
2019-06-17 16:08:53 +02:00
Andy Polyakov
291bc802e4 IA64 assembly pack: add {chacha|poly1305}-ia64 modules.
Reviewed-by: Paul Dale <paul.dale@oracle.com>
Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/8540)
2019-03-29 07:33:15 +01:00
Richard Levitte
77adb75e16 Build: Remove BEGINRAW / ENDRAW / OVERRIDE
It was an ugly hack to avoid certain problems that are no more.

Also added GENERATE lines for perlasm scripts that didn't have that
explicitly.

Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/8125)
2019-01-31 16:19:49 +01:00
Patrick Steuer
f760137b21 crypto/chacha/asm/chacha-s390x.pl: add vx code path.
Signed-off-by: Patrick Steuer <patrick.steuer@de.ibm.com>

Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/6919)
2019-01-05 09:40:35 +01:00
Richard Levitte
722c9762f2 Harmonize the make variables across all known platforms families
The make variables LIB_CFLAGS, DSO_CFLAGS and so on were used in
addition to CFLAGS and so on.  This works without problem on Unix and
Windows, where options with different purposes (such as -D and -I) can
appear anywhere on the command line and get accumulated as they come.
This is not necessarely so on VMS.  For example, macros must all be
collected and given through one /DEFINE, and the same goes for
inclusion directories (/INCLUDE).

So, to harmonize all platforms, we repurpose make variables starting
with LIB_, DSO_ and BIN_ to be all encompassing variables that
collects the corresponding values from CFLAGS, CPPFLAGS, DEFINES,
INCLUDES and so on together with possible config target values
specific for libraries DSOs and programs, and use them instead of the
general ones everywhere.

This will, for example, allow VMS to use the exact same generators for
generated files that go through cpp as all other platforms, something
that has been impossible to do safely before now.

Reviewed-by: Andy Polyakov <appro@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/5357)
2018-02-14 17:13:53 +01:00
Richard Levitte
8c3bc594e0 Processing GNU-style "make variables" - separate CPP flags from C flags
C preprocessor flags get separated from C flags, which has the
advantage that we don't get loads of macro definitions and inclusion
directory specs when linking shared libraries, DSOs and programs.

This is a step to add support for "make variables" when configuring.

Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/5177)
2018-01-28 07:26:10 +01:00
Richard Levitte
f425f9dcff Add $(LIB_CFLAGS) for any build.info generator that uses $(CFLAGS)
The reason to do so is that some of the generators detect PIC flags
like -fPIC and -KPIC, and those are normally delivered in LD_CFLAGS.

Reviewed-by: Rich Salz <rsalz@openssl.org>
2016-03-13 00:02:55 +01:00
Andy Polyakov
ee619197db crypto/*/build.info: make it work on ARM platforms.
Reviewed-by: Richard Levitte <levitte@openssl.org>
2016-03-11 15:30:57 +01:00
Richard Levitte
f0667b1430 Add include directory options for assembler files that include from crypto/
Closes RT#4406

Reviewed-by: Rich Salz <rsalz@openssl.org>
2016-03-10 20:30:47 +01:00
Richard Levitte
df0cb57ca3 Unified - adapt the generation of chacha assembler to use GENERATE
This gets rid of the BEGINRAW..ENDRAW sections in crypto/chacha/build.info.

This also moves the assembler generating perl scripts to take the
output file name as last command line argument, where necessary.

Reviewed-by: Andy Polyakov <appro@openssl.org>
2016-03-09 11:09:26 +01:00
Richard Levitte
de72be2e57 Pass $(CC) to perlasm scripts via the environment
It seems that on some platforms, the perlasm scripts call the C
compiler for certain checks.  These scripts need the environment
variable CC to have the C compiler command.

Reviewed-by: Rich Salz <rsalz@openssl.org>
2016-02-13 19:21:36 +01:00
Andy Polyakov
9e58d1192d PPC assembly pack: add ChaCha20 and Poly1305 modules.
Reviewed-by: Richard Levitte <levitte@openssl.org>
2016-02-13 17:21:47 +01:00
Richard Levitte
567a9e6fe0 unified build scheme: add a "unified" template for Unix Makefile
This also adds all the raw sections needed for some files.

Reviewed-by: Rich Salz <rsalz@openssl.org>
2016-02-10 14:36:04 +01:00
Richard Levitte
777a288270 unified build scheme: add build.info files
Now that we have the foundation for the "unified" build scheme in
place, we add build.info files.  They have been generated from the
Makefiles in the same directories.  Things that are platform specific
will appear in later commits.

Reviewed-by: Andy Polyakov <appro@openssl.org>
2016-02-01 12:46:58 +01:00