Commit Graph

529 Commits

Author SHA1 Message Date
Pauli
e442cdaea2 remove unused initialisations
Reviewed-by: Tomas Mraz <tmraz@fedoraproject.org>
(Merged from https://github.com/openssl/openssl/pull/13577)
2020-12-03 11:22:06 +10:00
Ard Biesheuvel
26217510d2 aes/asm/aesv8-armx.pl: avoid 32-bit lane assignment in CTR mode
ARM Cortex-A57 and Cortex-A72 cores running in 32-bit mode are affected
by silicon errata #1742098 [0] and #1655431 [1], respectively, where the
second instruction of a AES instruction pair may execute twice if an
interrupt is taken right after the first instruction consumes an input
register of which a single 32-bit lane has been updated the last time it
was modified.

This is not such a rare occurrence as it may seem: in counter mode, only
the least significant 32-bit word is incremented in the absence of a
carry, which makes our counter mode implementation susceptible to these
errata.

So let's shuffle the counter assignments around a bit so that the most
recent updates when the AES instruction pair executes are 128-bit wide.

[0] ARM-EPM-049219 v23 Cortex-A57 MPCore Software Developers Errata Notice
[1] ARM-EPM-012079 v11.0 Cortex-A72 MPCore Software Developers Errata Notice

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@arm.com>

Reviewed-by: Paul Dale <paul.dale@oracle.com>
Reviewed-by: Tomas Mraz <tmraz@fedoraproject.org>
(Merged from https://github.com/openssl/openssl/pull/13504)
2020-11-30 12:14:54 +01:00
XiaokangQian
9ce8e0d17e Optimize AES-XTS mode in OpenSSL for aarch64
Aes-xts mode can be optimized by interleaving cipher operation on
several blocks and loop unrolling. Interleaving needs one ideal
unrolling factor, here we adopt the same factor with aes-cbc,
which is described as below:
	If blocks number > 5, select 5 blocks as one iteration,every
	loop, decrease the blocks number by 5.
	If left blocks < 5, treat them as tail blocks.
Detailed implementation has a little adjustment for squeezing
code space.
With this way, for small size such as 16 bytes, the performance is
similar as before, but for big size such as 16k bytes, the performance
improves a lot, even reaches to 2x uplift, for some arches such as A57,
the improvement even reaches more than 2x uplift. We collect many
performance datas on different micro-archs such as thunderx2,
ampere-emag, a72, a75, a57, a53 and N1, all of which reach 0.5-2x uplift.
The following table lists the encryption performance data on aarch64,
take a72, a75, a57, a53 and N1 as examples. Performance value takes the
unit of cycles per byte, takes the format as comparision of values.
List them as below:

A72:
                            Before optimization     After optimization  Improve
evp-aes-128-xts@16          8.899913518             5.949087263         49.60%
evp-aes-128-xts@64          4.525512668             3.389141845         33.53%
evp-aes-128-xts@256         3.502906908             1.633573479         114.43%
evp-aes-128-xts@1024        3.174210419             1.155952639         174.60%
evp-aes-128-xts@8192        3.053019303             1.028134888         196.95%
evp-aes-128-xts@16384       3.025292462             1.02021169          196.54%
evp-aes-256-xts@16          9.971105023             6.754233758         47.63%
evp-aes-256-xts@64          4.931479093             3.786527393         30.24%
evp-aes-256-xts@256         3.746788153             1.943975947         92.74%
evp-aes-256-xts@1024        3.401743802             1.477394648         130.25%
evp-aes-256-xts@8192        3.278769327             1.32950421          146.62%
evp-aes-256-xts@16384       3.27093296              1.325276257         146.81%

A75:
                            Before optimization     After optimization  Improve
evp-aes-128-xts@16          8.397965173             5.126839098         63.80%
evp-aes-128-xts@64          4.176860631             2.59817764          60.76%
evp-aes-128-xts@256         3.069126585             1.284561028         138.92%
evp-aes-128-xts@1024        2.805962699             0.932754655         200.83%
evp-aes-128-xts@8192        2.725820131             0.829820397         228.48%
evp-aes-128-xts@16384       2.71521905              0.823251591         229.82%
evp-aes-256-xts@16          11.24790935             7.383914448         52.33%
evp-aes-256-xts@64          5.294128847             3.048641998         73.66%
evp-aes-256-xts@256         3.861649617             1.570359905         145.91%
evp-aes-256-xts@1024        3.537646797             1.200493533         194.68%
evp-aes-256-xts@8192        3.435353012             1.085345319         216.52%
evp-aes-256-xts@16384       3.437952563             1.097963822         213.12%

A57:
                            Before optimization     After optimization  Improve
evp-aes-128-xts@16          10.57455446             7.165438012         47.58%
evp-aes-128-xts@64          5.418185447             3.721241202         45.60%
evp-aes-128-xts@256         3.855184592             1.747145379         120.66%
evp-aes-128-xts@1024        3.477199757             1.253049735         177.50%
evp-aes-128-xts@8192        3.36768104              1.091943159         208.41%
evp-aes-128-xts@16384       3.360373443             1.088942789         208.59%
evp-aes-256-xts@16          12.54559459             8.745489036         43.45%
evp-aes-256-xts@64          6.542808937             4.326387568         51.23%
evp-aes-256-xts@256         4.62668822              2.119908754         118.25%
evp-aes-256-xts@1024        4.161716505             1.557335554         167.23%
evp-aes-256-xts@8192        4.032462227             1.377749511         192.68%
evp-aes-256-xts@16384       4.023293877             1.371558933         193.34%

A53:
                            Before optimization     After optimization  Improve
evp-aes-128-xts@16          18.07842135             13.96980808         29.40%
evp-aes-128-xts@64          7.933818397             6.07159276          30.70%
evp-aes-128-xts@256         5.264604704             2.611155744         101.60%
evp-aes-128-xts@1024        4.606660117             1.722713454         167.40%
evp-aes-128-xts@8192        4.405160115             1.454379201         202.90%
evp-aes-128-xts@16384       4.401592028             1.442279392         205.20%
evp-aes-256-xts@16          20.07084054             16.00803726         25.40%
evp-aes-256-xts@64          9.192647294             6.883876732         33.50%
evp-aes-256-xts@256         6.336143161             3.108140452         103.90%
evp-aes-256-xts@1024        5.62502952              2.097960651         168.10%
evp-aes-256-xts@8192        5.412085608             1.807294191         199.50%
evp-aes-256-xts@16384       5.403062591             1.790135764         201.80%

N1:
                            Before optimization     After optimization  Improve
evp-aes-128-xts@16          6.48147613              4.209415473         53.98%
evp-aes-128-xts@64          2.847744115             1.950757468         45.98%
evp-aes-128-xts@256         2.085711968             1.061903238         96.41%
evp-aes-128-xts@1024        1.842014669             0.798486302         130.69%
evp-aes-128-xts@8192        1.760449052             0.713853939         146.61%
evp-aes-128-xts@16384       1.760763546             0.707702009         148.80%
evp-aes-256-xts@16          7.264142817             5.265970454         37.94%
evp-aes-256-xts@64          3.251356212             2.41176323          34.81%
evp-aes-256-xts@256         2.380488469             1.342095742         77.37%
evp-aes-256-xts@1024        2.08853022              1.041718215         100.49%
evp-aes-256-xts@8192        2.027432668             0.944571334         114.64%
evp-aes-256-xts@16384       2.00740782              0.941991415         113.10%

Add more XTS test cases to cover the cipher stealing mode and cases of different
number of blocks.

CustomizedGitHooks: yes
Change-Id: I93ee31b2575e1413764e27b599af62994deb4c96

Reviewed-by: Paul Dale <paul.dale@oracle.com>
Reviewed-by: Tomas Mraz <tmraz@fedoraproject.org>
(Merged from https://github.com/openssl/openssl/pull/11399)
2020-11-12 11:09:22 +01:00
Jung-uk Kim
cd84d8832d Ignore vendor name in Clang version number.
For example, FreeBSD prepends "FreeBSD" to version string, e.g.,

FreeBSD clang version 11.0.0 (git@github.com:llvm/llvm-project.git llvmorg-11.0.0-rc2-0-g414f32a9e86)
Target: x86_64-unknown-freebsd13.0
Thread model: posix
InstalledDir: /usr/bin

This prevented us from properly detecting AVX support, etc.

CLA: trivial

Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Paul Dale <paul.dale@oracle.com>
Reviewed-by: Ben Kaduk <kaduk@mit.edu>
(Merged from https://github.com/openssl/openssl/pull/12725)
2020-08-27 20:27:26 -07:00
Bernd Edlinger
77286fe3ec Avoid undefined behavior with unaligned accesses
Fixes: #4983

[extended tests]

Reviewed-by: Nicola Tuveri <nic.tuv@gmail.com>
(Merged from https://github.com/openssl/openssl/pull/6074)
2020-05-27 20:11:20 +02:00
Matt Caswell
33388b44b6 Update copyright year
Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/11616)
2020-04-23 13:55:52 +01:00
Rich Salz
705536e2b5 Use build.info, not ifdef for crypto modules
Don't wrap conditionally-compiled files in global ifndef tests.
Instead, test if the feature is disabled and, if so, do not
compile it.

Reviewed-by: Tomas Mraz <tmraz@fedoraproject.org>
Reviewed-by: Paul Dale <paul.dale@oracle.com>
Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/11263)
2020-04-16 13:52:22 +02:00
Patrick Steuer
7b2ce4a6e8 aes-s390x.pl: fix stg offset caused by typo in perlasm
Signed-off-by: Patrick Steuer <patrick.steuer@de.ibm.com>

Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/11234)
2020-03-05 17:19:33 +01:00
H.J. Lu
0d51cf3ccc x86_64: Don't assume 8-byte pointer size
Since pointer in x32 is 4 bytes, add x86_64-support.pl to define
pointer_size and pointer_register based on flavour to support
stuctures like:

struct {  void *ptr; int blocks;  }

This fixes 90-test_sslapi.t on x32.  Verified with

$ ./Configure shared linux-x86_64
$ make
$ make test

and

$ ./Configure shared linux-x32
$ make
$ make test

Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Tomas Mraz <tmraz@fedoraproject.org>
(Merged from https://github.com/openssl/openssl/pull/10988)
2020-02-18 18:03:16 +01:00
David Benjamin
a21314dbbc Also check for errors in x86_64-xlate.pl.
In https://github.com/openssl/openssl/pull/10883, I'd meant to exclude
the perlasm drivers since they aren't opening pipes and do not
particularly need it, but I only noticed x86_64-xlate.pl, so
arm-xlate.pl and ppc-xlate.pl got the change.

That seems to have been fine, so be consistent and also apply the change
to x86_64-xlate.pl. Checking for errors is generally a good idea.

Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: David Benjamin <davidben@google.com>
(Merged from https://github.com/openssl/openssl/pull/10930)
2020-02-17 12:17:53 +10:00
simplelins
bc8b648f74 Fix a bug for aarch64 BigEndian
FIXES #10692 #10638
a bug for aarch64 bigendian with instructions 'st1' and 'ld1' on AES-GCM mode.

CLA: trivial

Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Paul Dale <paul.dale@oracle.com>
(Merged from https://github.com/openssl/openssl/pull/10751)
2020-02-17 12:13:23 +10:00
H.J. Lu
98ad3fe82b x86_64: Add endbranch at function entries for Intel CET
To support Intel CET, all indirect branch targets must start with
endbranch.  Here is a patch to add endbranch to function entries
in x86_64 assembly codes which are indirect branch targets as
discovered by running openssl testsuite on Intel CET machine and
visual inspection.

Verified with

$ CC="gcc -Wl,-z,cet-report=error" ./Configure shared linux-x86_64 -fcf-protection
$ make
$ make test

and

$ CC="gcc -mx32 -Wl,-z,cet-report=error" ./Configure shared linux-x32 -fcf-protection
$ make
$ make test # <<< passed with https://github.com/openssl/openssl/pull/10988

Reviewed-by: Tomas Mraz <tmraz@fedoraproject.org>
Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/10982)
2020-02-15 22:15:03 +01:00
Dr. Matthias St. Pierre
7fa8bcfe43 Fix misspelling errors and typos reported by codespell
Fixes #10998

Reviewed-by: Shane Lontis <shane.lontis@oracle.com>
(Merged from https://github.com/openssl/openssl/pull/11000)
2020-02-06 17:01:00 +01:00
David Benjamin
32be631ca1 Do not silently truncate files on perlasm errors
If one of the perlasm xlate drivers crashes, OpenSSL's build will
currently swallow the error and silently truncate the output to however
far the driver got. This will hopefully fail to build, but better to
check such things.

Handle this by checking for errors when closing STDOUT (which is a pipe
to the xlate driver).

Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Tomas Mraz <tmraz@fedoraproject.org>
(Merged from https://github.com/openssl/openssl/pull/10883)
2020-01-22 18:11:30 +01:00
Richard Levitte
9bb3e5fd87 For all assembler scripts where it matters, recognise clang > 9.x
Fixes #10853

Reviewed-by: Paul Dale <paul.dale@oracle.com>
(Merged from https://github.com/openssl/openssl/pull/10855)
2020-01-17 08:55:45 +01:00
Matt Caswell
c72fa2554f Deprecate the low level AES functions
Use of the low level AES functions has been informally discouraged for a
long time. We now formally deprecate them.

Applications should instead use the EVP APIs, e.g. EVP_EncryptInit_ex,
EVP_EncryptUpdate, EVP_EncryptFinal_ex, and the equivalently named decrypt
functions.

Reviewed-by: Tomas Mraz <tmraz@fedoraproject.org>
(Merged from https://github.com/openssl/openssl/pull/10580)
2020-01-06 15:09:57 +00:00
Shane Lontis
0d2bfe52bb Add AES_CBC_HMAC_SHA ciphers to providers.
Also Add ability for providers to dynamically exclude cipher algorithms.
Cipher algorithms are only returned from providers if their capable() method is either NULL,
or the method returns 1.
This is mainly required for ciphers that only have hardware implementations.
If there is no hardware support, then the algorithm needs to be not available.

Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/10146)
2020-01-06 13:02:16 +10:00
Bernd Edlinger
665de4d48a Fix aesni_cbc_sha256_enc_avx2 backtrace info
We store a secondary frame pointer info for the debugger
in the red zone.  This fixes a crash in the unwinder when
this function is interrupted.

Additionally the missing cfi function annotation is added
to aesni_cbc_sha256_enc_shaext.

[extended tests]

Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/10674)
2019-12-23 16:57:42 +01:00
Bernd Edlinger
b0d3442efc Add some missing cfi frame info in aesni-sha and sha-x86_64.pl
Reviewed-by: Kurt Roeckx <kurt@roeckx.be>
(Merged from https://github.com/openssl/openssl/pull/10655)
2019-12-20 23:14:37 +01:00
Bernd Edlinger
a5fe7825b9 Add some missing cfi frame info in aesni-x86_64.pl
Reviewed-by: Kurt Roeckx <kurt@roeckx.be>
(Merged from https://github.com/openssl/openssl/pull/10653)
2019-12-20 23:08:57 +01:00
Bernd Edlinger
4a0b7ffcc0 Add some missing cfi frame info in aes-x86_64.pl
Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Kurt Roeckx <kurt@roeckx.be>
(Merged from https://github.com/openssl/openssl/pull/10650)
2019-12-20 22:46:20 +01:00
Veres Lajos
79c44b4e30 Fix some typos
Reported-by: misspell-fixer <https://github.com/vlajos/misspell-fixer>

CLA: trivial

Reviewed-by: Matthias St. Pierre <Matthias.St.Pierre@ncp-e.com>
Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/10544)
2019-12-11 19:04:01 +01:00
XiaokangQian
2ff16afc17 Optimize AES-ECB mode in OpenSSL for both aarch64 and aarch32
Aes-ecb mode can be optimized by inverleaving cipher operation on
several blocks and loop unrolling. Interleaving needs one ideal
unrolling factor, here we adopt the same factor with aes-cbc,
which is described as below:
    If blocks number > 5, select 5 blocks as one iteration,every
    loop, decrease the blocks number by 5.
    If 3 < left blocks < 5 select 3 blocks as one iteration, every
    loop, decrease the block number by 3.
    If left blocks < 3, treat them as tail blocks.
Detailed implementation will have a little adjustment for squeezing
code space.
With this way, for small size such as 16 bytes, the performance is
similar as before, but for big size such as 16k bytes, the performance
improves a lot, even reaches to 100%, for some arches such as A57,
the improvement  even exceeds 100%. The following table will list the
encryption performance data on aarch64, take a72 and a57 as examples.
Performance value takes the unit of cycles per byte, takes the format
as comparision of values. List them as below:

A72:
                            Before optimization     After optimization  Improve
evp-aes-128-ecb@16          17.26538237             16.82663866         2.61%
evp-aes-128-ecb@64          5.50528499              5.222637557         5.41%
evp-aes-128-ecb@256         2.632700213             1.908442892         37.95%
evp-aes-128-ecb@1024        1.876102047             1.078018868         74.03%
evp-aes-128-ecb@8192        1.6550392               0.853982929         93.80%
evp-aes-128-ecb@16384       1.636871283             0.847623957         93.11%
evp-aes-192-ecb@16          17.73104961             17.09692468         3.71%
evp-aes-192-ecb@64          5.78984398              5.418545192         6.85%
evp-aes-192-ecb@256         2.872005308             2.081815274         37.96%
evp-aes-192-ecb@1024        2.083226672             1.25095642          66.53%
evp-aes-192-ecb@8192        1.831992057             0.995916251         83.95%
evp-aes-192-ecb@16384       1.821590009             0.993820525         83.29%
evp-aes-256-ecb@16          18.0606306              17.96963317         0.51%
evp-aes-256-ecb@64          6.19651997              5.762465812         7.53%
evp-aes-256-ecb@256         3.176991394             2.24642538          41.42%
evp-aes-256-ecb@1024        2.385991919             1.396018192         70.91%
evp-aes-256-ecb@8192        2.147862636             1.142222597         88.04%
evp-aes-256-ecb@16384       2.131361787             1.135944617         87.63%

A57:
                            Before optimization     After optimization  Improve
evp-aes-128-ecb@16          18.61045121             18.36456218         1.34%
evp-aes-128-ecb@64          6.438628994             5.467959461         17.75%
evp-aes-128-ecb@256         2.957452881             1.97238604          49.94%
evp-aes-128-ecb@1024        2.117096219             1.099665054         92.52%
evp-aes-128-ecb@8192        1.868385973             0.837440804         123.11%
evp-aes-128-ecb@16384       1.853078526             0.822420027         125.32%
evp-aes-192-ecb@16          19.07021756             18.50018552         3.08%
evp-aes-192-ecb@64          6.672351486             5.696088921         17.14%
evp-aes-192-ecb@256         3.260427769             2.131449916         52.97%
evp-aes-192-ecb@1024        2.410522832             1.250529718         92.76%
evp-aes-192-ecb@8192        2.17921605              0.973225504         123.92%
evp-aes-192-ecb@16384       2.162250997             0.95919871          125.42%
evp-aes-256-ecb@16          19.3008384              19.12743654         0.91%
evp-aes-256-ecb@64          6.992950658             5.92149541          18.09%
evp-aes-256-ecb@256         3.576361743             2.287619504         56.34%
evp-aes-256-ecb@1024        2.726671027             1.381267599         97.40%
evp-aes-256-ecb@8192        2.493583657             1.110959913         124.45%
evp-aes-256-ecb@16384       2.473916816             1.099967073         124.91%

Change-Id: Iccd23d972e0d52d22dc093f4c208f69c9d5a0ca7

Reviewed-by: Shane Lontis <shane.lontis@oracle.com>
Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/10518)
2019-12-11 18:56:11 +01:00
Richard Levitte
936c2b9e93 Update source files for deprecation at 3.0
Previous macros suggested that from 3.0, we're only allowed to
deprecate things at a major version.  However, there's no policy
stating this, but there is for removal, saying that to remove
something, it must have been deprecated for 5 years, and that removal
can only happen at a major version.

Meanwhile, the semantic versioning rule is that deprecation should
trigger a MINOR version update, which is reflected in the macro names
as of this change.

Reviewed-by: Tim Hudson <tjh@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/10364)
2019-11-07 11:37:25 +01:00
Shane Lontis
64fd90fbe9 Fix missing Assembler defines
Implementations are now spread across several libraries, so the assembler
related defines need to be applied to all affected libraries and modules.

AES_ASM define was missing from libimplementations.a which disabled AESNI
aarch64 changes were made by xkqian.

Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/10180)
2019-10-16 16:10:39 +10:00
Richard Levitte
dec95d7589 Rework how our providers are built
We put almost everything in these internal static libraries:

libcommon               Block building code that can be used by all
                        our implementations, legacy and non-legacy
                        alike.
libimplementations      All non-legacy algorithm implementations and
                        only them.  All the code that ends up here is
                        agnostic to the definitions of FIPS_MODE.
liblegacy               All legacy implementations.

libnonfips              Support code for the algorithm implementations.
                        Built with FIPS_MODE undefined.  Any code that
                        checks that FIPS_MODE isn't defined must end
                        up in this library.
libfips                 Support code for the algorithm implementations.
                        Built with FIPS_MODE defined.  Any code that
                        checks that FIPS_MODE is defined must end up
                        in this library.

The FIPS provider module is built from providers/fips/*.c and linked
with libimplementations, libcommon and libfips.

The Legacy provider module is built from providers/legacy/*.c and
linked with liblegacy, libcommon and libcrypto.
If module building is disabled, the object files from liblegacy and
libcommon are added to libcrypto and the Legacy provider becomes a
built-in provider.

The Default provider module is built-in, so it ends up being linked
with libimplementations, libcommon and libnonfips.  For libcrypto in
form of static library, the object files from those other libraries
are simply being added to libcrypto.

Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/10088)
2019-10-10 14:12:15 +02:00
Dr. Matthias St. Pierre
ae4186b004 Fix header file include guard names
Make the include guards consistent by renaming them systematically according
to the naming conventions below

For the public header files (in the 'include/openssl' directory), the guard
names try to match the path specified in the include directives, with
all letters converted to upper case and '/' and '.' replaced by '_'. For the
private header files files, an extra 'OSSL_' is added as prefix.

Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/9333)
2019-09-28 20:26:36 +02:00
Dr. Matthias St. Pierre
706457b7bd Reorganize local header files
Apart from public and internal header files, there is a third type called
local header files, which are located next to source files in the source
directory. Currently, they have different suffixes like

  '*_lcl.h', '*_local.h', or '*_int.h'

This commit changes the different suffixes to '*_local.h' uniformly.

Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/9333)
2019-09-28 20:26:35 +02:00
Richard Levitte
1aa89a7a3a Unify all assembler file generators
They now generally conform to the following argument sequence:

    script.pl "$(PERLASM_SCHEME)" [ C preprocessor arguments ... ] \
              $(PROCESSOR) <output file>

However, in the spirit of being able to use these scripts manually,
they also allow for no argument, or for only the flavour, or for only
the output file.  This is done by only using the last argument as
output file if it's a file (it has an extension), and only using the
first argument as flavour if it isn't a file (it doesn't have an
extension).

While we're at it, we make all $xlate calls the same, i.e. the $output
argument is always quoted, and we always die on error when trying to
start $xlate.

There's a perl lesson in this, regarding operator priority...

This will always succeed, even when it fails:

    open FOO, "something" || die "ERR: $!";

The reason is that '||' has higher priority than list operators (a
function is essentially a list operator and gobbles up everything
following it that isn't lower priority), and since a non-empty string
is always true, so that ends up being exactly the same as:

    open FOO, "something";

This, however, will fail if "something" can't be opened:

    open FOO, "something" or die "ERR: $!";

The reason is that 'or' has lower priority that list operators,
i.e. it's performed after the 'open' call.

Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/9884)
2019-09-16 16:29:57 +02:00
Richard Levitte
a1c8befd66 build.info: For all assembler generators, remove all arguments
Since the arguments are now generated in the build file templates,
they should be removed from the build.info files.

Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/9884)
2019-09-16 16:29:57 +02:00
Antoine Cœur
c2969ff6e7 Fix Typos
CLA: trivial

Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Matthias St. Pierre <Matthias.St.Pierre@ncp-e.com>
(Merged from https://github.com/openssl/openssl/pull/9288)
2019-07-02 14:22:29 +02:00
Richard Levitte
2ce15a95da crypto/aes/build.info: Fix AES assembler specs
Two mistakes were made:

1. AES_ASM for x86 was misplaced
2. sse2 isn't applicable for x86_64 code

Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/9177)
2019-06-18 16:04:12 +02:00
Richard Levitte
cd42b9e9c2 Move aes_asm_src file information to build.info files
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/9166)
2019-06-17 16:08:52 +02:00
Richard Levitte
07c244f0cd Use variables in build.info files where it's worth the while
Reviewed-by: Shane Lontis <shane.lontis@oracle.com>
(Merged from https://github.com/openssl/openssl/pull/9144)
2019-06-15 00:34:02 +02:00
Matt Caswell
66ad63e801 Make basic AES ciphers available from within the FIPS providers
These ciphers were already provider aware, and were available from the
default provider. We move them into the FIPS provider too.

Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/9038)
2019-06-03 12:56:53 +01:00
Pauli
2752c8984c Revert "ppc assembly pack: always increment CTR IV as quadword"
The 32 bit counter behaviour is necessary and was intentional.

This reverts commit e9f148c935.

Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/8958)
2019-05-20 18:08:42 +10:00
Daniel Axtens
e9f148c935 ppc assembly pack: always increment CTR IV as quadword
The kernel self-tests picked up an issue with CTR mode. The issue was
detected with a test vector with an IV of
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFD: after 3 increments it should wrap
around to 0.

There are two paths that increment IVs: the bulk (8 at a time) path,
and the individual path which is used when there are fewer than 8 AES
blocks to process.

In the bulk path, the IV is incremented with vadduqm: "Vector Add
Unsigned Quadword Modulo", which does 128-bit addition.

In the individual path, however, the IV is incremented with vadduwm:
"Vector Add Unsigned Word Modulo", which instead does 4 32-bit
additions. Thus the IV would instead become
FFFFFFFFFFFFFFFFFFFFFFFF00000000, throwing off the result.

Use vadduqm.

This was probably a typo originally, what with q and w being
adjacent.

CLA: trivial

Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Paul Dale <paul.dale@oracle.com>
(Merged from https://github.com/openssl/openssl/pull/8942)
2019-05-17 11:05:16 +10:00
Andy Polyakov
d6e4287c97 aes/asm/aesv8-armx.pl: ~20% improvement on ThunderX2.
Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/8776)
2019-04-17 21:30:39 +02:00
Andy Polyakov
6465321e40 ARM64 assembly pack: add ThunderX2 results.
Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/8776)
2019-04-17 21:08:13 +02:00
Matt Caswell
fd367b4ce3 Deprecate AES_ige_encrypt() and AES_bi_ige_encrypt()
These undocumented functions were never integrated into the EVP layer
and implement the AES Infinite Garble Extension (IGE) mode and AES
Bi-directional IGE mode. These modes were never formally standardised
and usage of these functions is believed to be very small. In particular
AES_bi_ige_encrypt() has a known bug. It accepts 2 AES keys, but only
one is ever used. The security implications are believed to be minimal,
but this issue was never fixed for backwards compatibility reasons.

Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/8710)
2019-04-12 14:22:41 +01:00
Daniel Axtens
f643deac41 PPC assembly pack: fix copy-paste error in CTR mode
There are two copy-paste errors in handling CTR mode. When dealing
with a 2 or 3 block tail, the code branches to the CBC decryption exit
path, rather than to the CTR exit path.

This can lead to data corruption: in the Linux kernel we have a copy
of this file, and the bug leads to corruption of the IV, which leads
to data corruption when we call the encryption function again later to
encrypt subsequent blocks.

Originally reported to the Linux kernel by Ondrej Mosnáček <omosnacek@gmail.com>

CLA: trivial

Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Paul Dale <paul.dale@oracle.com>
(Merged from https://github.com/openssl/openssl/pull/8510)
2019-03-18 18:13:24 +10:00
Markus Stockhausen
4592172376 MIPS32R3 provides the EXT instruction to extract bits from
registers. As the AES table is already 1K aligned we can
use it everywhere and speedup table address calculation by
10%. Performance numbers:

decryption         16B       64B      256B     1024B     8192B
-------------------------------------------------------------------
aes-256-cbc   5636.84k  6443.26k  6689.02k  6752.94k  6766.59k bef.
aes-256-cbc   6200.31k  7195.71k  7504.30k  7585.11k  7599.45k aft.
-------------------------------------------------------------------
aes-128-cbc   7313.85k  8653.67k  9079.55k  9188.35k  9205.08k bef.
aes-128-cbc   7925.38k  9557.99k 10092.37k 10232.15k 10272.77k aft.

encryption         16B       64B      256B     1024B     8192B
-------------------------------------------------------------------
aes-256 cbc   6009.65k  6592.70k  6766.59k  6806.87k  6815.74k bef.
aes-256 cbc   6643.93k  7388.69k  7605.33k  7657.81k  7675.90k aft.
-------------------------------------------------------------------
aes-128 cbc   7862.09k  8892.48k  9214.04k  9291.78k  9311.57k bef.
aes-128 cbc   8639.29k  9881.17k 10265.86k 10363.56k 10392.92k aft.

Reviewed-by: Paul Dale <paul.dale@oracle.com>
Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/8206)
2019-02-20 23:17:16 +01:00
David Benjamin
c0e8e5007b Fix some CFI issues in x86_64 assembly
The add/double shortcut in ecp_nistz256-x86_64.pl left one instruction
point that did not unwind, and the "slow" path in AES_cbc_encrypt was
not annotated correctly. For the latter, add
.cfi_{remember,restore}_state support to perlasm.

Next, fill in a bunch of functions that are missing no-op .cfi_startproc
and .cfi_endproc blocks. libunwind cannot unwind those stack frames
otherwise.

Finally, work around a bug in libunwind by not encoding rflags. (rflags
isn't a callee-saved register, so there's not much need to annotate it
anyway.)

These were found as part of ABI testing work in BoringSSL.

Reviewed-by: Richard Levitte <levitte@openssl.org>
GH: #8109
2019-02-17 23:39:51 +01:00
Andy Polyakov
db42bb440e ARM64 assembly pack: make it Windows-friendly.
"Windows friendliness" means a) unified PIC-ification, unified across
all platforms; b) unified commantary delimiter; c) explicit ldur/stur,
as Visual Studio assembler can't automatically encode ldr/str as
ldur/stur when needed.

Reviewed-by: Paul Dale <paul.dale@oracle.com>
Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/8256)
2019-02-16 17:01:15 +01:00
Andy Polyakov
3405db97e5 ARM assembly pack: make it Windows-friendly.
"Windows friendliness" means a) flipping .thumb and .text directives,
b) always generate Thumb-2 code when asked(*); c) Windows-specific
references to external OPENSSL_armcap_P.

(*) so far *some* modules were compiled as .code 32 even if Thumb-2
was targeted. It works at hardware level because processor can alternate
between the modes with no overhead. But clang --target=arm-windows's
builtin assembler just refuses to compile .code 32...

Reviewed-by: Paul Dale <paul.dale@oracle.com>
Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/8252)
2019-02-16 16:59:23 +01:00
Andy Polyakov
9a18aae5f2 AArch64 assembly pack: authenticate return addresses.
ARMv8.3 adds pointer authentication extension, which in this case allows
to ensure that, when offloaded to stack, return address is same at return
as at entry to the subroutine. The new instructions are nops on processors
that don't implement the extension, so that the vetification is backward
compatible.

Reviewed-by: Kurt Roeckx <kurt@roeckx.be>
Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/8205)
2019-02-12 19:00:42 +01:00
Richard Levitte
77adb75e16 Build: Remove BEGINRAW / ENDRAW / OVERRIDE
It was an ugly hack to avoid certain problems that are no more.

Also added GENERATE lines for perlasm scripts that didn't have that
explicitly.

Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/8125)
2019-01-31 16:19:49 +01:00
Richard Levitte
c918d8e283 Following the license change, modify the boilerplates in crypto/aes/
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/7771)
2018-12-06 14:23:25 +01:00
Richard Levitte
389c09fa09 License: change any non-boilerplate comment referring to "OpenSSL license"
Make it just say "the License", which refers back to the standard
boilerplate.

Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/7764)
2018-12-06 13:26:28 +01:00
Matt Caswell
1212818eb0 Update copyright year
Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/7176)
2018-09-11 13:45:17 +01:00