Since pointer in x32 is 4 bytes, add x86_64-support.pl to define
pointer_size and pointer_register based on flavour to support
stuctures like:
struct { void *ptr; int blocks; }
This fixes 90-test_sslapi.t on x32. Verified with
$ ./Configure shared linux-x86_64
$ make
$ make test
and
$ ./Configure shared linux-x32
$ make
$ make test
Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Tomas Mraz <tmraz@fedoraproject.org>
(Merged from https://github.com/openssl/openssl/pull/10988)
In https://github.com/openssl/openssl/pull/10883, I'd meant to exclude
the perlasm drivers since they aren't opening pipes and do not
particularly need it, but I only noticed x86_64-xlate.pl, so
arm-xlate.pl and ppc-xlate.pl got the change.
That seems to have been fine, so be consistent and also apply the change
to x86_64-xlate.pl. Checking for errors is generally a good idea.
Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: David Benjamin <davidben@google.com>
(Merged from https://github.com/openssl/openssl/pull/10930)
FIXES#10692#10638
a bug for aarch64 bigendian with instructions 'st1' and 'ld1' on AES-GCM mode.
CLA: trivial
Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Paul Dale <paul.dale@oracle.com>
(Merged from https://github.com/openssl/openssl/pull/10751)
To support Intel CET, all indirect branch targets must start with
endbranch. Here is a patch to add endbranch to function entries
in x86_64 assembly codes which are indirect branch targets as
discovered by running openssl testsuite on Intel CET machine and
visual inspection.
Verified with
$ CC="gcc -Wl,-z,cet-report=error" ./Configure shared linux-x86_64 -fcf-protection
$ make
$ make test
and
$ CC="gcc -mx32 -Wl,-z,cet-report=error" ./Configure shared linux-x32 -fcf-protection
$ make
$ make test # <<< passed with https://github.com/openssl/openssl/pull/10988
Reviewed-by: Tomas Mraz <tmraz@fedoraproject.org>
Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/10982)
If one of the perlasm xlate drivers crashes, OpenSSL's build will
currently swallow the error and silently truncate the output to however
far the driver got. This will hopefully fail to build, but better to
check such things.
Handle this by checking for errors when closing STDOUT (which is a pipe
to the xlate driver).
Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Tomas Mraz <tmraz@fedoraproject.org>
(Merged from https://github.com/openssl/openssl/pull/10883)
Use of the low level AES functions has been informally discouraged for a
long time. We now formally deprecate them.
Applications should instead use the EVP APIs, e.g. EVP_EncryptInit_ex,
EVP_EncryptUpdate, EVP_EncryptFinal_ex, and the equivalently named decrypt
functions.
Reviewed-by: Tomas Mraz <tmraz@fedoraproject.org>
(Merged from https://github.com/openssl/openssl/pull/10580)
Also Add ability for providers to dynamically exclude cipher algorithms.
Cipher algorithms are only returned from providers if their capable() method is either NULL,
or the method returns 1.
This is mainly required for ciphers that only have hardware implementations.
If there is no hardware support, then the algorithm needs to be not available.
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/10146)
We store a secondary frame pointer info for the debugger
in the red zone. This fixes a crash in the unwinder when
this function is interrupted.
Additionally the missing cfi function annotation is added
to aesni_cbc_sha256_enc_shaext.
[extended tests]
Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/10674)
Aes-ecb mode can be optimized by inverleaving cipher operation on
several blocks and loop unrolling. Interleaving needs one ideal
unrolling factor, here we adopt the same factor with aes-cbc,
which is described as below:
If blocks number > 5, select 5 blocks as one iteration,every
loop, decrease the blocks number by 5.
If 3 < left blocks < 5 select 3 blocks as one iteration, every
loop, decrease the block number by 3.
If left blocks < 3, treat them as tail blocks.
Detailed implementation will have a little adjustment for squeezing
code space.
With this way, for small size such as 16 bytes, the performance is
similar as before, but for big size such as 16k bytes, the performance
improves a lot, even reaches to 100%, for some arches such as A57,
the improvement even exceeds 100%. The following table will list the
encryption performance data on aarch64, take a72 and a57 as examples.
Performance value takes the unit of cycles per byte, takes the format
as comparision of values. List them as below:
A72:
Before optimization After optimization Improve
evp-aes-128-ecb@16 17.26538237 16.82663866 2.61%
evp-aes-128-ecb@64 5.50528499 5.222637557 5.41%
evp-aes-128-ecb@256 2.632700213 1.908442892 37.95%
evp-aes-128-ecb@1024 1.876102047 1.078018868 74.03%
evp-aes-128-ecb@8192 1.6550392 0.853982929 93.80%
evp-aes-128-ecb@16384 1.636871283 0.847623957 93.11%
evp-aes-192-ecb@16 17.73104961 17.09692468 3.71%
evp-aes-192-ecb@64 5.78984398 5.418545192 6.85%
evp-aes-192-ecb@256 2.872005308 2.081815274 37.96%
evp-aes-192-ecb@1024 2.083226672 1.25095642 66.53%
evp-aes-192-ecb@8192 1.831992057 0.995916251 83.95%
evp-aes-192-ecb@16384 1.821590009 0.993820525 83.29%
evp-aes-256-ecb@16 18.0606306 17.96963317 0.51%
evp-aes-256-ecb@64 6.19651997 5.762465812 7.53%
evp-aes-256-ecb@256 3.176991394 2.24642538 41.42%
evp-aes-256-ecb@1024 2.385991919 1.396018192 70.91%
evp-aes-256-ecb@8192 2.147862636 1.142222597 88.04%
evp-aes-256-ecb@16384 2.131361787 1.135944617 87.63%
A57:
Before optimization After optimization Improve
evp-aes-128-ecb@16 18.61045121 18.36456218 1.34%
evp-aes-128-ecb@64 6.438628994 5.467959461 17.75%
evp-aes-128-ecb@256 2.957452881 1.97238604 49.94%
evp-aes-128-ecb@1024 2.117096219 1.099665054 92.52%
evp-aes-128-ecb@8192 1.868385973 0.837440804 123.11%
evp-aes-128-ecb@16384 1.853078526 0.822420027 125.32%
evp-aes-192-ecb@16 19.07021756 18.50018552 3.08%
evp-aes-192-ecb@64 6.672351486 5.696088921 17.14%
evp-aes-192-ecb@256 3.260427769 2.131449916 52.97%
evp-aes-192-ecb@1024 2.410522832 1.250529718 92.76%
evp-aes-192-ecb@8192 2.17921605 0.973225504 123.92%
evp-aes-192-ecb@16384 2.162250997 0.95919871 125.42%
evp-aes-256-ecb@16 19.3008384 19.12743654 0.91%
evp-aes-256-ecb@64 6.992950658 5.92149541 18.09%
evp-aes-256-ecb@256 3.576361743 2.287619504 56.34%
evp-aes-256-ecb@1024 2.726671027 1.381267599 97.40%
evp-aes-256-ecb@8192 2.493583657 1.110959913 124.45%
evp-aes-256-ecb@16384 2.473916816 1.099967073 124.91%
Change-Id: Iccd23d972e0d52d22dc093f4c208f69c9d5a0ca7
Reviewed-by: Shane Lontis <shane.lontis@oracle.com>
Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/10518)
Previous macros suggested that from 3.0, we're only allowed to
deprecate things at a major version. However, there's no policy
stating this, but there is for removal, saying that to remove
something, it must have been deprecated for 5 years, and that removal
can only happen at a major version.
Meanwhile, the semantic versioning rule is that deprecation should
trigger a MINOR version update, which is reflected in the macro names
as of this change.
Reviewed-by: Tim Hudson <tjh@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/10364)
Implementations are now spread across several libraries, so the assembler
related defines need to be applied to all affected libraries and modules.
AES_ASM define was missing from libimplementations.a which disabled AESNI
aarch64 changes were made by xkqian.
Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/10180)
We put almost everything in these internal static libraries:
libcommon Block building code that can be used by all
our implementations, legacy and non-legacy
alike.
libimplementations All non-legacy algorithm implementations and
only them. All the code that ends up here is
agnostic to the definitions of FIPS_MODE.
liblegacy All legacy implementations.
libnonfips Support code for the algorithm implementations.
Built with FIPS_MODE undefined. Any code that
checks that FIPS_MODE isn't defined must end
up in this library.
libfips Support code for the algorithm implementations.
Built with FIPS_MODE defined. Any code that
checks that FIPS_MODE is defined must end up
in this library.
The FIPS provider module is built from providers/fips/*.c and linked
with libimplementations, libcommon and libfips.
The Legacy provider module is built from providers/legacy/*.c and
linked with liblegacy, libcommon and libcrypto.
If module building is disabled, the object files from liblegacy and
libcommon are added to libcrypto and the Legacy provider becomes a
built-in provider.
The Default provider module is built-in, so it ends up being linked
with libimplementations, libcommon and libnonfips. For libcrypto in
form of static library, the object files from those other libraries
are simply being added to libcrypto.
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/10088)
Make the include guards consistent by renaming them systematically according
to the naming conventions below
For the public header files (in the 'include/openssl' directory), the guard
names try to match the path specified in the include directives, with
all letters converted to upper case and '/' and '.' replaced by '_'. For the
private header files files, an extra 'OSSL_' is added as prefix.
Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/9333)
Apart from public and internal header files, there is a third type called
local header files, which are located next to source files in the source
directory. Currently, they have different suffixes like
'*_lcl.h', '*_local.h', or '*_int.h'
This commit changes the different suffixes to '*_local.h' uniformly.
Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/9333)
They now generally conform to the following argument sequence:
script.pl "$(PERLASM_SCHEME)" [ C preprocessor arguments ... ] \
$(PROCESSOR) <output file>
However, in the spirit of being able to use these scripts manually,
they also allow for no argument, or for only the flavour, or for only
the output file. This is done by only using the last argument as
output file if it's a file (it has an extension), and only using the
first argument as flavour if it isn't a file (it doesn't have an
extension).
While we're at it, we make all $xlate calls the same, i.e. the $output
argument is always quoted, and we always die on error when trying to
start $xlate.
There's a perl lesson in this, regarding operator priority...
This will always succeed, even when it fails:
open FOO, "something" || die "ERR: $!";
The reason is that '||' has higher priority than list operators (a
function is essentially a list operator and gobbles up everything
following it that isn't lower priority), and since a non-empty string
is always true, so that ends up being exactly the same as:
open FOO, "something";
This, however, will fail if "something" can't be opened:
open FOO, "something" or die "ERR: $!";
The reason is that 'or' has lower priority that list operators,
i.e. it's performed after the 'open' call.
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/9884)
Since the arguments are now generated in the build file templates,
they should be removed from the build.info files.
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/9884)
CLA: trivial
Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Matthias St. Pierre <Matthias.St.Pierre@ncp-e.com>
(Merged from https://github.com/openssl/openssl/pull/9288)
Two mistakes were made:
1. AES_ASM for x86 was misplaced
2. sse2 isn't applicable for x86_64 code
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/9177)
These ciphers were already provider aware, and were available from the
default provider. We move them into the FIPS provider too.
Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/9038)
The kernel self-tests picked up an issue with CTR mode. The issue was
detected with a test vector with an IV of
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFD: after 3 increments it should wrap
around to 0.
There are two paths that increment IVs: the bulk (8 at a time) path,
and the individual path which is used when there are fewer than 8 AES
blocks to process.
In the bulk path, the IV is incremented with vadduqm: "Vector Add
Unsigned Quadword Modulo", which does 128-bit addition.
In the individual path, however, the IV is incremented with vadduwm:
"Vector Add Unsigned Word Modulo", which instead does 4 32-bit
additions. Thus the IV would instead become
FFFFFFFFFFFFFFFFFFFFFFFF00000000, throwing off the result.
Use vadduqm.
This was probably a typo originally, what with q and w being
adjacent.
CLA: trivial
Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Paul Dale <paul.dale@oracle.com>
(Merged from https://github.com/openssl/openssl/pull/8942)
These undocumented functions were never integrated into the EVP layer
and implement the AES Infinite Garble Extension (IGE) mode and AES
Bi-directional IGE mode. These modes were never formally standardised
and usage of these functions is believed to be very small. In particular
AES_bi_ige_encrypt() has a known bug. It accepts 2 AES keys, but only
one is ever used. The security implications are believed to be minimal,
but this issue was never fixed for backwards compatibility reasons.
Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/8710)
There are two copy-paste errors in handling CTR mode. When dealing
with a 2 or 3 block tail, the code branches to the CBC decryption exit
path, rather than to the CTR exit path.
This can lead to data corruption: in the Linux kernel we have a copy
of this file, and the bug leads to corruption of the IV, which leads
to data corruption when we call the encryption function again later to
encrypt subsequent blocks.
Originally reported to the Linux kernel by Ondrej Mosnáček <omosnacek@gmail.com>
CLA: trivial
Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Paul Dale <paul.dale@oracle.com>
(Merged from https://github.com/openssl/openssl/pull/8510)
The add/double shortcut in ecp_nistz256-x86_64.pl left one instruction
point that did not unwind, and the "slow" path in AES_cbc_encrypt was
not annotated correctly. For the latter, add
.cfi_{remember,restore}_state support to perlasm.
Next, fill in a bunch of functions that are missing no-op .cfi_startproc
and .cfi_endproc blocks. libunwind cannot unwind those stack frames
otherwise.
Finally, work around a bug in libunwind by not encoding rflags. (rflags
isn't a callee-saved register, so there's not much need to annotate it
anyway.)
These were found as part of ABI testing work in BoringSSL.
Reviewed-by: Richard Levitte <levitte@openssl.org>
GH: #8109
"Windows friendliness" means a) unified PIC-ification, unified across
all platforms; b) unified commantary delimiter; c) explicit ldur/stur,
as Visual Studio assembler can't automatically encode ldr/str as
ldur/stur when needed.
Reviewed-by: Paul Dale <paul.dale@oracle.com>
Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/8256)
"Windows friendliness" means a) flipping .thumb and .text directives,
b) always generate Thumb-2 code when asked(*); c) Windows-specific
references to external OPENSSL_armcap_P.
(*) so far *some* modules were compiled as .code 32 even if Thumb-2
was targeted. It works at hardware level because processor can alternate
between the modes with no overhead. But clang --target=arm-windows's
builtin assembler just refuses to compile .code 32...
Reviewed-by: Paul Dale <paul.dale@oracle.com>
Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/8252)
ARMv8.3 adds pointer authentication extension, which in this case allows
to ensure that, when offloaded to stack, return address is same at return
as at entry to the subroutine. The new instructions are nops on processors
that don't implement the extension, so that the vetification is backward
compatible.
Reviewed-by: Kurt Roeckx <kurt@roeckx.be>
Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/8205)
It was an ugly hack to avoid certain problems that are no more.
Also added GENERATE lines for perlasm scripts that didn't have that
explicitly.
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/8125)
Make it just say "the License", which refers back to the standard
boilerplate.
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/7764)
As it turns out originally published results were skewed by "turbo"
mode. VM apparently remains oblivious to dynamic frequency scaling,
and reports that processor operates at "base" frequency at all times.
While actual frequency gets increased under load.
Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/6406)
Current endianness detection is somewhat opportunistic and can fail
in cross-compile scenario. Since we are more likely to cross-compile
for little-endian now, adjust the default accordingly.
Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/5613)
Thumb2 addresses are a bit a mess, depending on whether a label is
interpreted as a function pointer value (for use with BX and BLX) or as
a program counter value (for use with PC-relative addressing). Clang's
integrated assembler mis-assembles this code. See
https://crbug.com/124610#c54 for details.
Instead, use the ADR pseudo-instruction which has clear semantics and
should be supported by every assembler that handles the OpenSSL Thumb2
code. (In other files, the ADR vs SUB conditionals are based on
__thumb2__ already. For some reason, this one is based on __APPLE__, I'm
guessing to deal with an older version of clang assembler.)
It's unclear to me which of clang or binutils is "correct" or if this is
even a well-defined notion beyond "whatever binutils does". But I will
note that https://github.com/openssl/openssl/pull/4669 suggests binutils
has also changed behavior around this before.
Reviewed-by: Andy Polyakov <appro@openssl.org>
Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/5431)
The make variables LIB_CFLAGS, DSO_CFLAGS and so on were used in
addition to CFLAGS and so on. This works without problem on Unix and
Windows, where options with different purposes (such as -D and -I) can
appear anywhere on the command line and get accumulated as they come.
This is not necessarely so on VMS. For example, macros must all be
collected and given through one /DEFINE, and the same goes for
inclusion directories (/INCLUDE).
So, to harmonize all platforms, we repurpose make variables starting
with LIB_, DSO_ and BIN_ to be all encompassing variables that
collects the corresponding values from CFLAGS, CPPFLAGS, DEFINES,
INCLUDES and so on together with possible config target values
specific for libraries DSOs and programs, and use them instead of the
general ones everywhere.
This will, for example, allow VMS to use the exact same generators for
generated files that go through cpp as all other platforms, something
that has been impossible to do safely before now.
Reviewed-by: Andy Polyakov <appro@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/5357)