This partially fixes a bug where, on x86_64, BN_mod_exp_mont_consttime
would sometimes return m, the modulus, when it should have returned
zero. Thanks to Guido Vranken for reporting it. It is only a partial fix
because the same bug also exists in the "rsaz" codepath.
The bug only affects zero outputs (with non-zero inputs), so we believe
it has no security impact on our cryptographic functions.
The fx is to delete lowercase bn_from_montgomery altogether, and have the
mont5 path use the same BN_from_montgomery ending as the non-mont5 path.
This only impacts the final step of the whole exponentiation and has no
measurable perf impact.
See the original BoringSSL commit
https://boringssl.googlesource.com/boringssl/+/13c9d5c69d04485a7a8840c12185c832026c8315
for further analysis.
Original-author: David Benjamin <davidben@google.com>
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/18510)
For example, FreeBSD prepends "FreeBSD" to version string, e.g.,
FreeBSD clang version 11.0.0 (git@github.com:llvm/llvm-project.git llvmorg-11.0.0-rc2-0-g414f32a9e86)
Target: x86_64-unknown-freebsd13.0
Thread model: posix
InstalledDir: /usr/bin
This prevented us from properly detecting AVX support, etc.
CLA: trivial
Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Paul Dale <paul.dale@oracle.com>
Reviewed-by: Ben Kaduk <kaduk@mit.edu>
(Merged from https://github.com/openssl/openssl/pull/12725)
In https://github.com/openssl/openssl/pull/10883, I'd meant to exclude
the perlasm drivers since they aren't opening pipes and do not
particularly need it, but I only noticed x86_64-xlate.pl, so
arm-xlate.pl and ppc-xlate.pl got the change.
That seems to have been fine, so be consistent and also apply the change
to x86_64-xlate.pl. Checking for errors is generally a good idea.
Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: David Benjamin <davidben@google.com>
(Merged from https://github.com/openssl/openssl/pull/10930)
If one of the perlasm xlate drivers crashes, OpenSSL's build will
currently swallow the error and silently truncate the output to however
far the driver got. This will hopefully fail to build, but better to
check such things.
Handle this by checking for errors when closing STDOUT (which is a pipe
to the xlate driver).
Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Tomas Mraz <tmraz@fedoraproject.org>
(Merged from https://github.com/openssl/openssl/pull/10883)
They now generally conform to the following argument sequence:
script.pl "$(PERLASM_SCHEME)" [ C preprocessor arguments ... ] \
$(PROCESSOR) <output file>
However, in the spirit of being able to use these scripts manually,
they also allow for no argument, or for only the flavour, or for only
the output file. This is done by only using the last argument as
output file if it's a file (it has an extension), and only using the
first argument as flavour if it isn't a file (it doesn't have an
extension).
While we're at it, we make all $xlate calls the same, i.e. the $output
argument is always quoted, and we always die on error when trying to
start $xlate.
There's a perl lesson in this, regarding operator priority...
This will always succeed, even when it fails:
open FOO, "something" || die "ERR: $!";
The reason is that '||' has higher priority than list operators (a
function is essentially a list operator and gobbles up everything
following it that isn't lower priority), and since a non-empty string
is always true, so that ends up being exactly the same as:
open FOO, "something";
This, however, will fail if "something" can't be opened:
open FOO, "something" or die "ERR: $!";
The reason is that 'or' has lower priority that list operators,
i.e. it's performed after the 'open' call.
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/9884)
The add/double shortcut in ecp_nistz256-x86_64.pl left one instruction
point that did not unwind, and the "slow" path in AES_cbc_encrypt was
not annotated correctly. For the latter, add
.cfi_{remember,restore}_state support to perlasm.
Next, fill in a bunch of functions that are missing no-op .cfi_startproc
and .cfi_endproc blocks. libunwind cannot unwind those stack frames
otherwise.
Finally, work around a bug in libunwind by not encoding rflags. (rflags
isn't a callee-saved register, so there's not much need to annotate it
anyway.)
These were found as part of ABI testing work in BoringSSL.
Reviewed-by: Richard Levitte <levitte@openssl.org>
GH: #8109
Montgomery multiplication post-conditions in some of code paths were
formally non-constant time. Cache access pattern was result-neutral,
but a little bit asymmetric, which might have produced a signal [if
processor reordered load and stores at run-time].
Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/6141)
Around 138 distinct errors found and fixed; thanks!
Reviewed-by: Kurt Roeckx <kurt@roeckx.be>
Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/3459)
I don't think this actually affects anything since the cfi_restore
directives aren't strictly needed anyway. (The old values are still in
memory so either will do.)
CLA: trivial
Reviewed-by: Rich Salz <rsalz@openssl.org>
Reviewed-by: Andy Polyakov <appro@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/2582)
- harmonize handlers with guidelines and themselves;
- fix some bugs in handlers;
- add missing handlers in chacha and ecp_nistz256 modules;
Reviewed-by: Rich Salz <rsalz@openssl.org>
The prevailing style seems to not have trailing whitespace, but a few
lines do. This is mostly in the perlasm files, but a few C files got
them after the reformat. This is the result of:
find . -name '*.pl' | xargs sed -E -i '' -e 's/( |'$'\t'')*$//'
find . -name '*.c' | xargs sed -E -i '' -e 's/( |'$'\t'')*$//'
find . -name '*.h' | xargs sed -E -i '' -e 's/( |'$'\t'')*$//'
Then bn_prime.h was excluded since this is a generated file.
Note mkerr.pl has some changes in a heredoc for some help output, but
other lines there lack trailing whitespace too.
Reviewed-by: Kurt Roeckx <kurt@openssl.org>
Reviewed-by: Matt Caswell <matt@openssl.org>
Original strategy for page-walking was adjust stack pointer and then
touch pages in order. This kind of asks for double-fault, because
if touch fails, then signal will be delivered to frame above adjusted
stack pointer. But touching pages prior adjusting stack pointer would
upset valgrind. As compromise let's adjust stack pointer in pages,
touching top of the stack. This still asks for double-fault, but at
least prevents corruption of neighbour stack if allocation is to
overstep the guard page.
Also omit predict-non-taken hints as they reportedly trigger illegal
instructions in some VM setups.
Reviewed-by: Richard Levitte <levitte@openssl.org>
Some OSes, *cough*-dows, insist on stack being "wired" to
physical memory in strictly sequential manner, i.e. if stack
allocation spans two pages, then reference to farmost one can
be punishable by SEGV. But page walking can do good even on
other OSes, because it guarantees that villain thread hits
the guard page before it can make damage to innocent one...
Reviewed-by: Rich Salz <rsalz@openssl.org>
and reorganize/harmonize post-conditions.
Additional hardening following on from CVE-2016-0702
Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Rich Salz <rsalz@openssl.org>
At the same time remove miniscule bias in final subtraction.
Performance penalty varies from platform to platform, and even with
key length. For rsa2048 sign it was observed to be 4% for Sandy
Bridge and 7% on Broadwell.
CVE-2016-0702
Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Rich Salz <rsalz@openssl.org>
bn_get_bits5 was overstepping array boundary by 1 byte. It was exclusively
read overstep and data could not have been used. The only potential problem
would be if array happens to end on the very edge of last accesible page.
Reviewed-by: Richard Levitte <levitte@openssl.org>
x86_64 platform. It targets specifically RSA1024 sign (using ideas
from http://eprint.iacr.org/2011/239) and adds more than 10% on most
platforms. Overall performance improvement relative to 1.0.0 is ~40%
in average, with best result of 54% on Westmere. Incidentally ~40%
is average improvement even for longer key lengths.