available compiler versions generated bogus machine code trying to
compile new crypto/des/cfb_enc.c. Secondly, 8th version defines
__GNUC__ macro, but fails to compile *some* inline assembler correctly.
Note that all versions of icc implement MSC-like _lrot[rl] intrinsic,
which is used now instead of offensive asm. Finally, unnecessary linker
dependencies are eliminated. Most notably dependency from libirc.a
caused trouble at application start-up, if libcrypto.so is linked with
-Bsymbolic (which it is).
locally initialising their own.
NB: I've removed the "BN_clear_free()" loops for the exit-paths in some of
these functions, and that may be a major part of the performance
improvements we're seeing. The "free" part can be removed because we're
using BN_CTX. The "clear" part OTOH can be removed because BN_CTX
destruction automatically performs this task, so performing it inside
functions that may be called repeatedly is wasteful. This is currently safe
within openssl due to the fact that BN_CTX objects are never created for
longer than a single high-level operation. However, that is only because
there's currently no mechanism in openssl for thread-local storage. Beyond
that, this might be an issue for applications using the bignum API directly
and caching their own BN_CTX objects. The solution is to introduce a flag
to BN_CTX_start() that allows its variables to be automatically sanitised
on release during BN_CTX_end(). This way any higher-level function (and
perhaps the application) can specify this flag in its own
BN_CTX_start()/BN_CTX_end() pair, and this will cause inner-loop functions
specifying the flag to be ignored so that sanitisation is handled only once
back out at the higher level. I will be implementing this in the near
future.
little TODO list in there as well as the debugging code (only enabled if
BN_CTX_DEBUG is defined).
I'd appreciate as much review and testing as can be spared for this. I'll
commit some changes to other parts of the bignum code shortly to make
better use of this implementation (no more fixed size limitations). Note
also that under identical optimisations, I'm seeing a noticable speed
increase over openssl-0.9.7 - so any feedback to confirm/deny this on other
systems would also be most welcome.
operations no longer require two distinct BN_CTX structures. This may put
more "strain" on the current BN_CTX implementation (which has a fixed limit
to the number of variables it will hold), but so far this limit is not
triggered by any of the tests pass and I will be changing BN_CTX in the
near future to avoid this problem anyway.
This also changes the default RSA implementation code to use the BN_CTX in
favour of initialising some of its variables locally in each function.
- Remove some unnecessary "+1"-like fudges. Sizes should be handled
exactly, as enlarging size parameters causes needless bloat and may just
make bugs less likely rather than fixing them: bn_expand() macro,
bn_expand_internal(), and BN_sqr().
- Deprecate bn_dup_expand() - it's new since 0.9.7, unused, and not that
useful.
- Remove unnecessary zeroing of unused bytes in bn_expand2().
- Rewrite BN_set_word() - it should be much simpler, the previous
complexities probably date from old mismatched type issues.
- Add missing bn_check_top() macros in bn_word.c
- Improve some degenerate case handling in BN_[add|sub]_word(), add
comments, and avoid a bignum expansion if an overflow isn't possible.
functions and macros.
This change has associated tags: LEVITTE_before_const and
LEVITTE_after_const. Those will be removed when this change has been
properly reviewed.