Andy Polyakov
f3f620e1e0
bn_exp.c: move check for AD*X to rsaz-avx2.pl.
...
This ensures high performance is situations when assembler supports
AVX2, but not AD*X.
2014-06-27 00:07:15 +02:00
Andy Polyakov
7eb0488280
x86_64 assembly pack: addendum to last clang commit.
2014-06-24 08:37:05 +02:00
Andy Polyakov
ac171925ab
x86_64 assembly pack: allow clang to compile AVX code.
2014-06-24 08:24:25 +02:00
Andy Polyakov
5dcf70a1c5
ARM assembly pack: get ARMv7 instruction endianness right.
...
Pointer out and suggested by: Ard Biesheuvel.
2014-06-06 21:27:18 +02:00
Andy Polyakov
bd227733b9
C64x+ assembly pack: make it work with older toolchain.
2014-05-04 16:38:32 +02:00
Andy Polyakov
f8cee9d081
bn/asm/armv4-gf2m.pl, modes/asm/ghash-armv4.pl: faster multiplication
...
algorithm suggested in following paper:
Câmara, D.; Gouvêa, C. P. L.; López, J. & Dahab, R.: Fast Software
Polynomial Multiplication on ARM Processors using the NEON Engine.
http://conradoplg.cryptoland.net/files/2010/12/mocrysen13.pdf
2014-04-24 10:24:53 +02:00
Andy Polyakov
eedab5241e
bn/asm/x86_64-mont5.pl: fix compilation error on Solaris.
2014-01-09 13:44:59 +01:00
Andy Polyakov
ec9cc70f72
bn/asm/x86_64-mont5.pl: add MULX/AD*X code path.
...
This also eliminates code duplication between x86_64-mont and x86_64-mont
and optimizes even original non-MULX code.
2013-12-09 21:02:24 +01:00
Andy Polyakov
d1671f4f1a
bn/asm/armv4-mont.pl: add NEON code path.
2013-12-04 22:37:49 +01:00
Andy Polyakov
c5d5f5bd0f
bn/asm/x86_64-mont5.pl: comply with Win64 ABI.
...
PR: 3189
Submitted by: Oscar Ciurana
2013-12-03 23:59:55 +01:00
Andy Polyakov
8bd7ca9996
crypto/bn/asm/rsaz-x86_64.pl: make it work on Win64.
2013-12-03 22:28:48 +01:00
Andy Polyakov
31ed9a2131
crypto/bn/rsaz*: fix licensing note.
...
rsaz_exp.c: harmonize line terminating;
asm/rsaz-*.pl: minor optimizations.
2013-12-03 22:08:29 +01:00
Andy Polyakov
6efef384c6
bn/asm/rsaz-x86_64.pl: fix prototype.
2013-12-03 09:43:06 +01:00
Andy Polyakov
b9e87d07cb
ppc64-mont.pl: eliminate dependency on GPRs' upper halves.
2013-11-27 22:50:00 +01:00
Andy Polyakov
4eeb750d20
bn/asm/x86_64-mont.pl: minor optimization [for Decoded ICache].
2013-10-25 10:14:20 +02:00
Andy Polyakov
d6019e1654
PPC assembly pack: add .size directives.
2013-10-15 00:14:39 +02:00
Andy Polyakov
30b9c2348d
bn/asm/*x86_64*.pl: correct assembler requirement for ad*x.
2013-10-14 22:41:00 +02:00
Andy Polyakov
0c2adb0a9b
MIPS assembly pack: get rid of deprecated instructions.
...
Latest MIPS ISA specification declared 'branch likely' instructions
obsolete. To makes code future-proof replace them with equivalent.
2013-10-13 13:14:52 +02:00
Andy Polyakov
fa104be35e
bn/asm/rsax-avx2.pl: minor optimization [for Decoded ICache].
2013-10-10 23:06:43 +02:00
Andy Polyakov
a5bb5bca52
bn/asm/x86_64-mont*.pl: add MULX/ADCX/ADOX code path.
2013-10-03 00:45:04 +02:00
Andy Polyakov
87954638a6
rsaz-x86_64.pl: add MULX/ADCX/ADOX code path.
2013-10-03 00:30:12 +02:00
Andy Polyakov
72a158703b
crypto/bn/asm/x86_64-mont.pl: minor optimization.
2013-09-09 21:40:33 +02:00
Veres Lajos
478b50cf67
misspellings fixes by https://github.com/vlajos/misspell_fixer
2013-09-05 21:39:42 +01:00
Andy Polyakov
fd8ad019e1
crypto/bn/asm/rsax-x86_64.pl: make it work on Darwin.
2013-08-03 16:28:50 +02:00
Andy Polyakov
5c57c69f9e
bn/asm/rsaz-avx2.pl: Windows-specific fix.
2013-07-12 18:59:17 +02:00
Ben Laurie
852f837f5e
s/rsaz_eligible/rsaz_avx2_eligible/.
2013-07-12 12:47:39 +01:00
Andy Polyakov
0b4bb91db6
Add RSAZ assembly modules.
...
RT: 2582, 2850
2013-07-05 21:30:18 +02:00
Andy Polyakov
26e43b48a3
bn/asm/x86_86-mont.pl: optimize reduction for Intel Core family.
2013-07-05 21:10:56 +02:00
Andy Polyakov
4ddacd9921
Optimize SPARC T4 MONTMUL support.
...
Improve RSA sing performance by 20-30% by:
- switching from floating-point to integer conditional moves;
- daisy-chaining sqr-sqr-sqr-sqr-sqr-mul sequences;
- using MONTMUL even during powers table setup;
2013-06-18 10:39:38 +02:00
Andy Polyakov
02450ec69d
PA-RISC assembler pack: switch to bve in 64-bit builds.
...
PR: 3074
2013-06-18 10:37:00 +02:00
Adam Langley
7753a3a684
Add volatile qualifications to two blocks of inline asm to stop GCC from
...
eliminating them as dead code.
Both volatile and "memory" are used because of some concern that the compiler
may still cache values across the asm block without it, and because this was
such a painful debugging session that I wanted to ensure that it's never
repeated.
2013-06-04 18:46:25 +01:00
Andy Polyakov
342dbbbe4e
x86_64-gf2m.pl: fix typo.
2013-03-01 22:36:36 +01:00
Andy Polyakov
7c43601d44
x86_64-gf2m.pl: add missing Windows build fix for #2963 .
...
PR: 3004
2013-03-01 21:43:10 +01:00
Andy Polyakov
4568182a8b
x86_64 assembly pack: keep making Windows build more robust.
...
PR: 2963 and a number of others
2013-02-02 19:54:59 +01:00
Andy Polyakov
46bf83f07a
x86_64 assembly pack: make Windows build more robust.
...
PR: 2963 and a number of others
2013-01-22 22:27:28 +01:00
Andy Polyakov
543fd85460
bn/asm/mips.pl: hardwire local call to bn_div_words.
2013-01-22 21:13:37 +01:00
Andy Polyakov
904732f68b
C64x+ assembly pack: improve EABI support.
2012-11-28 13:19:10 +00:00
Andy Polyakov
9f6b0635ad
x86_64-gcc.c: resore early clobber constraint.
...
Submitted by: Florian Weimer
2012-11-19 15:02:00 +00:00
Andy Polyakov
68c06bf6b2
Support for SPARC T4 MONT[MUL|SQR] instructions.
...
Submitted by: David Miller, Andy Polyakov
2012-11-17 10:34:11 +00:00
Andy Polyakov
1efd583085
SPARCv9 assembly pack: harmonize ABI handling (so that it's handled in one
...
place at a time, by pre-processor in .S case and perl - in .s).
2012-10-25 12:07:32 +00:00
Andy Polyakov
0c832ec5c6
Add VIS3-capable sparcv9-gf2m module.
2012-10-20 15:59:14 +00:00
Andy Polyakov
947d78275b
Add VIS3 Montgomery multiplication.
2012-10-20 09:13:21 +00:00
Andy Polyakov
be0d31b166
Add linux-x32 target.
2012-08-29 14:08:46 +00:00
Andy Polyakov
1a002d88ad
MIPS assembly pack: assign default value to $flavour.
2012-08-17 09:10:31 +00:00
Andy Polyakov
6251989eb6
x86_64 assembly pack: make it possible to compile with Perl located on
...
path with spaces.
PR: 2835
2012-06-27 10:08:23 +00:00
Andy Polyakov
3e181369dd
C64x+ assembler pack. linux-c64xplus build is *not* tested nor can it be
...
tested, because kernel is not in shape to handle it *yet*. The code is
committed mostly to stimulate the kernel development.
2012-04-18 13:01:36 +00:00
Andy Polyakov
8c98b2591f
modexp512-x86_64.pl: Solaris protability fix.
...
PR: 2656
2011-12-12 15:10:14 +00:00
Andy Polyakov
5711dd8eac
x86-mont.pl: fix bug in integer-only squaring path.
...
PR: 2648
2011-12-09 14:21:25 +00:00
Andy Polyakov
6600126825
bn/asm/mips.pl: fix typos.
2011-12-01 12:16:09 +00:00
Andy Polyakov
29fd6746f5
armv4cpuid.S, armv4-gf2m.pl: make newest code compilable by older assembler.
2011-11-05 13:07:18 +00:00
Andy Polyakov
09f40a3cb9
ppc.pl: fix bug in bn_mul_comba4.
...
PR: 2636
Submitted by: Charles Bryant
2011-11-05 10:16:04 +00:00
Andy Polyakov
a9cf0b81fa
Remove superseded MIPS assembler modules.
2011-10-19 21:42:21 +00:00
Andy Polyakov
8329e2e776
bn_exp.c: further optimizations using more ideas from
...
http://eprint.iacr.org/2011/239 .
2011-10-17 17:41:49 +00:00
Andy Polyakov
3f66f2040a
x86_64-mont.pl: minor optimization.
2011-10-17 17:39:59 +00:00
Bodo Möller
cdfe0fdde6
Fix OPENSSL_BN_ASM_MONT5 for corner cases; add a test.
...
Submitted by: Emilia Kasper
2011-10-13 12:35:10 +00:00
Andy Polyakov
6c01cbb6a0
modexp512-x86_64.pl: make it work with ml64.
2011-08-19 06:30:32 +00:00
Andy Polyakov
e7d1363d12
x86_64-mont5.pl: add missing Win64 support.
2011-08-14 09:06:06 +00:00
Andy Polyakov
10bd69bf4f
armv4-mont.pl: profiler-assisted optimization gives 8%-14% improvement
...
(more for longer keys) on RSA/DSA.
2011-08-13 12:38:41 +00:00
Andy Polyakov
ae8b47f07f
SPARC assembler pack: fix FIPS linking errors.
2011-08-12 21:38:19 +00:00
Andy Polyakov
361512da0d
This commit completes recent modular exponentiation optimizations on
...
x86_64 platform. It targets specifically RSA1024 sign (using ideas
from http://eprint.iacr.org/2011/239 ) and adds more than 10% on most
platforms. Overall performance improvement relative to 1.0.0 is ~40%
in average, with best result of 54% on Westmere. Incidentally ~40%
is average improvement even for longer key lengths.
2011-08-12 16:44:32 +00:00
Andy Polyakov
20735f4c81
alphacpuid.pl: fix alignment bug.
...
alpha-mont.pl: fix typo.
PR: 2577
2011-08-12 12:28:52 +00:00
Andy Polyakov
85ec54a417
x86_64-mont.pl: futher optimization resulting in up to 48% improvement
...
(4096-bit RSA sign benchmark on Core2) in comparison to initial version
from 2005.
2011-08-09 13:05:05 +00:00
Andy Polyakov
be9a8cc2af
Add RSAX builtin engine. It optimizes RSA1024 sign benchmark.
2011-07-20 21:49:46 +00:00
Andy Polyakov
87873f4328
ARM assembler pack: add platform run-time detection.
2011-07-17 17:40:29 +00:00
Andy Polyakov
6179f06077
x86_64-mont.pl: add squaring procedure and improve RSA sign performance
...
by up to 38% (4096-bit benchmark on Core2).
2011-07-05 09:21:03 +00:00
Andy Polyakov
02a73e2bed
s390x-gf2m.pl: commentary update (final performance numbers turned to be
...
higher).
2011-07-04 11:20:33 +00:00
Andy Polyakov
0c237e42a4
s390x assembler pack: add s390x-gf2m.pl and harmonize AES_xts_[en|de]crypt.
2011-06-27 10:00:31 +00:00
Andy Polyakov
6715034002
PPC assembler pack: adhere closer to ABI specs, add PowerOpen traceback data.
2011-05-27 13:32:34 +00:00
Andy Polyakov
96abea332c
x86_64-gf2m.pl: add Win64 SEH.
2011-05-22 18:29:11 +00:00
Andy Polyakov
2b9a8ca15b
x86gas.pl: add palignr and move pclmulqdq.
2011-05-16 18:07:00 +00:00
Andy Polyakov
afebe623c5
x86_64 assembler pack: add x86_64-gf2m module.
2011-05-16 17:46:45 +00:00
Andy Polyakov
56c5f703c1
IA-64 assembler pack: fix typos and make it work on HP-UX.
2011-05-07 20:36:05 +00:00
Andy Polyakov
58cc21fdea
x86 assembler pack: add bn_GF2m_mul_2x2 implementations (see x86-gf2m.pl for
...
details and performance data).
2011-05-07 10:31:06 +00:00
Andy Polyakov
925596f85b
ARM assembler pack: engage newly introduced armv4-gf2m module.
2011-05-05 21:57:11 +00:00
Andy Polyakov
75359644d0
ARM assembler pack. Add bn_GF2m_mul_2x2 implementation (see source code
...
for details and performance data).
2011-05-05 07:21:17 +00:00
Andy Polyakov
a000759a5c
ia64-mont.pl: optimize short-key performance.
2011-03-04 13:27:29 +00:00
Andy Polyakov
0ab8fd58e1
s390x assembler pack: tune-up and support for new z196 hardware.
2011-03-04 13:09:16 +00:00
Dr. Stephen Henson
949c6f8ccf
Stop warnings.
2011-02-23 16:06:33 +00:00
Andy Polyakov
e822c756b6
s390x assembler pack: adapt for -m31 build, see commentary in Configure
...
for more details.
2010-11-29 20:52:43 +00:00
Andy Polyakov
dd128715a2
s390x.S: fix typo in bn_mul_words.
...
PR: 2380
2010-11-22 21:55:07 +00:00
Andy Polyakov
d466588788
MIPS assembler pack: enable it in Configure, add SHA2 module, fix make rules,
...
update commentary...
2010-10-02 11:47:17 +00:00
Andy Polyakov
da4d239dad
Add unified mips.pl, which will replace mips3.s.
2010-09-27 21:19:43 +00:00
Andy Polyakov
0985473636
sha1-mips.pl, mips-mont.pl: unify MIPS assembler modules in respect to
...
ABI and binutils.
2010-09-22 08:43:09 +00:00
Andy Polyakov
f8927c89d0
Alpha assembler pack: adapt for Linux.
...
PR: 2335
2010-09-13 13:28:52 +00:00
Andy Polyakov
dd4a0af370
crypto/bn/asm/s390x.S: drop redundant instructions.
2010-09-10 14:53:36 +00:00
Andy Polyakov
1cbdca7bf2
Harmonize s390x assembler modules with "catch-all" rules from commit#19749.
2010-07-09 12:11:12 +00:00
Andy Polyakov
e216cd6ee9
armv4-mont.pl: addenum to previous commit#19749.
2010-07-08 15:06:01 +00:00
Andy Polyakov
3efe51a407
Revert previous Linux-specific/centric commit#19629. If it really has to
...
be done, it's definitely not the way to do it. So far answer to the
question was to ./config -Wa,--noexecstack (adopted by RedHat).
2010-05-05 22:05:39 +00:00
Ben Laurie
0e3ef596e5
Non-executable stack in asm.
2010-05-05 15:50:13 +00:00
Andy Polyakov
d23f4e9d5a
alpha-mont.pl: comply with stack alignment requirements.
2010-04-10 13:33:04 +00:00
Andy Polyakov
97a6a01f0f
ARMv4 assembler: fix compilation failure. Fix is actually unconfirmed, but
...
I can't think of any other cause for failure
2010-03-29 09:55:19 +00:00
Andy Polyakov
964ed94649
parisc-mont.pl: PA-RISC 2.0 code path optimization based on intruction-
...
level profiling data resulted in almost 50% performance improvement.
PA-RISC 1.1 is also reordered in same manner, mostly to be consistent,
as no gain was observed, not on PA-7100LC.
2010-01-25 23:12:00 +00:00
Andy Polyakov
4407700c40
ia64-mont.pl: add shorter vector support ("shorter" refers to 512 bits and
...
less).
2010-01-17 11:33:59 +00:00
Andy Polyakov
74f2260694
ia64-mont.pl: addp4 is not needed when referring to stack (this is 32-bit
...
HP-UX thing).
2010-01-07 15:36:59 +00:00
Andy Polyakov
1f23001d07
ppc64-mont.pl: commentary update.
2010-01-06 10:58:59 +00:00
Andy Polyakov
dacdcf3c15
Add Montgomery multiplication module for IA-64.
2010-01-06 10:57:55 +00:00
Andy Polyakov
70b76d392f
ppccap.c: fix compiler warning and perform sanity check outside signal masking.
...
ppc64-mont.pl: clarify comment and fix spelling.
2009-12-29 11:18:16 +00:00
Andy Polyakov
3fc2efd241
PA-RISC assembler: missing symbol and typos.
2009-12-28 16:13:35 +00:00
Andy Polyakov
cb3b9b1323
Throw in more PA-RISC assembler.
2009-12-27 20:49:40 +00:00
Andy Polyakov
d741cf2267
ppccap.c: tidy up.
...
ppc64-mont.pl: missing predicate in commentary.
2009-12-27 11:25:24 +00:00
Andy Polyakov
b4b48a107c
ppc64-mont.pl: adapt for 32-bit and engage for all builds.
2009-12-26 21:30:13 +00:00
Andy Polyakov
0f529cbdc3
s390x-mont.pl: optimize prologue.
2009-02-10 08:46:48 +00:00
Andy Polyakov
8626230a02
s390x assembler pack update.
2009-02-09 15:42:04 +00:00
Dr. Stephen Henson
41b7619596
Fix missing prototype warnings then fix different prototype warnings ;-)
2009-01-11 16:17:26 +00:00
Andy Polyakov
be01f79d3d
x86_64 assembler pack: add support for Win64 SEH.
2008-12-19 11:17:29 +00:00
Andy Polyakov
93c4ba07d7
x86_64-xlate.pl update, engage x86_64 assembler in mingw64.
2008-11-14 16:40:37 +00:00
Andy Polyakov
aa8f38e49b
x86_64 assembler pack to comply with updated styling x86_64-xlate.pl rules.
2008-11-12 08:15:52 +00:00
Geoff Thorpe
6343829a39
Revert the size_t modifications from HEAD that had led to more
...
knock-on work than expected - they've been extracted into a patch
series that can be completed elsewhere, or in a different branch,
before merging back to HEAD.
2008-11-12 03:58:08 +00:00
Dr. Stephen Henson
9619b730b4
Use stddef.h to pick up size_t def.
2008-11-02 16:56:13 +00:00
Dr. Stephen Henson
c76fd290be
Fix warnings about mismatched prototypes, undefined size_t and value computed
...
not used.
2008-11-02 12:50:48 +00:00
Ben Laurie
4d6e1e4f29
size_tification.
2008-11-01 14:37:00 +00:00
Andy Polyakov
492279f6f3
AIX build updates.
2008-09-12 14:45:54 +00:00
Andy Polyakov
61b05a0025
Make x86_64-mont.pl work with debug Win64 build.
2008-02-27 20:09:28 +00:00
Andy Polyakov
089458b096
ppc64-mont optimization.
2008-02-05 13:10:14 +00:00
Andy Polyakov
addd641f3a
Unify ppc assembler make rules.
2008-01-13 22:01:30 +00:00
Dr. Stephen Henson
4d1f3f7a6c
Update perl asm scripts include paths for perlasm.
2008-01-05 22:28:38 +00:00
Andy Polyakov
c8ec4a1b0b
Final (for this commit series) optimized version and with commentary section.
2007-12-29 20:30:09 +00:00
Andy Polyakov
699e1a3a82
This is also informational commit exposing loop modulo scheduling "factor."
2007-12-29 20:28:01 +00:00
Andy Polyakov
64214a2183
New Montgomery multiplication module, ppc64-mont.pl. Reference, non-optimized
...
implementation. This is essentially informational commit.
2007-12-29 20:26:46 +00:00
Andy Polyakov
7722e53f12
Yet another ARM update. It appears to be more appropriate to make
...
developers responsible for -march choice.
2007-09-27 16:27:03 +00:00
Andy Polyakov
673c55a2fe
Latest bn_mont.c modification broke ECDSA test. I've got math wrong, which
...
is fixed now.
2007-06-29 13:10:19 +00:00
Andy Polyakov
5b89f78a89
Typo in x86_64-mont.pl.
...
PR: 1549
2007-06-21 11:38:52 +00:00
Andy Polyakov
1c7f8707fd
bn_asm for s390x.
2007-06-20 14:10:16 +00:00
Andy Polyakov
2329694222
SPARC Solaris and Linux assemblers treat .align directive differently.
...
PR: 1547
2007-06-20 12:24:22 +00:00
Andy Polyakov
7d9cf7c0bb
Eliminate conditional final subtraction in Montgomery assembler modules.
2007-06-17 17:10:03 +00:00
Andy Polyakov
a2a54ffc5f
s390x assembler pack.
2007-04-30 08:42:54 +00:00
Andy Polyakov
8b71d35458
nasm fixes.
2007-03-20 08:55:58 +00:00
Andy Polyakov
760e353528
sparcv9a-mont was modified to handle 32-bit aligned input, but check
...
for 64-bit alignment was not removed.
2007-03-20 08:54:51 +00:00
Andy Polyakov
64aecc6720
Make armv4t-mont module backward binary compatible with armv4 and rename it
...
accordingly.
2007-01-17 20:12:41 +00:00
Andy Polyakov
43b8fe1cd0
Montgomery multiplication for ARMv4.
2007-01-11 21:43:25 +00:00
Andy Polyakov
8876e58f34
Montgomery multiplication for MIPS III/IV. Not engaged.
2006-12-29 11:09:33 +00:00
Andy Polyakov
7321a84d4c
Minor clean-up in crypto/bn/asm.
2006-12-29 11:05:20 +00:00
Andy Polyakov
4cfe3df1f5
Minor performance improvements to x86-mont.pl.
2006-12-28 12:43:16 +00:00
Andy Polyakov
8f2d60ec26
Fix for "strange errors" exposed by ccgost engine. The fix is
...
two extra insructions in sqradd loop at line #503 .
2006-12-27 10:59:51 +00:00
Andy Polyakov
1702c8c4bf
x86-mont.pl sse2 tune-up and integer-only squaring procedure.
2006-12-22 15:28:07 +00:00
Andy Polyakov
87d3af6475
Eliminate 64-bit alignment limitation in sparcv9a-mont.
2006-12-08 15:18:41 +00:00
Andy Polyakov
98939a05b6
alpha-mont.pl: gcc portability fix and make-rule.
2006-12-08 14:18:58 +00:00
Andy Polyakov
d28134b8f3
Minor, +10%, tune-up for x86_64-mont.pl.
2006-12-08 10:13:51 +00:00
Andy Polyakov
8583eba015
Montgomery multiplication routine for Alpha.
2006-12-08 10:12:56 +00:00
Andy Polyakov
73b979e601
Clarify HAL SPARC64 support situation in sparcv9a-mont.pl.
2006-11-28 11:07:36 +00:00
Andy Polyakov
ebae8092cb
Minor optimizations based on intruction level profiler feedback.
2006-11-28 10:34:51 +00:00
Andy Polyakov
2e21922eb6
Modulo-schedule loops in sparcv9a-mont.pl. Overall improvement factor
...
over 0.9.8 is up to 3x on USI&II cores and up to 80% - on USIII&IV.
2006-11-28 07:24:26 +00:00
Andy Polyakov
1c3d2b94be
This is "informational" commit. Its mere purpose is to expose "modulo
...
factor" in inner loops.
2006-11-28 07:20:36 +00:00
Andy Polyakov
48d2335d73
Non-SSE2 path to bn_mul_mont. But it's disabled, because it currently
...
doesn't give performance improvement.
2006-11-27 14:59:35 +00:00
Andy Polyakov
31439046e0
bn/asm/ppc.pl to use ppc-xlate.pl.
2006-10-17 14:37:07 +00:00
Andy Polyakov
cecfdbf72d
VIA-specific Montgomery multiplication routine.
2006-10-17 07:04:48 +00:00
Andy Polyakov
8ea975d070
+20% tune-up for Power5.
2006-08-09 15:40:30 +00:00
Andy Polyakov
c8a0d0aaf9
Engage assembler in solaris64-x86_64-cc.
2006-07-31 22:28:40 +00:00
Andy Polyakov
67d990904e
Futher minor PPC assembler update.
2006-05-04 21:30:41 +00:00
Andy Polyakov
c09a0318b7
Minor PPC assembler updates.
2006-05-03 14:07:34 +00:00
Andy Polyakov
2c5d4daac5
Yet another "teaser" Montgomery multiplication module, for PowerPC.
2006-04-30 21:15:29 +00:00
Andy Polyakov
7a5dbeb782
Minor sparcv9 clean-ups.
2005-12-27 21:27:39 +00:00
Andy Polyakov
3b4a0225e2
As SPARCV9 CPU flavor is [expected to be] detected at run-time, we can
...
afford to relax SPARCV9/8+ compiler command line and produce "unversal"
binaries as we used to.
2005-12-19 09:10:06 +00:00
Andy Polyakov
a00e414faf
Unify sparcv9 assembler naming and build rules among 32- and 64-bit builds.
...
Engage run-time switch between bn_mul_mont_fpu and bn_mul_mont_int.
2005-12-16 17:39:57 +00:00
Andy Polyakov
68ea60683a
Add IALU-only bn_mul_mont for SPARCv9. See commentary section for details.
2005-12-15 22:43:33 +00:00
Andy Polyakov
6df8c74d5b
Switch 64-bit sparcv9 platforms from bn(64,64) to bn(64,32). This doesn't
...
have impact on performance, because amount of multiplications does not
increase with this switch, not on sparcv9 that is. On the contrary, it
actually improves performance, because it spares a load of instructions
used to chase carries. Not to mention that BN assembler modules can be
shared more freely between 32- and 64-bit builts.
2005-12-15 22:40:58 +00:00
Andy Polyakov
07645deeb8
Apply "better safe than sorry" approach after addressing sporadic SEGV in
...
bn_sub_words to the rest of the sparcv8plus.S.
2005-11-15 08:02:10 +00:00
Andy Polyakov
c52c82ffc1
Attempt to resolve sporadic SEGV crashes in bn_sub_words in OpenSSH. I'm
...
baffled why it crashes and does it sporadically...
2005-11-11 20:07:07 +00:00
Andy Polyakov
a4d729f31d
Clarify binary compatibility with HAL/Fujitsu SPARC64 family.
2005-10-25 15:39:47 +00:00
Andy Polyakov
aa2be094ae
Add support for 32-bit ABI to sparcv9a-mont.pl module.
2005-10-22 18:16:09 +00:00
Andy Polyakov
4d524040bc
Change bn_mul_mont declaration and BN_MONT_CTX. Update CHANGES.
2005-10-22 17:57:18 +00:00
Andy Polyakov
bcb43bb358
Yet another "teaser" Montgomery multiply module, for UltraSPARC. It's not
...
integrated yet, but it's tested and benchmarked [see commentary section
for further details].
2005-10-19 07:12:06 +00:00
Andy Polyakov
34736de4c0
Flip saved argument block and tp [required for non-SSE2 path].
2005-10-14 16:05:21 +00:00
Andy Polyakov
5f50d597f2
Make sure x86-mont.pl returns zero even if compiled with no-sse2.
2005-10-14 15:24:06 +00:00
Andy Polyakov
35593b33f4
Add timestamp to x86-mont.pl.
2005-10-09 10:26:56 +00:00
Andy Polyakov
54f3d200d3
Throw in bn/asm/x86-mont.pl Montgomery multiplication "teaser".
2005-10-09 09:53:58 +00:00
Andy Polyakov
7a2f4cbfe8
x86_64-mont.pl readability improvement.
2005-10-07 15:18:16 +00:00
Andy Polyakov
5ac7bde7c9
Throw in Montgomery multiplication assembler for x86_64.
2005-10-07 14:18:06 +00:00
Andy Polyakov
6f9afa68cd
IA-32 BN tune-up. Performance imrpovement varies with platform and
...
keylength, this time larger improvement for shorter keys, and reaches
15%. Both SSE2 and IALU code pathes are improved.
2005-09-20 12:26:54 +00:00
Andy Polyakov
7cfe2a5e65
Fix Intel assembler warnings.
2005-08-10 08:28:36 +00:00
Andy Polyakov
ef428d5681
Fix unwind directives in IA-64 assembler modules. This helps symbolic
...
debugging and doesn't affect functionality.
Submitted by: David Mosberger
Obtained from: http://www.hpl.hp.com/research/linux/crypto/
2005-07-18 09:54:14 +00:00
Andy Polyakov
31efffbdba
Trap condition should be 64-bit when it's due.
2005-07-03 09:17:50 +00:00
Andy Polyakov
aaa5dc614f
More elegant solution to "sparse decimal printout on PPC" problem.
2005-07-02 08:58:55 +00:00
Andy Polyakov
8be97c01d1
Decimal printout of a BN is wrong on PPC, it's sparse with very few
...
significant digits. As soon it verifies elsewhere it goes to 0.9.8 and
0.9.7.
2005-07-01 17:49:47 +00:00
Richard Levitte
4bb61becbb
Add emacs cache files to .cvsignore.
2005-04-11 14:17:07 +00:00
Andy Polyakov
60fd574cdf
Make bn/asm/x86_64-gcc.c gcc4 savvy. +r is likely to be initially
...
introduced for a reason [like bug in initial gcc port], but proposed
=&r is treated correctly by senior 3.2, so we can assume it's safe now.
PR: 1031
2005-04-03 18:53:29 +00:00
Andy Polyakov
da30c74a27
Remove unused assembler modules.
2005-02-06 13:43:02 +00:00
Andy Polyakov
2b247cf81f
OPENSSL_ia32cap final touches. Note that OPENSSL_ia32cap is no longer a
...
symbol, but a macro expanded as (*(OPENSSL_ia32cap_loc())). The latter
is the only one to be exported to application.
2004-08-29 16:36:05 +00:00
Andy Polyakov
859ceeeb51
Anchor AES and SHA-256/-512 assembler from C.
2004-07-18 17:26:01 +00:00
Andy Polyakov
e2f2a9af2c
New scalable bn_mul_add_words loop, which provides up to >20% overall
...
performance improvement. Make module more gcc friendly and clarify
copyright issues for division routine.
2004-07-01 11:10:38 +00:00
Andy Polyakov
1809e858bb
Eliminate compiler warnings and throw in performance table.
2004-05-28 10:15:58 +00:00
Andy Polyakov
d3adc3d3ed
SSE2 accelerated bn_mul_add_words. Code is currently disabled till proper
...
config and run-time support is added.
PR: 788
Submitted by: <dean@arctic.org>
Reviewed by: <appro>
Obtained from: http://arctic.org/~dean/crypto/rsa.html
2004-05-06 10:36:49 +00:00
Andy Polyakov
dd55880644
Improved PowerPC support. Proper ./config support for ppc targets,
...
especially for AIX. But most important BIGNUM assembler implementation
submitted by IBM.
Submitted by: Peter Waltenberg <pwalten@au1.ibm.com>
Reviewed by: appro
2004-04-27 22:05:50 +00:00
Andy Polyakov
1751034669
Typo in crypto/bn/asm/x86_64.c, bn_div_words().
...
PR: 821
2004-02-07 09:51:28 +00:00
Andy Polyakov
722d17cbac
This is an *initial* tune-up. This update puts Itanium2 back on par with
...
Itanium. I mean if overall performance improvement over C version was X
for Itanium, it's X even for Itanium2.
2003-01-19 21:29:59 +00:00
Richard Levitte
2f09524501
A few more files to ignore
2003-01-16 21:32:56 +00:00
Andy Polyakov
46a0d4fbcb
Support for ILP32 on HPUX-IA64.
2003-01-03 15:10:46 +00:00
Andy Polyakov
04945fda66
pa-risc2.s was not PIC, see RT#426. I strip call to fprintf as it's
...
never called anyway (it's a debugging assertion). If pa-risc2W.s is
PIC remains to be seen...
2003-01-03 10:52:40 +00:00
Richard Levitte
e9883d285d
Finally, a bn_div_words() in VAX assembler that goes through all tests.
...
PR: 413
2002-12-23 11:25:51 +00:00
Richard Levitte
9b58214e4a
More accurate comments.
2002-12-20 16:38:36 +00:00
Andy Polyakov
2f98abbcb6
x86_64 performance patch.
2002-12-14 20:42:05 +00:00
Richard Levitte
6ab285bf4c
I think I got it now. Apparently, the case of having to shift down
...
the divisor was a bit more complex than I first saw. The lost bit
can't just be discarded, as there are cases where it is important.
For example, look at dividing 320000 with 80000 vs. 80001 (all
decimals), the difference is crucial. The trick here is to check if
that lost bit was 1, and in that case, do the following:
1. subtract the quotient from the remainder
2. as long as the remainder is negative, add the divisor (the whole
divisor, not the shofted down copy) to it, and decrease the
quotient by one.
There's probably a nice mathematical proof for this already, but I
won't bother with that, unless someone requests it from me.
2002-12-02 21:31:45 +00:00
Richard Levitte
1d3159bcca
Make some names consistent.
2002-12-02 02:40:27 +00:00
Richard Levitte
f60ceb54eb
Through some experimentation and thinking, I think I finally got the
...
proper implementation of bn_div_words() for VAX.
If the tests go through well, the next step will be to test on Alpha.
2002-12-02 02:28:27 +00:00
Richard Levitte
0f995b2f40
Small bugfix: even when r == d, we need to adjust r and q.
...
PR: 366
2002-12-01 02:17:23 +00:00
Richard Levitte
a678430602
Redo the VAX assembler version of bn_div_words().
...
PR: 366
2002-12-01 00:49:36 +00:00
Andy Polyakov
622d3d3592
Support for Intel and HP-UXi assemblers.
2001-07-30 15:54:13 +00:00
Andy Polyakov
19a6e8b32c
This fixes "Spurious test failures on IRIX?" reported in April. Apparently
...
I was wrong in conclusions about when addition starts overflowing in combaX
routines.
2001-06-22 19:17:42 +00:00
Andy Polyakov
52c0d30078
Get rid of "possible WAW dependency" warnings.
...
Submitted by:
Reviewed by:
PR:
2001-06-11 12:47:52 +00:00
Andy Polyakov
a95541d61e
Get rid of RAW dependency warnings.
...
Submitted by:
Reviewed by:
PR:
2001-05-30 22:01:33 +00:00