Andy Polyakov
d3cdab1736
modes/asm/ghash-x86_64.pl: refine GNU assembler version detection.
...
Even though AVX support was added in GAS 2.19 vpclmulqdq was apparently
added in 2.20.
Reviewed-by: Rich Salz <rsalz@openssl.org>
2016-02-27 21:14:18 +01:00
Kurt Roeckx
df057ea6c8
Restore xmm7 from the correct address on win64
...
Reviewed-by: Richard Levitte <levitte@openssl.org>
RT: #4288 , MR: #1831
2016-02-04 15:42:13 +01:00
Andy Polyakov
b974943234
x86_64 assembly pack: tune clang version detection even further.
...
RT#4171
Reviewed-by: Kurt Roeckx <kurt@openssl.org>
2015-12-13 22:18:18 +01:00
Andy Polyakov
a285992763
ARMv4 assembly pack: allow Thumb2 even in iOS build,
...
and engage it in most modules.
Reviewed-by: Tim Hudson <tjh@openssl.org>
2015-12-07 12:06:06 +01:00
Andy Polyakov
76eba0d94b
x86_64 assembly pack: tune clang version detection.
...
RT#4142
Reviewed-by: Richard Levitte <levitte@openssl.org>
2015-11-23 16:00:06 +01:00
Andy Polyakov
fbab8badde
modes/asm/ghash-armv4.pl: extend Apple fix to all clang cases.
...
Triggered by RT#3989.
Reviewed-by: Matt Caswell <matt@openssl.org>
2015-11-11 22:09:18 +01:00
Andy Polyakov
b7f5503fa6
Skylake performance results.
...
Reviewed-by: Matt Caswell <matt@openssl.org>
2015-09-26 19:50:11 +02:00
Andy Polyakov
11208dcfb9
ARMv4 assembly pack: implement support for Thumb2.
...
As some of ARM processors, more specifically Cortex-Mx series, are
Thumb2-only, we need to support Thumb2-only builds even in assembly.
Reviewed-by: Tim Hudson <tjh@openssl.org>
2015-09-25 13:34:02 +02:00
Richard Levitte
053fa39af6
Conversion to UTF-8 where needed
...
This leaves behind files with names ending with '.iso-8859-1'. These
should be safe to remove. If something went wrong when re-encoding,
there will be some files with names ending with '.utf8' left behind.
Reviewed-by: Rich Salz <rsalz@openssl.org>
2015-07-14 01:10:01 +02:00
Andy Polyakov
9b6b470afe
modes/asm/ghashv8-armx.pl: additional performance data.
...
Reviewed-by: Rich Salz <rsalz@openssl.org>
2015-04-21 09:17:53 +02:00
Andy Polyakov
313e6ec11f
Add assembly support for 32-bit iOS.
...
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Richard Levitte <levitte@openssl.org>
2015-04-20 15:06:22 +02:00
Andy Polyakov
7eeeb49e11
modes/asm/ghashv8-armx.pl: up to 90% performance improvement.
...
Reviewed-by: Matt Caswell <matt@openssl.org>
2015-04-02 10:03:09 +02:00
Andy Polyakov
e390ae50e0
ARMv4 assembly pack: add Cortex-A15 performance data.
...
Reviewed-by: Tim Hudson <tjh@openssl.org>
2015-03-08 14:09:32 +01:00
Andy Polyakov
9b05cbc33e
Add assembly support to ios64-cross.
...
Fix typos in ios64-cross config line.
Reviewed-by: Tim Hudson <tjh@openssl.org>
2015-01-23 15:38:41 +01:00
Andy Polyakov
b3d7294976
Add Broadwell performance results.
...
Reviewed-by: Emilia Käsper <emilia@openssl.org>
2015-01-13 21:40:14 +01:00
Andy Polyakov
c1669e1c20
Remove inconsistency in ARM support.
...
This facilitates "universal" builds, ones that target multiple
architectures, e.g. ARMv5 through ARMv7. See commentary in
Configure for details.
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Matt Caswell <matt@openssl.org>
2015-01-04 23:45:08 +01:00
Andy Polyakov
b59f92e75d
x86[_64] assembly pack: add Silvermont performance data.
...
Reviewed-by: Rich Salz <rsalz@openssl.org>
2014-08-30 19:13:49 +02:00
Andy Polyakov
f5b798f50c
Add GHASH for PowerISA 2.0.7.
...
Reviewed-by: Tim Hudson <tjh@openssl.org>
2014-07-20 14:14:26 +02:00
Andy Polyakov
e91718e80d
Revert "Add GHASH for PowerISA 2.07."
...
This reverts commit 927f2e5dea
.
2014-07-16 13:38:15 +02:00
Andy Polyakov
927f2e5dea
Add GHASH for PowerISA 2.07.
2014-07-16 08:01:41 +02:00
Andy Polyakov
1b0fe79f3e
x86_64 assembly pack: improve masm support.
2014-07-09 20:08:01 +02:00
Andy Polyakov
a356e488ad
x86_64 assembly pack: refine clang detection.
2014-06-28 17:23:21 +02:00
Andy Polyakov
7eb0488280
x86_64 assembly pack: addendum to last clang commit.
2014-06-24 08:37:05 +02:00
Andy Polyakov
ac171925ab
x86_64 assembly pack: allow clang to compile AVX code.
2014-06-24 08:24:25 +02:00
Andy Polyakov
0f777aeb50
ARMv8 assembly pack: add Cortex performance numbers.
2014-06-24 08:06:05 +02:00
Andy Polyakov
1cf8f57b43
ghash-x86_64.pl: optimize for upcoming Atom.
2014-06-11 11:34:18 +02:00
Andy Polyakov
5dcf70a1c5
ARM assembly pack: get ARMv7 instruction endianness right.
...
Pointer out and suggested by: Ard Biesheuvel.
2014-06-06 21:27:18 +02:00
Andy Polyakov
2d5a799d27
Add GHASH for ARMv8 Crypto Extension.
...
Result of joint effort with Ard Biesheuvel.
2014-06-06 20:43:02 +02:00
Andy Polyakov
bd227733b9
C64x+ assembly pack: make it work with older toolchain.
2014-05-04 16:38:32 +02:00
Andy Polyakov
f8cee9d081
bn/asm/armv4-gf2m.pl, modes/asm/ghash-armv4.pl: faster multiplication
...
algorithm suggested in following paper:
Câmara, D.; Gouvêa, C. P. L.; López, J. & Dahab, R.: Fast Software
Polynomial Multiplication on ARM Processors using the NEON Engine.
http://conradoplg.cryptoland.net/files/2010/12/mocrysen13.pdf
2014-04-24 10:24:53 +02:00
Andy Polyakov
98e143f118
ghash-x86[_64].pl: ~15% improvement on Atom Silvermont
...
(other processors unaffected).
2014-02-13 14:37:28 +01:00
Andy Polyakov
d162584b11
modes/asm/ghash-s390x.pl: +15% performance improvement on z10.
2014-02-02 00:09:17 +01:00
Andy Polyakov
5b63a39241
modes/asm/ghash-alpha.pl: fix typo.
2013-11-12 21:52:18 +01:00
Andy Polyakov
33446493f4
modes/asm/ghash-alpha.pl: make it work with older assembler for real.
...
PR: 3165
2013-11-09 11:41:59 +01:00
Andy Polyakov
d24d1d7daf
modes/asm/ghash-alpha.pl: make it work with older assembler.
...
PR: 3165
2013-11-08 22:56:44 +01:00
Andy Polyakov
7a1a12232a
crypto/modes/asm/aesni-gcm-x86_64.pl: minor optimization.
...
Avoid occasional up to 8% performance drops.
2013-09-09 21:43:21 +02:00
Veres Lajos
478b50cf67
misspellings fixes by https://github.com/vlajos/misspell_fixer
2013-09-05 21:39:42 +01:00
Andy Polyakov
02450ec69d
PA-RISC assembler pack: switch to bve in 64-bit builds.
...
PR: 3074
2013-06-18 10:37:00 +02:00
Andy Polyakov
b42759158d
ghash-x86_64.pl: add Haswell performance data.
2013-06-10 22:25:12 +02:00
Andy Polyakov
4e049c5259
Add AES-NI GCM stitch.
2013-03-29 20:45:33 +01:00
Andy Polyakov
1da5d3029e
ghash-x86_64.pl: add AVX code path.
2013-03-24 23:44:35 +01:00
Andy Polyakov
fbf7c44bbf
ghash-x86_64.pl: minor optimization.
2013-03-19 20:02:11 +01:00
Andy Polyakov
28997596f2
ghash-x86_64.pl: fix length handling bug.
...
Thanks to Shay Gueron & Vlad Krasnov for report.
2013-03-06 10:42:21 +01:00
Andy Polyakov
273a808180
ghash-x86[_64].pl: code refresh.
2013-02-14 16:28:09 +01:00
Andy Polyakov
46bf83f07a
x86_64 assembly pack: make Windows build more robust.
...
PR: 2963 and a number of others
2013-01-22 22:27:28 +01:00
Andy Polyakov
3766e7ccab
ghash-sparcv9.pl: shave off one more xmulx, improve T3 performance by 7%.
2012-12-04 20:21:24 +00:00
Andy Polyakov
904732f68b
C64x+ assembly pack: improve EABI support.
2012-11-28 13:19:10 +00:00
Andy Polyakov
24798c5e59
ghash-sparcv9.pl: 22% improvement on T4.
2012-11-05 08:47:26 +00:00
Andy Polyakov
23328d4b27
ghash-sparcv9.pl: add VIS3 code path.
2012-10-24 08:21:10 +00:00
Andy Polyakov
6251989eb6
x86_64 assembly pack: make it possible to compile with Perl located on
...
path with spaces.
PR: 2835
2012-06-27 10:08:23 +00:00
Andy Polyakov
d2e1803197
x86[_64] assembly pack: update benchmark results.
2012-06-12 14:18:21 +00:00
Andy Polyakov
f9c5e5d92e
perlasm: fix symptom-less bugs, missing semicolons and 'my' declarations.
2012-04-28 10:36:58 +00:00
Andy Polyakov
3e181369dd
C64x+ assembler pack. linux-c64xplus build is *not* tested nor can it be
...
tested, because kernel is not in shape to handle it *yet*. The code is
committed mostly to stimulate the kernel development.
2012-04-18 13:01:36 +00:00
Andy Polyakov
26e6bac143
ghash-s390x.pl: fix typo [that can induce SEGV in 31-bit build].
2012-04-12 06:44:34 +00:00
Andy Polyakov
5c88dcca5b
ghash-x86.pl: omit unreferenced rem_8bit from no-sse2 build.
2012-03-13 19:43:42 +00:00
Andy Polyakov
98909c1d5b
ghash-x86.pl: engage original MMX version in no-sse2 builds.
2012-01-25 17:56:08 +00:00
Andy Polyakov
2b9a8ca15b
x86gas.pl: add palignr and move pclmulqdq.
2011-05-16 18:07:00 +00:00
Andy Polyakov
b5c6aab57e
x86_64-xlate.pl: allow "base-less" effective address, add palignr, move
...
pclmulqdq.
2011-05-16 17:44:38 +00:00
Andy Polyakov
56c5f703c1
IA-64 assembler pack: fix typos and make it work on HP-UX.
2011-05-07 20:36:05 +00:00
Andy Polyakov
1e86318091
ARM assembler pack: profiler-assisted optimizations and NEON support.
2011-04-01 20:58:34 +00:00
Andy Polyakov
bc5b136c5c
ghash-x86.pl: optimize for Sandy Bridge.
2011-03-04 13:21:41 +00:00
Andy Polyakov
0ab8fd58e1
s390x assembler pack: tune-up and support for new z196 hardware.
2011-03-04 13:09:16 +00:00
Andy Polyakov
e822c756b6
s390x assembler pack: adapt for -m31 build, see commentary in Configure
...
for more details.
2010-11-29 20:52:43 +00:00
Andy Polyakov
8986e37249
ghash-s390x.pl: reschedule instructions for better performance.
2010-09-21 11:37:00 +00:00
Andy Polyakov
f8927c89d0
Alpha assembler pack: adapt for Linux.
...
PR: 2335
2010-09-13 13:28:52 +00:00
Andy Polyakov
7d1f55e9d9
Add ghash-s390x.pl.
2010-09-10 14:50:17 +00:00
Andy Polyakov
d52d5ad147
modes/asm/ghash-*.pl: switch to [more reproducible] performance results
...
collected with 'apps/openssl speed ghash'.
2010-09-05 19:52:14 +00:00
Andy Polyakov
a3b0c44b1b
ghash-ia64.pl: 50% performance improvement of gcm_ghash_4bit.
2010-09-05 19:49:54 +00:00
Andy Polyakov
85e28dfa6f
ghash-ia64.pl: excuse myself from implementing "528B" variant.
2010-07-26 21:54:21 +00:00
Andy Polyakov
133a7f9a50
perlasm/x86asm.pl: move aesni and pclmulqdq opcodes to aesni-x86.pl and
...
ghash-x86.pl.
2010-07-26 21:42:07 +00:00
Andy Polyakov
2d22e08083
ARM assembler pack: reschedule instructions for dual-issue pipeline.
...
Modest improvement coefficients mean that code already had some
parallelism and there was not very much room for improvement. Special
thanks to Ted Krovetz for benchmarking the code with such patience.
2010-07-13 14:03:31 +00:00
Andy Polyakov
396df7311e
crypto/*/Makefile: unify "catch-all" assembler make rules and harmonize
...
ARM assembler modules.
2010-07-08 15:03:42 +00:00
Andy Polyakov
acbcc271b1
ghash-armv4.pl: excuse myself from implementing "528B" flavour.
2010-07-02 08:14:12 +00:00
Andy Polyakov
b28750877c
ghash-sparcv9.pl: fix Makefile rule and add performance data for T1.
2010-07-02 08:09:30 +00:00
Andy Polyakov
c32fcca6f4
SPARCv9 assembler pack: refine CPU detection on Linux, fix for "unaligned
...
opcodes detected in executable segment" error.
2010-07-01 07:34:56 +00:00
Andy Polyakov
d364506a24
ghash-x86_64.pl: "528B" variant delivers further >30% improvement.
2010-06-09 15:05:59 +00:00
Andy Polyakov
04e2b793d6
ghash-x86.pl: commentary updates.
2010-06-09 15:05:14 +00:00
Andy Polyakov
8525950e7e
ghash-x86.pl: "528B" variant of gcm_ghash_4bit_mmx gives 20-40%
...
improvement.
2010-06-04 13:21:01 +00:00
Andy Polyakov
07e29c1234
ghash-x86.pl: MMX optimization (+20-40%) and commentary update.
2010-05-23 12:37:01 +00:00
Andy Polyakov
1aa8a6297c
ghash-x86[_64].pl: add due credit.
2010-05-13 17:21:52 +00:00
Andy Polyakov
c1f092d14e
GCM "jumbo" update:
...
- gcm128.c: support for Intel PCLMULQDQ, readability improvements;
- asm/ghash-x86.pl: splitted vanilla, MMX, PCLMULQDQ subroutines;
- asm/ghash-x86_64.pl: add PCLMULQDQ implementations.
2010-05-13 15:32:43 +00:00
Andy Polyakov
8a682556b4
Add ghash-armv4.pl.
2010-05-03 18:23:29 +00:00
Andy Polyakov
5e19ee96f6
Add ghash-parisc.pl.
2010-04-28 18:51:45 +00:00
Andy Polyakov
4f39edbff1
gcm128.c and assembler modules: change argument order for gcm_ghash_4bit.
...
ghash-x86*.pl: fix performance numbers for Core2, as it turned out
previous ones were "tainted" by variable clock frequency.
2010-04-14 19:04:51 +00:00
Andy Polyakov
42feba4797
Add ghash-alpha.pl assembler module.
2010-04-10 13:44:20 +00:00
Andy Polyakov
c3473126b1
GHASH assembler: new ghash-sparcv9.pl module and saner descriptions.
2010-03-22 17:24:18 +00:00
Andy Polyakov
480cd6ab6e
ghash-ia64.pl: new file, GHASH for Itanium.
...
ghash-x86_64.pl: minimize stack frame usage.
ghash-x86.pl: modulo-scheduling MMX loop in respect to input vector
results in up to 10% performance improvement.
2010-03-15 19:07:52 +00:00
Andy Polyakov
f093794e55
Add GHASH x86_64 assembler.
2010-03-11 16:19:46 +00:00
Andy Polyakov
e3a510f8a6
Add GHASH x86 assembler.
2010-03-09 23:03:33 +00:00