Commit Graph

71 Commits

Author SHA1 Message Date
Andy Polyakov
804515425a +20% performance improvement of P4-specific RC4_CHAR loop. 2005-05-15 22:43:00 +00:00
Andy Polyakov
0ee883650d Commentary update motivating code update in 0.9.7. 2005-05-04 14:51:38 +00:00
Andy Polyakov
647907918d Commentary update. 2005-05-03 21:16:42 +00:00
Andy Polyakov
5f1841cdca Rename amd64 modules to x86_64 and update RC4 implementation. 2005-05-03 15:42:05 +00:00
Andy Polyakov
1cfd258ed6 Throw in x86_64 AT&T to MASM assembler converter to facilitate development
of dual-ABI Unix/Win64 modules.
2005-04-17 21:05:57 +00:00
Richard Levitte
4bb61becbb Add emacs cache files to .cvsignore. 2005-04-11 14:17:07 +00:00
Andy Polyakov
81ee80ab88 +45% RC4 performance boost on Intel EM64T core. Unrolled loop providing
further +35% will follow...

Submitted by: Zou Nanhai
2005-04-06 09:45:42 +00:00
Andy Polyakov
f774accdbf Fix rc4-ia64.S to pass more exhaustive regression tests. 2004-12-02 10:07:55 +00:00
Andy Polyakov
7c69478064 I've introduced a bug to i386 RC4 assembler, which would emerge with
certain mix of calls to RC4 routine not covered by rc4test.c.
It's fixed now. In addition this patch inadvertently fixes minor
performance problem: in 0.9.7 context P4 was performing 12% slower
than the original implementation...
2004-12-01 15:28:18 +00:00
Andy Polyakov
b7b46c9a87 Add 0.9.7 specific comments to RC4 assembler modules. 2004-11-30 15:46:46 +00:00
Andy Polyakov
7a3240e319 Final touches to rc4/asm/rc4-596.pl, +52% better performance on AMD core. 2004-11-29 21:12:58 +00:00
Andy Polyakov
d675c74d14 RC4 IA-64 assembler implementation. 2004-11-26 15:07:50 +00:00
Andy Polyakov
376729e130 RC4 tune-up for Intel P4 core, both 32- and 64-bit ones. As it's
apparently impossible to compose blended code with would perform
satisfactory on all x86 and x86_64 cores, an extra RC4_CHAR
code-path is introduced and P4 core is detected at run-time. This
way we keep original performance on non-P4 implementations and
turbo-charge P4 performance by factor of 2.8x (on 32-bit core).
2004-11-21 10:36:25 +00:00
Andy Polyakov
68d9e764cb As was shown by Marc Bevand reordering of couple of load operations
results in even higher performance gain of 3.3x:-) At least on
Opteron...
2004-11-09 17:23:26 +00:00
Andy Polyakov
afe67fb28e Adapt rc4-amd64.pl for Win64/AMD64 assembler. 2004-07-23 17:51:17 +00:00
Andy Polyakov
e4528e48e3 RC4 tune-up for AMD64. Performance improvement of 2.22x is measured for
linux-x86_64 target.
2004-07-11 16:44:07 +00:00
Richard Levitte
2f09524501 A few more files to ignore 2003-01-16 21:32:56 +00:00
Bodo Möller
88f17a5e98 Remove Win32 assembler files. They are always rebuilt (with some
choice of parameters) when they are needed.
2000-03-13 08:04:20 +00:00
Ralf S. Engelschall
9ea0e64de7 Two more .cvsignore files for the assembler stuff 1999-03-08 09:47:30 +00:00
Ben Laurie
05dc84b82b Fix DWP when only given three parameters. 1999-03-07 15:08:04 +00:00
Ralf S. Engelschall
58964a4922 Import of old SSLeay release: SSLeay 0.9.0b 1998-12-21 10:56:39 +00:00