openssl

mirror of https://github.com/openssl/openssl.git synced 2025-02-23 14:42:15 +08:00

Author	SHA1	Message	Date
XiaokangQian	2507903eb7	Fix build issue with aes-gcm-armv8-unroll8_64.S on older aarch64 assemblers The EOR3 instruction is implemented with .inst, and the code here is enabled using run-time detection of the CPU capabilities, so no need to explicitly ask for the sha3 extension. Fixes #17773 Reviewed-by: Kurt Roeckx <kurt@roeckx.be> Reviewed-by: Tomas Mraz <tomas@openssl.org> (Merged from https://github.com/openssl/openssl/pull/17795)	2022-03-04 11:11:51 +01:00
Andrey Matyukov	63b996e752	AES-GCM enabled with AVX512 vAES and vPCLMULQDQ. Vectorized 'stitched' encrypt + ghash implementation of AES-GCM enabled with AVX512 vAES and vPCLMULQDQ instructions (available starting Intel's IceLake micro-architecture). The performance details for representative IceLake Server and Client platforms are shown below Performance data: OpenSSL Speed KBs/Sec Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (1Core/1Thread) Payload in Bytes 16 64 256 1024 8192 16384 AES-128-GCM Baseline 478708.27 1118296.96 2428092.52 3518199.4 4172355.99 4235762.07 Patched 534613.95 2009345.55 3775588.15 5059517.64 8476794.88 8941541.79 Speedup 1.12 1.80 1.55 1.44 2.03 2.11 AES-256-GCM Baseline 399237.27 961699.9 2136377.65 2979889.15 3554823.37 3617757.5 Patched 475948.13 1720128.51 3462407.12 4696832.2 7532013.16 7924953.91 Speedup 1.19 1.79 1.62 1.58 2.12 2.19 Intel(R) Core(TM) i7-1065G7 CPU @ 1.30GHz (1Core/1Thread) Payload in Bytes 16 64 256 1024 8192 16384 AES-128-GCM Baseline 259128.54 570756.43 1362554.16 1990654.57 2359128.88 2401671.58 Patched 292139.47 1079320.95 2001974.63 2829007.46 4510318.59 4705314.41 Speedup 1.13 1.89 1.47 1.42 1.91 1.96 AES-256-GCM Baseline 236000.34 550506.76 1234638.08 1716734.57 2011255.6 2028099.99 Patched 247256.32 919731.34 1773270.43 2553239.55 3953115.14 4111227.29 Speedup 1.05 1.67 1.44 1.49 1.97 2.03 Reviewed-by: TJ O'Dwyer, Marcel Cornu, Pablo de Lara Reviewed-by: Paul Dale <pauli@openssl.org> Reviewed-by: Tomas Mraz <tomas@openssl.org> (Merged from https://github.com/openssl/openssl/pull/17239)	2022-02-10 15:10:12 +01:00
Danny Tsen	345c99b665	Fixed counter overflow Reviewed-by: Tomas Mraz <tomas@openssl.org> Reviewed-by: Paul Dale <pauli@openssl.org> (Merged from https://github.com/openssl/openssl/pull/17607)	2022-02-07 11:29:18 +11:00
Dimitris Apostolou	07c5465e98	Fix typos Reviewed-by: Tomas Mraz <tomas@openssl.org> Reviewed-by: Paul Dale <pauli@openssl.org> (Merged from https://github.com/openssl/openssl/pull/17634)	2022-02-07 11:23:28 +11:00
XiaokangQian	954f45ba4c	Optimize AES-GCM for uarchs with unroll and new instructions Increase the block numbers to 8 for every iteration. Increase the hash table capacity. Make use of EOR3 instruction to improve the performance. This can improve performance 25-40% on out-of-order microarchitectures with a large number of fast execution units, such as Neoverse V1. We also see 20-30% performance improvements on other architectures such as the M1. Assembly code reviewd by Tom Cosgrove (ARM). Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de> Reviewed-by: Paul Dale <pauli@openssl.org> (Merged from https://github.com/openssl/openssl/pull/15916)	2022-01-25 14:30:00 +11:00
Danny Tsen	44a563dde1	AES-GCM performance optimzation with stitched method for p9+ ppc64le Assembly code reviewed by Shricharan Srivatsan <ssrivat@us.ibm.com> Reviewed-by: Tomas Mraz <tomas@openssl.org> Reviewed-by: Paul Dale <pauli@openssl.org> (Merged from https://github.com/openssl/openssl/pull/16854)	2022-01-24 11:25:53 +11:00
David Benjamin	40c24d74de	Don't use __ARMEL__/__ARMEB__ in aarch64 assembly GCC's __ARMEL__ and __ARMEB__ defines denote little- and big-endian arm, respectively. They are not defined on aarch64, which instead use __AARCH64EL__ and __AARCH64EB__. However, OpenSSL's assembly originally used the 32-bit defines on both platforms and even define __ARMEL__ and __ARMEB__ in arm_arch.h. This is less portable and can even interfere with other headers, which use __ARMEL__ to detect little-endian arm. Over time, the aarch64 assembly has switched to the correct defines, such as in `32bbb62ea6`. This commit finishes the job: poly1305-armv8.pl needed a fix and the dual-arch armx.pl files get one more transform to convert from 32-bit to 64-bit. (There is an even more official endianness detector, __ARM_BIG_ENDIAN in the Arm C Language Extensions. But I've stuck with the GCC ones here as that would be a larger change.) Reviewed-by: Matt Caswell <matt@openssl.org> Reviewed-by: Tomas Mraz <tomas@openssl.org> Reviewed-by: Paul Dale <pauli@openssl.org> Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de> (Merged from https://github.com/openssl/openssl/pull/17373)	2022-01-09 07:40:44 +01:00
Russ Butler	19e277dd19	aarch64: support BTI and pointer authentication in assembly This change adds optional support for - Armv8.3-A Pointer Authentication (PAuth) and - Armv8.5-A Branch Target Identification (BTI) features to the perl scripts. Both features can be enabled with additional compiler flags. Unless any of these are enabled explicitly there is no code change at all. The extensions are briefly described below. Please read the appropriate chapters of the Arm Architecture Reference Manual for the complete specification. Scope ----- This change only affects generated assembly code. Armv8.3-A Pointer Authentication -------------------------------- Pointer Authentication extension supports the authentication of the contents of registers before they are used for indirect branching or load. PAuth provides a probabilistic method to detect corruption of register values. PAuth signing instructions generate a Pointer Authentication Code (PAC) based on the value of a register, a seed and a key. The generated PAC is inserted into the original value in the register. A PAuth authentication instruction recomputes the PAC, and if it matches the PAC in the register, restores its original value. In case of a mismatch, an architecturally unmapped address is generated instead. With PAuth, mitigation against ROP (Return-oriented Programming) attacks can be implemented. This is achieved by signing the contents of the link-register (LR) before it is pushed to stack. Once LR is popped, it is authenticated. This way a stack corruption which overwrites the LR on the stack is detectable. The PAuth extension adds several new instructions, some of which are not recognized by older hardware. To support a single codebase for both pre Armv8.3-A targets and newer ones, only NOP-space instructions are added by this patch. These instructions are treated as NOPs on hardware which does not support Armv8.3-A. Furthermore, this patch only considers cases where LR is saved to the stack and then restored before branching to its content. There are cases in the code where LR is pushed to stack but it is not used later. We do not address these cases as they are not affected by PAuth. There are two keys available to sign an instruction address: A and B. PACIASP and PACIBSP only differ in the used keys: A and B, respectively. The keys are typically managed by the operating system. To enable generating code for PAuth compile with -mbranch-protection=<mode>: - standard or pac-ret: add PACIASP and AUTIASP, also enables BTI (read below) - pac-ret+b-key: add PACIBSP and AUTIBSP Armv8.5-A Branch Target Identification -------------------------------------- Branch Target Identification features some new instructions which protect the execution of instructions on guarded pages which are not intended branch targets. If Armv8.5-A is supported by the hardware, execution of an instruction changes the value of PSTATE.BTYPE field. If an indirect branch lands on a guarded page the target instruction must be one of the BTI <jc> flavors, or in case of a direct call or jump it can be any other instruction. If the target instruction is not compatible with the value of PSTATE.BTYPE a Branch Target Exception is generated. In short, indirect jumps are compatible with BTI <j> and <jc> while indirect calls are compatible with BTI <c> and <jc>. Please refer to the specification for the details. Armv8.3-A PACIASP and PACIBSP are implicit branch target identification instructions which are equivalent with BTI c or BTI jc depending on system register configuration. BTI is used to mitigate JOP (Jump-oriented Programming) attacks by limiting the set of instructions which can be jumped to. BTI requires active linker support to mark the pages with BTI-enabled code as guarded. For ELF64 files BTI compatibility is recorded in the .note.gnu.property section. For a shared object or static binary it is required that all linked units support BTI. This means that even a single assembly file without the required note section turns-off BTI for the whole binary or shared object. The new BTI instructions are treated as NOPs on hardware which does not support Armv8.5-A or on pages which are not guarded. To insert this new and optional instruction compile with -mbranch-protection=standard (also enables PAuth) or +bti. When targeting a guarded page from a non-guarded page, weaker compatibility restrictions apply to maintain compatibility between legacy and new code. For detailed rules please refer to the Arm ARM. Compiler support ---------------- Compiler support requires understanding '-mbranch-protection=<mode>' and emitting the appropriate feature macros (__ARM_FEATURE_BTI_DEFAULT and __ARM_FEATURE_PAC_DEFAULT). The current state is the following: ------------------------------------------------------- \| Compiler \| -mbranch-protection \| Feature macros \| +----------+---------------------+--------------------+ \| clang \| 9.0.0 \| 11.0.0 \| +----------+---------------------+--------------------+ \| gcc \| 9 \| expected in 10.1+ \| ------------------------------------------------------- Available Platforms ------------------ Arm Fast Model and QEMU support both extensions. https://developer.arm.com/tools-and-software/simulation-models/fast-models https://www.qemu.org/ Implementation Notes -------------------- This change adds BTI landing pads even to assembly functions which are likely to be directly called only. In these cases, landing pads might be superfluous depending on what code the linker generates. Code size and performance impact for these cases would be negligible. Interaction with C code ----------------------- Pointer Authentication is a per-frame protection while Branch Target Identification can be turned on and off only for all code pages of a whole shared object or static binary. Because of these properties if C/C++ code is compiled without any of the above features but assembly files support any of them unconditionally there is no incompatibility between the two. Useful Links ------------ To fully understand the details of both PAuth and BTI it is advised to read the related chapters of the Arm Architecture Reference Manual (Arm ARM): https://developer.arm.com/documentation/ddi0487/latest/ Additional materials: "Providing protection for complex software" https://developer.arm.com/architectures/learn-the-architecture/providing-protection-for-complex-software Arm Compiler Reference Guide Version 6.14: -mbranch-protection https://developer.arm.com/documentation/101754/0614/armclang-Reference/armclang-Command-line-Options/-mbranch-protection?lang=en Arm C Language Extensions (ACLE) https://developer.arm.com/docs/101028/latest Addional Notes -------------- This patch is a copy of the work done by Tamas Petz in boringssl. It contains the changes from the following commits: aarch64: support BTI and pointer authentication in assembly Change-Id: I4335f92e2ccc8e209c7d68a0a79f1acdf3aeb791 URL: https://boringssl-review.googlesource.com/c/boringssl/+/42084 aarch64: Improve conditional compilation Change-Id: I14902a64e5f403c2b6a117bc9f5fb1a4f4611ebf URL: https://boringssl-review.googlesource.com/c/boringssl/+/43524 aarch64: Fix name of gnu property note section Change-Id: I6c432d1c852129e9c273f6469a8b60e3983671ec URL: https://boringssl-review.googlesource.com/c/boringssl/+/44024 Change-Id: I2d95ebc5e4aeb5610d3b226f9754ee80cf74a9af Reviewed-by: Paul Dale <pauli@openssl.org> Reviewed-by: Tomas Mraz <tomas@openssl.org> (Merged from https://github.com/openssl/openssl/pull/16674)	2021-10-01 09:35:38 +02:00
Matt Caswell	54b4053130	Update copyright year Reviewed-by: Richard Levitte <levitte@openssl.org> (Merged from https://github.com/openssl/openssl/pull/16176)	2021-07-29 15:41:35 +01:00
Tomas Mraz	52f7e44ec8	Split bignum code out of the sparcv9cap.c Fixes #15978 Reviewed-by: Paul Dale <pauli@openssl.org> (Merged from https://github.com/openssl/openssl/pull/16019)	2021-07-15 09:33:04 +02:00
Jung-uk Kim	cd84d8832d	Ignore vendor name in Clang version number. For example, FreeBSD prepends "FreeBSD" to version string, e.g., FreeBSD clang version 11.0.0 (git@github.com:llvm/llvm-project.git llvmorg-11.0.0-rc2-0-g414f32a9e86) Target: x86_64-unknown-freebsd13.0 Thread model: posix InstalledDir: /usr/bin This prevented us from properly detecting AVX support, etc. CLA: trivial Reviewed-by: Richard Levitte <levitte@openssl.org> Reviewed-by: Paul Dale <paul.dale@oracle.com> Reviewed-by: Ben Kaduk <kaduk@mit.edu> (Merged from https://github.com/openssl/openssl/pull/12725)	2020-08-27 20:27:26 -07:00
Matt Caswell	33388b44b6	Update copyright year Reviewed-by: Richard Levitte <levitte@openssl.org> (Merged from https://github.com/openssl/openssl/pull/11616)	2020-04-23 13:55:52 +01:00
David Benjamin	a21314dbbc	Also check for errors in x86_64-xlate.pl. In https://github.com/openssl/openssl/pull/10883, I'd meant to exclude the perlasm drivers since they aren't opening pipes and do not particularly need it, but I only noticed x86_64-xlate.pl, so arm-xlate.pl and ppc-xlate.pl got the change. That seems to have been fine, so be consistent and also apply the change to x86_64-xlate.pl. Checking for errors is generally a good idea. Reviewed-by: Richard Levitte <levitte@openssl.org> Reviewed-by: David Benjamin <davidben@google.com> (Merged from https://github.com/openssl/openssl/pull/10930)	2020-02-17 12:17:53 +10:00
H.J. Lu	98ad3fe82b	x86_64: Add endbranch at function entries for Intel CET To support Intel CET, all indirect branch targets must start with endbranch. Here is a patch to add endbranch to function entries in x86_64 assembly codes which are indirect branch targets as discovered by running openssl testsuite on Intel CET machine and visual inspection. Verified with $ CC="gcc -Wl,-z,cet-report=error" ./Configure shared linux-x86_64 -fcf-protection $ make $ make test and $ CC="gcc -mx32 -Wl,-z,cet-report=error" ./Configure shared linux-x32 -fcf-protection $ make $ make test # <<< passed with https://github.com/openssl/openssl/pull/10988 Reviewed-by: Tomas Mraz <tmraz@fedoraproject.org> Reviewed-by: Richard Levitte <levitte@openssl.org> (Merged from https://github.com/openssl/openssl/pull/10982)	2020-02-15 22:15:03 +01:00
David Benjamin	32be631ca1	Do not silently truncate files on perlasm errors If one of the perlasm xlate drivers crashes, OpenSSL's build will currently swallow the error and silently truncate the output to however far the driver got. This will hopefully fail to build, but better to check such things. Handle this by checking for errors when closing STDOUT (which is a pipe to the xlate driver). Reviewed-by: Richard Levitte <levitte@openssl.org> Reviewed-by: Tim Hudson <tjh@openssl.org> Reviewed-by: Tomas Mraz <tmraz@fedoraproject.org> (Merged from https://github.com/openssl/openssl/pull/10883)	2020-01-22 18:11:30 +01:00
Richard Levitte	9bb3e5fd87	For all assembler scripts where it matters, recognise clang > 9.x Fixes #10853 Reviewed-by: Paul Dale <paul.dale@oracle.com> (Merged from https://github.com/openssl/openssl/pull/10855)	2020-01-17 08:55:45 +01:00
Bernd Edlinger	275a048ffc	Add some missing cfi frame info in aesni-gcm-x86_64.pl Reviewed-by: Kurt Roeckx <kurt@roeckx.be> (Merged from https://github.com/openssl/openssl/pull/10677)	2019-12-23 20:23:27 +01:00
Fangming.Fang	31b59078c8	Optimize AES-GCM implementation on aarch64 Comparing to current implementation, this change can get more performance improved by tunning the loop-unrolling factor in interleave implementation as well as by enabling high level parallelism. Performance(A72) new type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes aes-128-gcm 113065.51k 375743.00k 848359.51k 1517865.98k 1964040.19k 1986663.77k aes-192-gcm 110679.32k 364470.63k 799322.88k 1428084.05k 1826917.03k 1848967.17k aes-256-gcm 104919.86k 352939.29k 759477.76k 1330683.56k 1663175.34k 1670430.72k old type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes aes-128-gcm 115595.32k 382348.65k 855891.29k 1236452.35k 1425670.14k 1429793.45k aes-192-gcm 112227.02k 369543.47k 810046.55k 1147948.37k 1286288.73k 1296941.06k aes-256-gcm 111543.90k 361902.36k 769543.59k 1070693.03k 1208576.68k 1207511.72k Change-Id: I28a2dca85c001a63a2a942e80c7c64f7a4fdfcf7 Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de> Reviewed-by: Paul Dale <paul.dale@oracle.com> (Merged from https://github.com/openssl/openssl/pull/9818)	2019-12-19 12:36:07 +10:00
Richard Levitte	1aa89a7a3a	Unify all assembler file generators They now generally conform to the following argument sequence: script.pl "$(PERLASM_SCHEME)" [ C preprocessor arguments ... ] \ $(PROCESSOR) <output file> However, in the spirit of being able to use these scripts manually, they also allow for no argument, or for only the flavour, or for only the output file. This is done by only using the last argument as output file if it's a file (it has an extension), and only using the first argument as flavour if it isn't a file (it doesn't have an extension). While we're at it, we make all $xlate calls the same, i.e. the $output argument is always quoted, and we always die on error when trying to start $xlate. There's a perl lesson in this, regarding operator priority... This will always succeed, even when it fails: open FOO, "something" \|\| die "ERR: $!"; The reason is that '\|\|' has higher priority than list operators (a function is essentially a list operator and gobbles up everything following it that isn't lower priority), and since a non-empty string is always true, so that ends up being exactly the same as: open FOO, "something"; This, however, will fail if "something" can't be opened: open FOO, "something" or die "ERR: $!"; The reason is that 'or' has lower priority that list operators, i.e. it's performed after the 'open' call. Reviewed-by: Matt Caswell <matt@openssl.org> (Merged from https://github.com/openssl/openssl/pull/9884)	2019-09-16 16:29:57 +02:00
Andy Polyakov	6465321e40	ARM64 assembly pack: add ThunderX2 results. Reviewed-by: Tim Hudson <tjh@openssl.org> Reviewed-by: Richard Levitte <levitte@openssl.org> (Merged from https://github.com/openssl/openssl/pull/8776)	2019-04-17 21:08:13 +02:00
Shane Lontis	54d00677f3	cfi build fixes in x86-64 ghash assembly Reviewed-by: Richard Levitte <levitte@openssl.org> Reviewed-by: Matt Caswell <matt@openssl.org> Reviewed-by: Paul Dale <paul.dale@oracle.com> (Merged from https://github.com/openssl/openssl/pull/8281)	2019-02-21 07:39:14 +10:00
David Benjamin	c0e8e5007b	Fix some CFI issues in x86_64 assembly The add/double shortcut in ecp_nistz256-x86_64.pl left one instruction point that did not unwind, and the "slow" path in AES_cbc_encrypt was not annotated correctly. For the latter, add .cfi_{remember,restore}_state support to perlasm. Next, fill in a bunch of functions that are missing no-op .cfi_startproc and .cfi_endproc blocks. libunwind cannot unwind those stack frames otherwise. Finally, work around a bug in libunwind by not encoding rflags. (rflags isn't a callee-saved register, so there's not much need to annotate it anyway.) These were found as part of ABI testing work in BoringSSL. Reviewed-by: Richard Levitte <levitte@openssl.org> GH: #8109	2019-02-17 23:39:51 +01:00
Andy Polyakov	3405db97e5	ARM assembly pack: make it Windows-friendly. "Windows friendliness" means a) flipping .thumb and .text directives, b) always generate Thumb-2 code when asked(); c) Windows-specific references to external OPENSSL_armcap_P. () so far some modules were compiled as .code 32 even if Thumb-2 was targeted. It works at hardware level because processor can alternate between the modes with no overhead. But clang --target=arm-windows's builtin assembler just refuses to compile .code 32... Reviewed-by: Paul Dale <paul.dale@oracle.com> Reviewed-by: Richard Levitte <levitte@openssl.org> (Merged from https://github.com/openssl/openssl/pull/8252)	2019-02-16 16:59:23 +01:00
Richard Levitte	81cae8ce09	Following the license change, modify the boilerplates in crypto/modes/ [skip ci] Reviewed-by: Matt Caswell <matt@openssl.org> (Merged from https://github.com/openssl/openssl/pull/7803)	2018-12-06 15:06:37 +01:00
Matt Caswell	1212818eb0	Update copyright year Reviewed-by: Richard Levitte <levitte@openssl.org> (Merged from https://github.com/openssl/openssl/pull/7176)	2018-09-11 13:45:17 +01:00
Andy Polyakov	ce5eb5e814	modes/asm/ghash-armv4.pl: address "infixes are deprecated" warnings. Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/6615)	2018-07-01 11:51:44 +02:00
Andy Polyakov	1753d12374	PA-RISC assembly pack: make it work with GNU assembler for HP-UX. Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/6583)	2018-06-25 16:45:48 +02:00
Andy Polyakov	41013cd63c	PPC assembly pack: correct POWER9 results. As it turns out originally published results were skewed by "turbo" mode. VM apparently remains oblivious to dynamic frequency scaling, and reports that processor operates at "base" frequency at all times. While actual frequency gets increased under load. Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/6406)	2018-06-03 21:20:06 +02:00
Matt Caswell	83cf7abf8e	Update copyright year Reviewed-by: Richard Levitte <levitte@openssl.org> (Merged from https://github.com/openssl/openssl/pull/6371)	2018-05-29 13:16:04 +01:00
Andy Polyakov	13f6857db1	PPC assembly pack: add POWER9 results. Reviewed-by: Rich Salz <rsalz@openssl.org>	2018-05-10 11:44:21 +02:00
Matt Caswell	6ec5fce25e	Update copyright year Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/6145)	2018-05-01 13:34:30 +01:00
Andy Polyakov	198a2ed791	ARM assembly pack: make it work with older assembler. Reviewed-by: Richard Levitte <levitte@openssl.org> Reviewed-by: Paul Dale <paul.dale@oracle.com> (Merged from https://github.com/openssl/openssl/pull/6043)	2018-04-23 17:29:59 +02:00
Andy Polyakov	603ebe0352	modes/asm/ghashv8-armx.pl: handle lengths not divisible by 4x. Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/4830)	2017-12-04 17:21:23 +01:00
Andy Polyakov	aa7bf31698	modes/asm/ghashv8-armx.pl: optimize modulo-scheduled loop. Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/4830)	2017-12-04 17:21:20 +01:00
Andy Polyakov	9ee020f8dc	modes/asm/ghashv8-armx.pl: modulo-schedule loop. Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/4830)	2017-12-04 17:21:15 +01:00
Andy Polyakov	7ff2fa4b92	modes/asm/ghashv8-armx.pl: implement 4x aggregate factor. This initial commit is unoptimized reference version that handles input lengths divisible by 4 blocks. Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/4830)	2017-12-04 17:20:25 +01:00
Andy Polyakov	7533162322	ARMv8 assembly pack: add Qualcomm Kryo results. [skip ci] Reviewed-by: Tim Hudson <tjh@openssl.org>	2017-11-13 11:13:00 +01:00
Josh Soref	46f4e1bec5	Many spelling fixes/typo's corrected. Around 138 distinct errors found and fixed; thanks! Reviewed-by: Kurt Roeckx <kurt@roeckx.be> Reviewed-by: Tim Hudson <tjh@openssl.org> Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/3459)	2017-11-11 19:03:10 -05:00
Patrick Steuer	bc4e831ccd	s390x assembly pack: extend s390x capability vector. Extend the s390x capability vector to store the longer facility list available from z13 onwards. The bits indicating the vector extensions are set to zero, if the kernel does not enable the vector facility. Also add capability bits returned by the crypto instructions' query functions. Signed-off-by: Patrick Steuer <patrick.steuer@de.ibm.com> Reviewed-by: Andy Polyakov <appro@openssl.org> Reviewed-by: Tim Hudson <tjh@openssl.org> (Merged from https://github.com/openssl/openssl/pull/4542)	2017-10-30 14:31:32 +01:00
Patrick Steuer	af1d638730	s390x assembly pack: remove capability double-checking. An instruction's QUERY function is executed at initialization, iff the required MSA level is installed. Therefore, it is sufficient to check the bits returned by the QUERY functions. The MSA level does not have to be checked at every function call. crypto/aes/asm/aes-s390x.pl: The AES key schedule must be computed if the required KM or KMC function codes are not available. Formally, the availability of a KMC function code does not imply the availability of the corresponding KM function code. Signed-off-by: Patrick Steuer <patrick.steuer@de.ibm.com> Reviewed-by: Andy Polyakov <appro@openssl.org> Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/4501)	2017-10-17 21:55:33 +02:00
Rich Salz	e3713c365c	Remove email addresses from source code. Names were not removed. Some comments were updated. Replace Andy's address with openssl.org Reviewed-by: Andy Polyakov <appro@openssl.org> Reviewed-by: Paul Dale <paul.dale@oracle.com> (Merged from https://github.com/openssl/openssl/pull/4516)	2017-10-13 10:06:59 -04:00
Andy Polyakov	64d92d7498	x86_64 assembly pack: "optimize" for Knights Landing, add AVX-512 results. "Optimize" is in quotes because it's rather a "salvage operation" for now. Idea is to identify processor capability flags that drive Knights Landing to suboptimial code paths and mask them. Two flags were identified, XSAVE and ADCX/ADOX. Former affects choice of AES-NI code path specific for Silvermont (Knights Landing is of Silvermont "ancestry"). And 64-bit ADCX/ADOX instructions are effectively mishandled at decode time. In both cases we are looking at ~2x improvement. AVX-512 results cover even Skylake-X :-) Hardware used for benchmarking courtesy of Atos, experiments run by Romain Dolbeau <romain.dolbeau@atos.net>. Kudos! Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-07-21 14:07:32 +02:00
Rich Salz	28f298e70a	Undo commit `cd359b2` Original text: Clarify use of \|$end0\| in stitched x86-64 AES-GCM code. There was some uncertainty about what the code is doing with \|$end0\| and whether it was necessary for \|$len\| to be a multiple of 16 or 96. Hopefully these added comments make it clear that the code is correct except for the caveat regarding low memory addresses. Change-Id: Iea546a59dc7aeb400f50ac5d2d7b9cb88ace9027 Reviewed-on: https://boringssl-review.googlesource.com/7194 Reviewed-by: Adam Langley <agl@google.com> Reviewed-by: Richard Levitte <levitte@openssl.org> Reviewed-by: Tim Hudson <tjh@openssl.org> (Merged from https://github.com/openssl/openssl/pull/3700)	2017-07-05 17:06:57 -04:00
David Benjamin	e195c8a256	Remove filename argument to x86 asm_init. The assembler already knows the actual path to the generated file and, in other perlasm architectures, is left to manage debug symbols itself. Notably, in OpenSSL 1.1.x's new build system, which allows a separate build directory, converting .pl to .s as the scripts currently do result in the wrong paths. This also avoids inconsistencies from some of the files using $0 and some passing in the filename. Reviewed-by: Richard Levitte <levitte@openssl.org> Reviewed-by: Andy Polyakov <appro@openssl.org> (Merged from https://github.com/openssl/openssl/pull/3431)	2017-05-11 17:00:23 -04:00
Andy Polyakov	c93f06c12f	ARMv4 assembly pack: harmonize Thumb-ification of iOS build. Three modules were left behind in `a285992763`. Reviewed-by: Rich Salz <rsalz@openssl.org> (Merged from https://github.com/openssl/openssl/pull/2617)	2017-02-15 23:16:01 +01:00
Andy Polyakov	5c72e5ea7a	modes/asm/*-x86_64.pl: add CFI annotations. Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-02-13 14:14:24 +01:00
Andy Polyakov	384e6de4c7	x86_64 assembly pack: Win64 SEH face-lift. - harmonize handlers with guidelines and themselves; - fix some bugs in handlers; - add missing handlers in chacha and ecp_nistz256 modules; Reviewed-by: Rich Salz <rsalz@openssl.org>	2017-02-06 08:21:42 +01:00
Andy Polyakov	ace05265d2	x86_64 assembly pack: add Goldmont performance results. Reviewed-by: Richard Levitte <levitte@openssl.org>	2016-10-24 13:01:13 +02:00
David Benjamin	609b0852e4	Remove trailing whitespace from some files. The prevailing style seems to not have trailing whitespace, but a few lines do. This is mostly in the perlasm files, but a few C files got them after the reformat. This is the result of: find . -name '.pl' \| xargs sed -E -i '' -e 's/( \|'$'\t'')$//' find . -name '.c' \| xargs sed -E -i '' -e 's/( \|'$'\t'')$//' find . -name '.h' \| xargs sed -E -i '' -e 's/( \|'$'\t'')$//' Then bn_prime.h was excluded since this is a generated file. Note mkerr.pl has some changes in a heredoc for some help output, but other lines there lack trailing whitespace too. Reviewed-by: Kurt Roeckx <kurt@openssl.org> Reviewed-by: Matt Caswell <matt@openssl.org>	2016-10-10 23:36:21 +01:00
Andy Polyakov	6cf412c473	modes/asm/ghash-armv4.pl: improve interoperability with Android NDK. Reviewed-by: Tim Hudson <tjh@openssl.org>	2016-09-03 10:41:52 +02:00

1 2 3

150 Commits