openssl/crypto/sm4
zhangzhilei 13ba91cb02 SM4 optimization for non-asm mode
This patch use table-lookup borrow from aes in crypto/aes/aes_core.c.

Test on my PC(AMD Ryzen Threadripper 3990X 64-Core Processor),

before and after optimization:

debug mode:

Before:
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
SM4-CBC          40101.14k    41453.80k    42073.86k    42174.81k    42216.11k    42227.03k
SM4-ECB          41222.60k    42074.88k    42673.66k    42868.05k    42896.04k    42844.16k
SM4-CTR          35867.22k    36874.47k    37004.97k    37083.82k    37052.42k    37076.99k

After:
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
SM4-CBC          47273.51k    48957.40k    49665.19k    49810.77k    49859.24k    49834.67k
SM4-ECB          48100.01k    49323.34k    50224.04k    50273.28k    50533.72k    50730.12k
SM4-CTR          41352.64k    42621.29k    42971.22k    43061.59k    43089.92k    43100.84k

non-debug mode:

Before:
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
SM4-CBC         141596.59k   145102.93k   146794.50k   146540.89k   146650.45k   146877.10k
SM4-ECB         144774.71k   155106.28k   158166.36k   158279.00k   158520.66k   159280.97k
SM4-CTR         138021.10k   141577.60k   142493.53k   142736.38k   142852.10k   143125.16k

After:
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
SM4-CBC         142016.95k   150068.48k   152238.25k   152773.97k   153094.83k   152027.14k
SM4-ECB         148842.94k   159919.87k   163628.37k   164515.84k   164697.43k   164790.27k
SM4-CTR         141774.23k   146206.89k   147470.25k   147816.28k   146770.60k   148346.20k

Reviewed-by: Paul Dale <pauli@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/17766)
2022-03-03 13:19:55 +01:00
..
asm
build.info
sm4.c