rotated_mac is a 64-byte aligned buffer of size 64 and rotate_offset is secret.
Consider a weaker leakage model(CL) where only cacheline base address is leaked,
i.e address/32 for 32-byte cacheline(CL32).
Previous code used to perform two loads
1. rotated_mac[rotate_offset ^ 32] and
2. rotated_mac[rotate_offset++]
which would leak 2q + 1, 2q for 0 <= rotate_offset < 32
and 2q, 2q + 1 for 32 <= rotate_offset < 64
The proposed fix performs load operations which will always leak 2q, 2q + 1 and
selects the appropriate value in constant-time.
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/18033)