The rational is:
1. SSE42 has nearly identical logic so any benefit is minimal (3.4%
regression on Tigerlake using SSE42 versus AVX across the
benchtest suite).
2. AVX2 version covers the majority of targets that previously
prefered it.
3. The targets where AVX would still be best (SnB and IVB) are
becoming outdated.
All in all the saving the code size is worth it.
All string/memory tests pass.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit 305769b2a1)