math: Improve layout of exp/exp10 data

GCC aligns global data to 16 bytes if their size is >= 16 bytes.  This patch
changes the exp_data struct slightly so that the fields are better aligned
and have no gaps.  As a result, on targets that support them, more load-pair
instructions are used in exp.  Exp10 is improved by moving invlog10_2N later
so that neglog10_2hiN and neglog10_2loN can be loaded using load-pair.

The exp benchmark improves by 2.5%, "144bits" by 7.2%, and "768bits" by 12.7%
on Neoverse V2.  Exp10 improves by 1.5%.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
(cherry picked from commit 5afaf99edb326fd9f36eb306a828d129a3a1d7f7)
Wilco Dijkstra 2024-12-13 15:43:07 +00:00
parent 009c5a2dca
commit e0bc5f64ea

@@ -195,16 +195,18 @@ check_uflow (double x)
 extern const struct exp_data
 {
   double invln2N;
-  double shift;
   double negln2hiN;
   double negln2loN;
   double poly[4]; /* Last four coefficients. */
+  double shift;
   double exp2_shift;
   double exp2_poly[EXP2_POLY_ORDER];
-  double invlog10_2N;
   double neglog10_2hiN;
   double neglog10_2loN;
   double exp10_poly[5];
+  double invlog10_2N;
   uint64_t tab[2*(1 << EXP_TABLE_BITS)];
 } __exp_data attribute_hidden;