glibc/sysdeps/ieee754/dbl-64
Adhemerval Zanella Netto 34b9f8bc17 math: Improve fmod
This uses a new algorithm similar to the one proposed earlier [1].
With x = mx * 2^ex and y = my * 2^ey (mx, my, ex, ey being integers),
the simplest implementation is:

   mx * 2^ex == 2 * mx * 2^(ex - 1)

   while (ex > ey)
     {
       mx *= 2;
       --ex;
       mx %= my;
     }
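
The loop above can be written as a small, testable C function.  This is
only a sketch on plain integers: the name mod_shift1 and its calling
convention are hypothetical, and it leaves out sign handling, special
values, and the conversion back to a double.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns r such that
   (mx * 2^ex) mod (my * 2^ey) == r * 2^ey.
   Requires my != 0, my < 2^63 and ex >= ey.  */
static uint64_t
mod_shift1 (uint64_t mx, int ex, uint64_t my, int ey)
{
  assert (my != 0 && (my >> 63) == 0 && ex >= ey);

  mx %= my;                     /* Start from mx < my.  */
  while (ex > ey)
    {
      mx *= 2;                  /* mx < my < 2^63, so this cannot overflow.  */
      --ex;
      mx %= my;
    }
  return mx;
}
```

For example, mod_shift1 (5, 3, 3, 0) reduces 5 * 2^3 = 40 modulo 3 and
returns 1.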

With mx/my being the mantissas of double floating-point values, each step
of the argument reduction can consume 11 bits instead of one (the bit
width of uint64_t minus MANTISSA_WIDTH and the sign bit, 64 - 52 - 1):

   while (ex > ey)
     {
       mx <<= 11;
       ex -= 11;
       mx %= my;
     }
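
A minimal sketch of the 11-bit step on plain integers follows.  The name
mod_shift11 is hypothetical, and the sketch adds one detail the loop
above leaves implicit: when fewer than 11 exponent steps remain, a final
shorter shift finishes the reduction.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns r such that
   (mx * 2^ex) mod (my * 2^ey) == r * 2^ey.
   Requires 0 < my < 2^53 (a double mantissa with its implicit bit)
   and ex >= ey.  */
static uint64_t
mod_shift11 (uint64_t mx, int ex, uint64_t my, int ey)
{
  assert (my != 0 && (my >> 53) == 0 && ex >= ey);

  int diff = ex - ey;
  mx %= my;                     /* Now mx < my < 2^53.  */
  while (diff >= 11)
    {
      mx <<= 11;                /* mx < 2^53, so mx << 11 fits in 64 bits.  */
      diff -= 11;
      mx %= my;
    }
  mx <<= diff;                  /* Fewer than 11 steps remain.  */
  mx %= my;
  return mx;
}
```

For example, mod_shift11 (1, 100, 7, 0) computes 2^100 mod 7 with ten
modulo operations in the reduction instead of one hundred, and returns 2.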

The implementation uses builtin clz and ctz, along with shifts to
convert hx/hy back to doubles.  Unlike the original patch, this patch
assumes the modulo/divide operation is slow, so it uses multiplication
by inverse values.
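
The idea can be sketched on plain integers as below.  This is not the
code in e_fmod.c: the name mod_inverse_steps is hypothetical, the GCC and
Clang builtins __builtin_clzll and __builtin_ctzll are assumed to be
available, and the sketch skips signs, special values, and the conversion
back to a double.  It divides once to get an approximate inverse of my,
then estimates each partial quotient with a multiplication and fixes the
remainder up with a few subtractions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns r such that
   (mx * 2^ex) mod (my * 2^ey) == r * 2^ey.
   Requires 0 < my < 2^53 and ex >= ey.  */
static uint64_t
mod_inverse_steps (uint64_t mx, int ex, uint64_t my, int ey)
{
  assert (my != 0 && (my >> 53) == 0 && ex >= ey);
  int diff = ex - ey;

  /* Move trailing zero bits of my into its exponent (at most diff of
     them), which leaves more headroom for each shift below.  */
  int tz = __builtin_ctzll (my);
  int drop = tz < diff ? tz : diff;
  my >>= drop;
  diff -= drop;

  int step = __builtin_clzll (my);      /* my < 2^(64 - step), step >= 11.  */
  uint64_t inv = UINT64_MAX / my;       /* The only division by my.  */
  mx %= my;                             /* Start from mx < my.  */

  while (diff > 0)
    {
      int s = diff < step ? diff : step;
      /* Estimate of (mx << s) / my; it never exceeds the true quotient
         and is at most 2 below it.  */
      uint64_t q = (mx * inv) >> (64 - s);
      mx = (mx << s) - q * my;
      while (mx >= my)                  /* Correct the small estimate error.  */
        mx -= my;
      diff -= s;
    }
  return mx << drop;                    /* Rescale to the original ey.  */
}
```

For example, mod_inverse_steps (5, 4, 12, 0) reduces 80 modulo 12 and
returns 8.  The trade-off is one division up front in exchange for a
multiplication, a shift, and a short correction loop per step.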

I see the following performance improvements using fmod benchtests
(only the 'mean' result is shown):

  Architecture     | Input           | master   | patch
  -----------------|-----------------|----------|--------
  x86_64 (Ryzen 9) | subnormals      | 19.1584  | 12.5049
  x86_64 (Ryzen 9) | normal          | 1016.51  | 296.939
  x86_64 (Ryzen 9) | close-exponents | 18.4428  | 16.0244
  aarch64 (N1)     | subnormal       | 11.153   | 6.81778
  aarch64 (N1)     | normal          | 528.649  | 155.62
  aarch64 (N1)     | close-exponents | 11.4517  | 8.21306

I also see similar improvements on arm-linux-gnueabihf when running on
the N1 aarch64 chips, where it uses a lot of soft-fp implementations (for
modulo, clz, ctz, and multiplication):

  Architecture     | Input           | master   | patch
  -----------------|-----------------|----------|--------
  armhf (N1)       | subnormal       | 15.908   | 15.1083
  armhf (N1)       | normal          | 837.525  | 244.833
  armhf (N1)       | close-exponents | 16.2111  | 21.8182

Instead of the math_private.h definitions, I used math_config.h, which
is used by the newer math implementations.
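
As an illustration of the style of helper involved, the sketch below
shows bit-cast functions in the spirit of asuint64 and asdouble.  It is
not the header's actual text: the memcpy-based bodies and the two field
accessors (biased_exponent, fraction_bits) are assumptions made for this
example, and an IEEE 754 binary64 double is assumed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert (sizeof (double) == sizeof (uint64_t),
               "this sketch assumes a 64-bit double");

/* Reinterpret the bits of a double as a 64-bit integer.  */
static uint64_t
asuint64 (double f)
{
  uint64_t i;
  memcpy (&i, &f, sizeof i);
  return i;
}

/* Reinterpret a 64-bit integer as a double.  */
static double
asdouble (uint64_t i)
{
  double f;
  memcpy (&f, &i, sizeof f);
  return f;
}

/* Hypothetical accessor: the 11-bit biased exponent field.  */
static int
biased_exponent (double f)
{
  return (int) ((asuint64 (f) >> 52) & 0x7ff);
}

/* Hypothetical accessor: the 52-bit fraction field (no implicit bit).  */
static uint64_t
fraction_bits (double f)
{
  return asuint64 (f) & ((UINT64_C (1) << 52) - 1);
}
```

With these, biased_exponent (1.0) is 1023 and fraction_bits (1.5) has
only its top bit set.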

Co-authored-by: kirill <kirill.okhotnikov@gmail.com>

[1] https://sourceware.org/pipermail/libc-alpha/2020-November/119794.html
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
2023-04-03 16:36:24 -03:00
asincos.tbl Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
atnat2.h Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
atnat.h Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
branred.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
branred.h Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
dbl2mpn.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
dla.h Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
e_acos.c
e_acosh.c
e_asin.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
e_atan2.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
e_atanh.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
e_cosh.c
e_exp2.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
e_exp10.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
e_exp_data.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
e_exp.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
e_fmod.c math: Improve fmod 2023-04-03 16:36:24 -03:00
e_gamma_r.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
e_hypot.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
e_ilogb.c
e_j0.c
e_j1.c
e_jn.c
e_lgamma_r.c
e_log2_data.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
e_log2.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
e_log10.c
e_log_data.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
e_log.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
e_pow_log_data.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
e_pow.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
e_remainder.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
e_sinh.c
e_sqrt.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
gamma_product.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
gamma_productf.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
k_rem_pio2.c
k_tan.c
lgamma_neg.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
lgamma_product.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
Makefile
math_config.h math: Improve fmod 2023-04-03 16:36:24 -03:00
math_err.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
mpn2dbl.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
mydefs.h Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
powtwo.tbl Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
root.tbl Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
s_asinh.c
s_atan.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
s_cbrt.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
s_ceil.c
s_copysign.c
s_cos.c
s_erf.c
s_expm1.c
s_f32xaddf64.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
s_f32xdivf64.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
s_f32xfmaf64.c
s_f32xmulf64.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
s_f32xsqrtf64.c
s_f32xsubf64.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
s_fabs.c
s_fadd.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
s_fdiv.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
s_ffma.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
s_finite.c
s_floor.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
s_fma.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
s_fmaf.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
s_fmul.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
s_fpclassify.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
s_frexp.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
s_fromfp_main.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
s_fromfp.c
s_fromfpx.c
s_fsqrt.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
s_fsub.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
s_getpayload.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
s_isinf.c
s_isnan.c
s_issignaling.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
s_llrint.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
s_llround.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
s_log1p.c
s_logb.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
s_lrint.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
s_lround.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
s_modf.c
s_nearbyint.c
s_nexttoward.c
s_nextup.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
s_remquo.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
s_rint.c
s_round.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
s_roundeven.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
s_scalbln.c
s_scalbn.c
s_setpayload_main.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
s_setpayload.c
s_setpayloadsig.c
s_signbit.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
s_sin.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
s_sincos.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
s_tan.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
s_tanh.c
s_totalorder.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
s_totalordermag.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
s_trunc.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
s_ufromfp.c
s_ufromfpx.c
sincostab.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
uasncs.h Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
uatan.tbl Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
urem.h Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
usncs.h Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
utan.h Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
utan.tbl Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
w_exp2.c
w_exp.c
w_hypot.c
w_log2.c
w_log.c
w_pow.c
x2y2m1.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
x2y2m1f.c Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00