Commit Graph

1418 Commits

Author SHA1 Message Date
Matheus Castanho
40d055a2dd powerpc: Update libm-test-ulps
Generated with 'make regen-ulps'

Tested on powerpc, powerpc64, and powerpc64le
2021-03-02 10:08:07 -03:00
Florian Weimer
9fc813e1a3 Implement <unwind-link.h> for dynamically loading the libgcc_s unwinder
This will be used to consolidate the libgcc_s access for backtrace
and pthread_cancel.

Unlike the existing backtrace implementations, it provides some
hardening based on pointer mangling.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2021-03-01 15:58:01 +01:00
Florian Weimer
035c012e32 Reduce the statically linked startup code [BZ #23323]
It turns out the startup code in csu/elf-init.c has a perfect pair of
ROP gadgets (see Marco-Gisbert and Ripoll-Ripoll, "return-to-csu: A
New Method to Bypass 64-bit Linux ASLR").  These functions are not
needed in dynamically-linked binaries because DT_INIT/DT_INIT_ARRAY
are already processed by the dynamic linker.  However, the dynamic
linker skipped the main program for some reason.  For maximum
backwards compatibility, this is not changed, and instead, the main
map is consulted from __libc_start_main if the init function argument
is a NULL pointer.

For statically linked binaries, the old approach based on linker
symbols is still used because there is nothing else available.

A new symbol version __libc_start_main@@GLIBC_2.34 is introduced because
new binaries running on an old libc would not run their ELF
constructors, leading to difficult-to-debug issues.
2021-02-25 12:13:02 +01:00
Raoni Fassina Firmino
5ee506ed35 powerpc64: Workaround sigtramp vdso return call
A not so recent kernel change[1] changed how the trampoline
`__kernel_sigtramp_rt64` is used to call signal handlers.

This was exposed on the test misc/tst-sigcontext-get_pc

Before kernel 5.9, the kernel set LR to the trampoline address and
jumped directly to the signal handler, and at the end the signal
handler, as any other function, would `blr` to the address set.  In
other words, the trampoline was executed just at the end of the signal
handler and the only thing it did was call sigreturn.  But since
kernel 5.9 the kernel set CTRL to the signal handler and calls to the
trampoline code, the trampoline then `bctrl` to the address in CTRL,
setting the LR to the next instruction in the middle of the
trampoline, when the signal handler returns, the rest of the
trampoline code executes the same code as before.

Here is the full trampoline code as of kernel 5.11.0-rc5 for
reference:

    V_FUNCTION_BEGIN(__kernel_sigtramp_rt64)
    .Lsigrt_start:
            bctrl   /* call the handler */
            addi    r1, r1, __SIGNAL_FRAMESIZE
            li      r0,__NR_rt_sigreturn
            sc
    .Lsigrt_end:
    V_FUNCTION_END(__kernel_sigtramp_rt64)

This new behavior breaks how `backtrace()` uses to detect the
trampoline frame to correctly reconstruct the stack frame when it is
called from inside a signal handling.

This workaround rely on the fact that the trampoline code is at very
least two (maybe 3?) instructions in size (as it is in the 32 bits
version, only on `li` and `sc`), so it is safe to check the return
address be in the range __kernel_sigtramp_rt64 .. + 4.

[1] subject: powerpc/64/signal: Balance return predictor stack in signal trampoline
    commit: 0138ba5783ae0dcc799ad401a1e8ac8333790df9
    url: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0138ba5783ae0dcc799ad401a1e8ac8333790df9

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-01-28 13:57:50 -03:00
Florian Weimer
527c89cd32 powerpc64: Select POWER9 machine for the scv instruction
It is not available with the baseline ISA.

Fixes commit 68ab82f566
("powerpc: Runtime selection between sc and scv for syscalls").

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2021-01-22 10:45:27 +01:00
Joseph Myers
a031b3abad Update powerpc-nofpu libm-test-ulps. 2021-01-18 20:21:07 +00:00
Paul Eggert
2b778ceb40 Update copyright dates with scripts/update-copyrights
I used these shell commands:

../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")

and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 6694 files FOO.
I then removed trailing white space from benchtests/bench-pthread-locks.c
and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this
diagnostic from Savannah:
remote: *** pre-commit check failed ...
remote: *** error: lines with trailing whitespace found
remote: error: hook declined to update refs/heads/master
2021-01-02 12:17:34 -08:00
Matheus Castanho
68ab82f566 powerpc: Runtime selection between sc and scv for syscalls
Linux kernel v5.9 added support for system calls using the scv
instruction for POWER9 and later.  The new codepath provides better
performance (see below) if compared to using sc.  For the
foreseeable future, both sc and scv mechanisms will co-exist, so this
patch enables glibc to do a runtime check and use scv when it is
available.

Before issuing the system call to the kernel, we check hwcap2 in the TCB
for PPC_FEATURE2_SCV to see if scv is supported by the kernel.  If not,
we fallback to sc and keep the old behavior.

The kernel implements a different error return convention for scv, so
when returning from a system call we need to handle the return value
differently depending on the instruction we used to enter the kernel.

For syscalls implemented in ASM, entry and exit are implemented by
different macros (PSEUDO and PSEUDO_RET, resp.), which may be used in
sequence (e.g. for templated syscalls) or with other instructions in
between (e.g. clone).  To avoid accessing the TCB a second time on
PSEUDO_RET to check which instruction we used, the value read from
hwcap2 is cached on a non-volatile register.

This is not needed when using INTERNAL_SYSCALL macro, since entry and
exit are bundled into the same inline asm directive.

The dynamic loader may issue syscalls before the TCB has been setup
so it always uses sc with no extra checks.  For the static case, there
is no compile-time way to determine if we are inside startup code,
so we also check the value of the thread pointer before effectively
accessing the TCB.  For such situations in which the availability of
scv cannot be determined, sc is always used.

Support for scv in syscalls implemented in their own ASM file (clone and
vfork) will be added later. For now simply use sc as before.

Average performance over 1M calls for each syscall "type":
  - stat: C wrapper calling INTERNAL_SYSCALL
  - getpid: templated ASM syscall
  - syscall: call to gettid using syscall function

  Standard:
     stat : 1.573445 us / ~3619 cycles
   getpid : 0.164986 us / ~379 cycles
  syscall : 0.162743 us / ~374 cycles

  With scv:
     stat : 1.537049 us / ~3535 cycles <~ -84 cycles  / -2.32%
   getpid : 0.109923 us / ~253 cycles  <~ -126 cycles / -33.25%
  syscall : 0.116410 us / ~268 cycles  <~ -106 cycles / -28.34%

Tested on powerpc, powerpc64, powerpc64le (with and without scv)

Tested-by: Lucas A. M. Magalhães <lamm@linux.ibm.com>
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2020-12-30 18:26:25 -03:00
Florian Weimer
2aa8ec7dd7 powerpc: Regenerate ulps
For new inputs added in commit cad5ad81d2,
as seen on a POWER8 system.
2020-12-22 19:22:44 +01:00
Florian Weimer
4c38c1a229 powerpc64le: Add glibc-hwcaps support
The "power10" and "power9" subdirectories are selected in a way
that matches the -mcpu=power10 and -mcpu=power9 options of GCC.
2020-12-04 14:50:49 +01:00
Paul E. Murphy
33fc34521d powerpc64le: ifunc select *f128 routines in multiarch mode
Programatically generate simple wrappers for interesting libm *f128
objects.  Selected functions are transcendental functions or
those with trivial compiler builtins.  This can result in a 2-3x
speedup (e.g logf128 and expf128).

A second set of implementation files are generated which include
the first implementation encountered along the search path.  This
usually works, except when a wrapper is overriden and makefile
search order slightly diverges from include order.  Likewise,
wrapper object files are created for each generated file.  These
hold the ifunc selection routines which export ABI.

Next, several shared headers are intercepted to control renaming of
asm function redirects are used first, and sometimes macro renames
if the former is impractical.

Notably, if the request machine supports hardware IEEE128 (i.e POWER9
and newer) this ifunc machinery is disabled.  Likewise existing
ifunc support for float128 is consolidated into this (e.g sqrtf128
and fmaf128).

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2020-11-30 09:56:14 -06:00
Matheus Castanho
1e0a7fd099 powerpc: Make PT_THREAD_POINTER available to assembly code
PT_THREAD_POINTER is currenty defined inside a #ifndef __ASSEMBLER__ block, but
its usage should not be limited to C code, as it can be useful when accessing
the TLS from assembly code as well.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2020-11-24 14:15:01 -03:00
Florian Weimer
1daccf403b nptl: Move stack list variables into _rtld_global
Now __thread_gscope_wait (the function behind THREAD_GSCOPE_WAIT,
formerly __wait_lookup_done) can be implemented directly in ld.so,
eliminating the unprotected GL (dl_wait_lookup_done) function
pointer.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2020-11-16 19:33:30 +01:00
Florian Weimer
d5c4cce9c3 powerpc: Eliminate UP macro conditionals
The macro is never defined.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2020-11-13 15:20:07 +01:00
Raphael M Zinsly
7beee7b39a powerpc: Add optimized stpncpy for POWER9
Add stpncpy support into the POWER9 strncpy.

Reviewed-by: Matheus Castanho <msc@linux.ibm.com>
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2020-11-12 13:16:36 -03:00
Raphael M Zinsly
b9d83bf3eb powerpc: Add optimized strncpy for POWER9
Similar to the strcpy P9 optimization, this version uses VSX to improve
performance.

Reviewed-by: Matheus Castanho <msc@linux.ibm.com>
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2020-11-12 13:12:24 -03:00
Szabolcs Nagy
238032ead6 aarch64: enforce >=64K guard size [BZ #26691]
There are several compiler implementations that allow large stack
allocations to jump over the guard page at the end of the stack and
corrupt memory beyond that. See CVE-2017-1000364.

Compilers can emit code to probe the stack such that the guard page
cannot be skipped, but on aarch64 the probe interval is 64K by default
instead of the minimum supported page size (4K).

This patch enforces at least 64K guard on aarch64 unless the guard
is disabled by setting its size to 0.  For backward compatibility
reasons the increased guard is not reported, so it is only observable
by exhausting the address space or parsing /proc/self/maps on linux.

On other targets the patch has no effect. If the stack probe interval
is larger than a page size on a target then ARCH_MIN_GUARD_SIZE can
be defined to get large enough stack guard on libc allocated stacks.

The patch does not affect threads with user allocated stacks.

Fixes bug 26691.
2020-10-02 09:57:44 +01:00
Raphael Moreira Zinsly
3322ecbfe2 powerpc: Protect dl_powerpc_cpu_features on INIT_ARCH() [BZ #26615]
dl_powerpc_cpu_features also needs to be protected by __GLRO to check
for the _rtld_global_ro realocation before accessing it.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2020-09-22 17:45:12 -03:00
Raphael Moreira Zinsly
07f3ecdba6 powerpc: fix ifunc implementation list for POWER9 strlen and stpcpy
__strlen_power9 and __stpcpy_power9 were added to their ifunc lists
using the wrong function names.
2020-09-17 11:00:42 -05:00
Matheus Castanho
c71d13a098 Update powerpc libm-test-ulps
Before this patch, the following tests were failing:

ppc and ppc64:
    FAIL: math/test-ldouble-j0

ppc64le:
    FAIL: math/test-float128-j0
    FAIL: math/test-float64x-j0
    FAIL: math/test-ibm128-j0
    FAIL: math/test-ldouble-j0
2020-09-10 15:52:01 -03:00
Florian Weimer
7650321ce0 powerpc: Fix incorrect cache line size load in memset (bug 26332)
__GLRO loaded the word after the requested variable on big-endian
PowerPC, where LOWORD is 4.  This can cause the memset implement
go wrong because the masking with the cache line size produces
wrong results, particularly if the loaded value happens to be 1.

The __GLRO macro is not used in any place where loading the lower
32-bit word of a 64-bit value is desired, so the +4 offset is always
wrong.

Fixes commit 18363b4f01
("powerpc: Move cache line size to rtld_global_ro") and bug 26332.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2020-08-03 18:07:19 +02:00
Tulio Magno Quites Machado Filho
f6add169c8 powerpc: Fix POWER10 selection
Add a line that was missing from a previous commit.
Without increasing str, the null-byte is not validated, and
_dl_string_platform returns -1.

Fixes: d2ba3677da ("powerpc: Add support for POWER10")

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2020-07-21 18:01:39 -03:00
Paul E. Murphy
c79607a474 powerpc64le: guarantee a .gnu.attributes section [BZ #26220]
Upstream GCC 11 development is now building the ibm128 runtime
support (in libgcc) without a .gnu.attributes section on ppc64le.
Ensure we have one to replace by building one ibm128 file in
libc and libm with attributes.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2020-07-21 09:03:01 -05:00
Joseph Myers
469c03907b Update powerpc-nofpu libm-test-ulps. 2020-07-20 20:16:25 +00:00
Tulio Magno Quites Machado Filho
7c7bcf3634 powerpc64: Fix calls when r2 is not used [BZ #26173]
Teach the linker that __mcount_internal, __sigjmp_save_symbol,
__syscall_error and __GI_exit do not use r2, so that it does not need to
recover r2 after the call.

Test at configure time if the assembler supports @notoc and define
USE_PPC64_NOTOC.
2020-07-10 19:41:06 -03:00
Tulio Magno Quites Machado Filho
d2ba3677da powerpc: Add support for POWER10
1. Add the directories to hold POWER10 files.

2. Add support to select POWER10 libraries based on AT_PLATFORM.

3. Let submachine=power10 be set automatically.
2020-06-29 10:08:38 -03:00
Tulio Magno Quites Machado Filho
ae725e3f9c powerpc: Add new hwcap values
Linux commit ID ee988c11acf6f9464b7b44e9a091bf6afb3b3a49 reserved 2 new
bits in AT_HWCAP2:
 - PPC_FEATURE2_ARCH_3_1 indicates the availability of the POWER ISA
   3.1;
 - PPC_FEATURE2_MMA indicates the availability of the Matrix-Multiply
   Assist facility.
2020-06-23 18:15:06 -03:00
Adhemerval Zanella
169ea8f928 powerpc: Use sqrt{f} builtin
The powerpc sqrt implementation is also simplified:

  - the static constants are open coded within the implementation.
  - for !USE_SQRT_BUILTIN the function is implemented directly on
    __ieee754_sqrt (it avoid an superflous extra jump).

Checked on powerpc-linux-gnu and powerpc64le-linux-gnu.
2020-06-22 11:09:49 -03:00
Adhemerval Zanella
e80501a5c9 math: Decompose math-use-builtins.h
Each symbol definitions are moved on a separated file and it
cover all symbol type definitions (float, double, long double,
and float128).

It allows to set support for architectures without the boiler
place of copying default values.

Checked with a build on the affected ABIs.
2020-06-22 11:09:45 -03:00
Paul E. Murphy
b637306d3e powerpc64le: refactor e_sqrtf128.c
Combine both implementations into a single file to allow
building twice with appropriate multiarch support when possible.
2020-06-16 13:50:44 -05:00
Paul E. Murphy
146fea0764 powerpc: Automatic CPU detection in preconfigure
Added a check to detect the CPU value in preconfigure, so that glibc is
built with the correct --with-cpu value.  And move existing checks into
preconfigure.ac.

Co-Authored-By: Carlos Eduardo Seo  <cseo@linux.vnet.ibm.com>
Co-Authored-By: Tulio Magno Quites Machado Filho  <tuliom@linux.vnet.ibm.com>
2020-06-11 17:15:49 -05:00
Paul E. Murphy
a23bd00f9d powerpc64le: add optimized strlen for P9
This started as a trivial change to Anton's rawmemchr.  I got
carried away.  This is a hybrid between P8's asympotically
faster 64B checks with extremely efficient small string checks
e.g <64B (and sometimes a little bit more depending on alignment).

The second trick is to align to 64B by running a 48B checking loop
16B at a time until we naturally align to 64B (i.e checking 48/96/144
bytes/iteration based on the alignment after the first 5 comparisons).
This allieviates the need to check page boundaries.

Finally, explicly use the P7 strlen with the runtime loader when building
P9.  We need to be cautious about vector/vsx extensions here on P9 only
builds.
2020-06-05 15:30:00 -05:00
Paul E. Murphy
6ef4227509 powerpc64le: use common fmaf128 implementation
This defines the macro such that it should behave best on all
supported powerpc targets.  Likewise, this allows us to remove the
ppc64le specific s_fmaf128.c.

I have verified powerpc64le multiarch and powerpc64le power9
no-multiarch builds continue to generate optimize fmaf128.
2020-06-05 15:29:44 -05:00
Adhemerval Zanella
6f10ff02cb powerpc: Fix powerpc64le due a7a3435c9a
The build uses an undefined macro evaluation for fmaf128 build.
For now set USE_FMAL_BUILTIN and USE_FMAF128_BUILTIN to 0.

Checked with a build for:

  powerpc64le-linux-gnu-power9-disable-multi-arch
  powerpc64le-linux-gnu-power9
  powerpc64le-linux-gnu
  powerpc64-linux-gnu-power8
  powerpc64-linux-gnu
  powerpc-linux-gnu-power4
  powerpc-linux-gnu
2020-06-04 09:05:41 -03:00
Vineet Gupta
a7a3435c9a powerpc/fpu: use generic fma functions
Tested with build-many-glibcs for powerpc-linux-gnu

This is a non functional change and powerpc libm before/after was
byte invariant as compared below:

| cd /SCRATCH/vgupta/gnu/install-glibc-A-baseline
| for i in `find . -name libm-2.31.9000.so`; do
|   echo $i; diff $i /SCRATCH/vgupta/gnu/install-glibc-C-reduce-scope/$i ;
|   echo $?;
| done

| ./aarch64-linux-gnu/lib64/libm-2.31.9000.so
| 0
| ./arm-linux-gnueabi/lib/libm-2.31.9000.so
| 0
| ./x86_64-linux-gnu/lib64/libm-2.31.9000.so
| 0
| ./arm-linux-gnueabihf/lib/libm-2.31.9000.so
| 0
| ./riscv64-linux-gnu-rv64imac-lp64/lib64/lp64/libm-2.31.9000.so
| 0
| ./riscv64-linux-gnu-rv64imafdc-lp64/lib64/lp64/libm-2.31.9000.so
| 0
| ./powerpc-linux-gnu/lib/libm-2.31.9000.so
| 0
| ./microblaze-linux-gnu/lib/libm-2.31.9000.so
| 0
| ./nios2-linux-gnu/lib/libm-2.31.9000.so
| 0
| ./hppa-linux-gnu/lib/libm-2.31.9000.so
| 0
| ./s390x-linux-gnu/lib64/libm-2.31.9000.so
| 0

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2020-06-03 10:23:33 -07:00
Anton Blanchard
765de945ef powerpc: Optimized rawmemchr for POWER9
This version uses vector instructions and is up to 60% faster on medium
matches and up to 90% faster on long matches, compared to the POWER7
version. A few examples:

                            __rawmemchr_power9  __rawmemchr_power7
Length   32, alignment  0:   2.27566             3.77765
Length   64, alignment  2:   2.46231             3.51064
Length 1024, alignment  0:  17.3059             32.6678
2020-05-18 17:08:54 -05:00
Anton Blanchard via Libc-alpha
aa70d05632 powerpc: Optimized stpcpy for POWER9
Add stpcpy support to the POWER9 strcpy. This is up to 40% faster on
small strings and up to 90% faster on long relatively unaligned strings,
compared to the POWER8 version. A few examples:

                                        __stpcpy_power9  __stpcpy_power8
Length   20, alignments in bytes  4/ 4:  2.58246          4.8788
Length 1024, alignments in bytes  1/ 6: 24.8186          47.8528
2020-05-18 08:26:22 -05:00
Anton Blanchard via Libc-alpha
3903704850 powerpc: Optimized strcpy for POWER9
This version uses VSX store vector with length instructions and is
significantly faster on small strings and relatively unaligned large
strings, compared to the POWER8 version. A few examples:

                                        __strcpy_power9  __strcpy_power8
Length   16, alignments in bytes  0/ 0: 2.52454          4.62695
Length  412, alignments in bytes  4/ 0: 11.6             22.9185
2020-05-18 08:26:22 -05:00
Paul E. Murphy
4a4db1de2f powerpc64le/power9: guard power9 strcmp against rtld usage [BZ# 25905]
strcmp is used while resolving PLT references.  Vector registers
should not be used during this.  The P9 strcmp makes heavy use of
vector registers, so it should be avoided in rtld.

This prevents quiet vector register corruption when glibc is configured
with --disable-multi-arch and --with-cpu=power9.  This can be seen with
test-float64x-compat_totalordermag during the first call into
totalordermagf64x@GLIBC_2.27.

Add a guard to fallback to the power8 implementation when building
power9 strcmp for libraries other than libc.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2020-05-04 13:27:31 -05:00
Gabriel F. T. Gomes
051be01f6b powerpc64le: Enable support for IEEE long double
On platforms where long double may have two different formats, i.e.: the
same format as double (64-bits) or something else (128-bits), building
with -mlong-double-128 is the default and function calls in the user
program match the name of the function in Glibc.  When building with
-mlong-double-64, Glibc installed headers redirect such calls to the
appropriate function.

Likewise, the internals of glibc are now built against IEEE long double.
However, the only (minimally) notable usage of long double is difftime.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2020-04-30 08:52:08 -05:00
Paul E. Murphy
5c7ccc2983 powerpc64le: blacklist broken GCC compilers (e.g GCC 7.5.0)
GCC 7.5.0 (PR94200) will refuse to compile if both -mabi=% and
-mlong-double-128 are passed on the command line.  Surprisingly,
it will work happily if the latter is not.  For the sake of
maintaining status quo, test for and blacklist such compilers.

Tested with a GCC 8.3.1 and GCC 7.5.0 compiler for ppc64le.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2020-04-30 08:52:08 -05:00
Paul E. Murphy
3a0acbdcc5 powerpc64le: bump binutils version requirement to >= 2.26
This is a small step up from 2.25 which brings in support for
rewriting the .gnu.attributes section of libc/libm.so.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2020-04-30 08:52:08 -05:00
Paul E. Murphy
50545f5aa0 powerpc64le: raise GCC requirement to 7.4 for long double transition
Add compiler feature tests to ensure we can build ieee128 long double.
These test for -mabi=ieeelongdouble, -mno-gnu-attribute, and -Wno-psabi.

Likewise, verify some compiler bugs have been addressed.  These aren't
helpful for building glibc, but may cause test failures when testing
the new long double.  See notes below from Raji.

On powerpc64le, some older compiler versions give error for the function
signbit() for 128-bit floating point types.  This is fixed by PR83862
in gcc 8.0 and backported to gcc6 and gcc7.  This patch adds a test
to check compiler version to avoid compiler errors during make check.

Likewise, test for -mno-gnu-attribute support which was

On powerpc64le, a few files are built on IEEE long double mode
(-mabi=ieeelongdouble), whereas most are built on IBM long double mode
(-mabi=ibmlongdouble, the default for -mlong-double-128).  Since binutils
2.31, linking object files with different long double modes causes
errors similar to:

  ld: libc_pic.a(s_isinfl.os) uses IBM long double,
      libc_pic.a(ieee128-qefgcvt.os) uses IEEE long double.
  collect2: error: ld returned 1 exit status
  make[2]: *** [../Makerules:649: libc_pic.os] Error 1

The warnings are fair and correct, but in order for glibc to have
support for both long double modes on powerpc64le, they have to be
ignored.  This can be accomplished with the use of -mno-gnu-attribute
option when building the few files that require IEEE long double mode.

However, -mno-gnu-attribute is not available in GCC 6, the minimum
version required to build glibc, so this patch adds a test for this
feature in powerpc64le builds, and fails early if it's not available.

Co-Authored-By: Rajalakshmi Srinivasaraghavan  <raji@linux.vnet.ibm.com>
Co-Authored-By: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2020-04-30 08:52:08 -05:00
Tulio Magno Quites Machado Filho
bd6cdfc18c powerpc: Update ULPs and xfail more ibm128 outputs
There are 2 new input values that require to be marked as
xfail-rounding:ibm128-libgcc as they're known to fail because of libgcc
issues with different rounding modes.
Otherwise, the other tests just need an increase in ULP.
2020-04-07 11:41:29 -03:00
Paul E. Murphy
4531ba8ebf powerpc64le: enforce non-specific long double in .gnu.attributes section
We turn off this feature to avoid polluting our shared libary with
a specific value.  However, static libgcc is not under our control,
and has enabled this for ibm128 routines.  This pollutes the
resulting shared libraries with it.

Attach a post-linking hook to replace this section with one crafted
as hard-float + indeterminate ldbl.  This allows IEEE ldbl users to
avoid having to disable the gnu attributes feature which should
protect them from linking ibm ldbl libraries using the gnu attributes
feature.

Currently, this only replaces libc and libm which support both ldbl
formats and rely on application code to explicitly determine which
is to be used.

Strictly speaking, the section could be deleted with minimal lost value.
However correctly set attributes could prove useful for some future change,
and similarly missing attributes.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2020-04-06 10:23:58 -05:00
Paul E. Murphy
8e72163b16 powerpc64le: workaround ieee long double / _Float128 stdc++ bug
-mabi=ieeelongdouble triggers the stdc++ libraries _Float128
support, which then breaks if algorithm is included.  For now,
explicitly disable _Float128 for such tests.

I have opened up GCC BZ 94080 to track this.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2020-04-06 10:23:58 -05:00
Paul E. Murphy
6f82d05034 powerpc64le: Enforce -mabi=ibmlongdouble when -mfloat128 used
I have observed a bug on 7.4.0 whereby __mulkc3 calls are
swapped with __multc3 depending on ABI selection.  For the
sake of being overly cautious, build all _Float128 files
with ibm128 to workaround these compilers.  This has been
noted in GCC BZ 84914, and will not be fixed for GCC 7.

Likewise, non-math files built with _Float128 are assumed
to have ibm long double.  Explicilty preserve this
assumption.

Finally, add some bootstrapping code to avoid applying
these options until IEEE long double is enabled as they
require GCC 7 and above.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
2020-04-06 10:23:58 -05:00
Paul E. Murphy
25ee3931f0 powerpc64le/multiarch: don't generate strong aliases for fmaf128-ppc64
This prevents generating a second alias for __fmaieee128 when
compiling with ldouble == ieee128 redirects.
2020-04-06 10:23:58 -05:00
Raphael Moreira Zinsly
66807aebad powerpc: Add support for fmaf128() in hardware
Adds a POWER9 version of fmaf128 that uses the xsmaddqp
instruction.

Co-authored-by: Tulio Magno Quites Machado Filho  <tuliom@linux.ibm.com>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2020-03-30 18:04:27 -03:00
Adhemerval Zanella
5f34491510 math: Remove fenvinline.h
Similar to string2.h (18b10de7ce) and string3.h (09a596cc2c) this
patch removes the fenvinline.h on all architectures.  Currently
only powerpc implements some optimizations.  This kind of optimization
is better implemented by the compiler (which handles the architecture
ISA transparently).

Also, for the specific optimized powerpc implementation the code is
becoming convoluted and these micro-optimization are hardly wildly
used, even more being a possible hotspot in realword cases
(non-default rounding are used only on specific cases and exception
handling are done most likely only on errors path).  Only x86
implements similar optimization (on fenv.h) also indicates that
these should no be on libc.

The math/test-fenv already covers all math/test-fenvinline tests,
so it is safe to remove it.

The powerpc fegetround optimization is moved to internal
fenv_libc.h.

The BZ#94193 [1] the corresponding GCC bug for adding replacements
for these on powerpc.

Checked on x86_64-linux-gnu and powerpc64le-linux-gnu.

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94193
2020-03-30 10:52:25 -03:00