Now __thread_gscope_wait (the function behind THREAD_GSCOPE_WAIT,
formerly __wait_lookup_done) can be implemented directly in ld.so,
eliminating the unprotected GL (dl_wait_lookup_done) function
pointer.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
There are several compiler implementations that allow large stack
allocations to jump over the guard page at the end of the stack and
corrupt memory beyond that. See CVE-2017-1000364.
Compilers can emit code to probe the stack such that the guard page
cannot be skipped, but on aarch64 the probe interval is 64K by default
instead of the minimum supported page size (4K).
This patch enforces at least 64K guard on aarch64 unless the guard
is disabled by setting its size to 0. For backward compatibility
reasons the increased guard is not reported, so it is only observable
by exhausting the address space or parsing /proc/self/maps on linux.
On other targets the patch has no effect. If the stack probe interval
is larger than a page size on a target then ARCH_MIN_GUARD_SIZE can
be defined to get large enough stack guard on libc allocated stacks.
The patch does not affect threads with user allocated stacks.
Fixes bug 26691.
The arm string/tst-memmove-overflow XFAIL has been added in commit
eca1b23332 ("arm: XFAIL string/tst-memmove-overflow due to bug 25620")
as a way to reproduce the reported bug.
Now that this bug has been fixed in commits 79a4fa341b ("arm:
CVE-2020-6096: fix memcpy and memmove for negative length [BZ #25620]")
and beea361050 ("arm: CVE-2020-6096: Fix multiarch memcpy for negative
length [BZ #25620]"), let's remove the XFAIL.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Using C code allows the compiler to add target specific object file
markings based on CFLAGS.
The arm specific abi-note.S is removed and similar object file fix
up will be avoided on AArch64 with standard branch protection.
Unsigned branch instructions could be used for r2 to fix the wrong
behavior when a negative length is passed to memcpy.
This commit fixes the armv7 version.
Unsigned branch instructions could be used for r2 to fix the wrong
behavior when a negative length is passed to memcpy and memmove.
This commit fixes the generic arm implementation of memcpy amd memmove.
This consolidates the copy-pasted arch specific semaphore header into
single version (based on s390) which suffices 32-bit and and 64-bit
arch/ABI based on the canonical WORDSIZE.
For now I've left out arches which use alternate defines to choose for
32 vs 64-bit builds (aarch64, mips) which in theory can also use the same
header.
Passes build-many for
aarch64-linux-gnu arm-linux-gnueabi arm-linux-gnueabihf
riscv64-linux-gnu-rv64imac-lp64 riscv64-linux-gnu-rv64imafdc-lp64
x86_64-linux-gnu microblaze-linux-gnu nios2-linux-gnu
Suggested-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Most gmp-mparam.h headers in glibc define various macros to the same
values they would be defined to by the generic version of that header,
plus macros IEEE_DOUBLE_BIG_ENDIAN or IEEE_DOUBLE_MIXED_ENDIAN related
to the representation of double. The latter macros are in turn only
used in gmp-impl.h to define union ieee_double_extract, which is not
used in glibc. Thus all of these headers, except for the generic one
and those that define _LONG_LONG_LIMB for ILP32 configurations with
64-bit registers, are redundant, and this patch removes them.
Tested with build-many-glibcs.py that installed stripped shared
libraries are unchanged by this patch.
With mathinline removal there is no need to keep building and testing
inline math tests.
The gen-libm-tests.py support to generate ULP_I_* is removed and all
libm-test-ulps files are updated to longer have the
i{float,double,ldouble} entries. The support for no-test-inline is
also removed from both gen-auto-libm-tests and the
auto-libm-test-out-* were regenerated.
Checked on x86_64-linux-gnu and i686-linux-gnu.
The commit "arm: Split BE/LE abilist"
(1673ba87fe) changed the soft-fp order for
ARM selection when __SOFTFP__ is defined by the compiler.
On 2.30 the sysdeps order is:
2.30
sysdeps/unix/sysv/linux/arm
sysdeps/arm/nptl
sysdeps/unix/sysv/linux
sysdeps/nptl
sysdeps/pthread
sysdeps/gnu
sysdeps/unix/inet
sysdeps/unix/sysv
sysdeps/unix/arm
sysdeps/unix
sysdeps/posix
sysdeps/arm/nofpu
sysdeps/ieee754/soft-fp
sysdeps/arm
sysdeps/wordsize-32
sysdeps/ieee754/flt-32
sysdeps/ieee754/dbl-64
sysdeps/ieee754
sysdeps/generic
While on master is:
sysdeps/unix/sysv/linux/arm/le
sysdeps/unix/sysv/linux/arm
sysdeps/arm/nptl
sysdeps/unix/sysv/linux
sysdeps/nptl
sysdeps/pthread
sysdeps/gnu
sysdeps/unix/inet
sysdeps/unix/sysv
sysdeps/unix/arm
sysdeps/unix
sysdeps/posix
sysdeps/arm/le
sysdeps/arm
sysdeps/wordsize-32
sysdeps/ieee754/flt-32
sysdeps/ieee754/dbl-64
sysdeps/arm/nofpu
sysdeps/ieee754/soft-fp
sysdeps/ieee754
sysdeps/generic
It make the build select some routines (fadd, fdiv, fmul, fsub, and fma)
on ieee754/flt-32 and ieee754/dbl-64 that requires fenv support to be
correctly rounded which in turns lead to math failures since the
__SOFTFP__ does not have fenv support.
With this patch the order is now:
sysdeps/unix/sysv/linux/arm/le
sysdeps/unix/sysv/linux/arm
sysdeps/arm/nptl
sysdeps/unix/sysv/linux
sysdeps/nptlsysdeps/pthread
sysdeps/gnu
sysdeps/unix/inet
sysdeps/unix/sysv
sysdeps/unix/arm
sysdeps/unix
sysdeps/posix
sysdeps/arm/le/nofpu
sysdeps/arm/nofpu
sysdeps/ieee754/soft-fp
sysdeps/arm/le
sysdeps/arm
sysdeps/wordsize-32
sysdeps/ieee754/flt-32
sysdeps/ieee754/dbl-64
sysdeps/ieee754
sysdeps/generic
Checked on arm-linux-gnuaebi.
This supersedes the init_array sysdeps directory. It allows us to
check for ELF_INITFINI in both C and assembler code, and skip DT_INIT
and DT_FINI processing completely on newer architectures.
A new header file is needed because <dl-machine.h> is incompatible
with assembler code. <sysdep.h> is compatible with assembler code,
but it cannot be included in all assembler files because on some
architectures, it redefines register names, and some assembler files
conflict with that.
<elf-initfini.h> is replicated for legacy architectures which need
DT_INIT/DT_FINI support. New architectures follow the generic default
and disable it.
This patch adds a new macro, libm_alias_finite, to define all _finite
symbol. It sets all _finite symbol as compat symbol based on its first
version (obtained from the definition at built generated first-versions.h).
The <fn>f128_finite symbols were introduced in GLIBC 2.26 and so need
special treatment in code that is shared between long double and float128.
It is done by adding a list, similar to internal symbol redifinition,
on sysdeps/ieee754/float128/float128_private.h.
Alpha also needs some tricky changes to ensure we still emit 2 compat
symbols for sqrt(f).
Passes buildmanyglibc.
Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
This commit adds missing skip_ifunc checks to aarch64, arm, i386,
sparc, and x86_64. A new test case ensures that IRELATIVE IFUNC
resolvers do not run in various diagnostic modes of the dynamic
loader.
Reviewed-By: Szabolcs Nagy <szabolcs.nagy@arm.com>
This patch adds a default pthread-offsets.h based on default
thread definitions from struct_mutex.h and struct_rwlock.h.
The idea is to simplify new ports inclusion.
Checked with a build on affected abis.
Change-Id: I7785a9581e651feb80d1413b9e03b5ac0452668a
This patch adds a default pthreadtypes-arch.h, the idea is to simpify
new ports inclusion and an override is required only if the architecture
adds some arch-specific extensions or requirement.
The default values on the new generic header are based on current
architecture define value and they are not optimal compared to current
code requirements as below.
- On 64 bits __SIZEOF_PTHREAD_BARRIER_T is defined as 32 while is
sizeof (struct pthread_barrier) is 20 bytes.
- On 32 bits __SIZEOF_PTHREAD_ATTR_T is defined as 36 while
sizeof (struct pthread_attr) is 32.
The default values are not changed so the generic header could be
used by some architectures.
Checked with a build on affected abis.
Change-Id: Ie0cd586258a2650f715c1af0c9fe4e7063b0409a
This patch adds a new generic __pthread_rwlock_arch_t definition meant
to be used by new ports. Its layout mimics the current usage on some
64 bits ports and it allows some ports to use the generic definition.
The arch __pthread_rwlock_arch_t definition is moved from
pthreadtypes-arch.h to another arch-specific header (struct_rwlock.h).
Also the static intialization macro for pthread_rwlock_t is set to use
an arch defined on (__PTHREAD_RWLOCK_INITIALIZER) which simplifies its
implementation.
The default pthread_rwlock_t layout differs from current ports with:
1. Internal layout is the same for 32 bits and 64 bits.
2. Internal flag is an unsigned short so it should not required
additional padding to align for word boundary (if it is the case
for the ABI).
Checked with a build on affected abis.
Change-Id: I776a6a986c23199929d28a3dcd30272db21cd1d0
The current way of defining the common mutex definition for POSIX and
C11 on pthreadtypes-arch.h (added by commit 06be6368da) is
not really the best options for newer ports. It requires define some
misleading flags that should be always defined as 0
(__PTHREAD_COMPAT_PADDING_MID and __PTHREAD_COMPAT_PADDING_END), it
exposes options used solely for linuxthreads compat mode
(__PTHREAD_MUTEX_USE_UNION and __PTHREAD_MUTEX_NUSERS_AFTER_KIND), and
requires newer ports to explicit define them (adding more boilerplate
code).
This patch adds a new default __pthread_mutex_s definition meant to
be used by newer ports. Its layout mimics the current usage on both
32 and 64 bits ports and it allows most ports to use the generic
definition. Only ports that use some arch-specific definition (such
as hardware lock-elision or linuxthreads compat) requires specific
headers.
For 32 bit, the generic definitions mimic the other 32-bit ports
of using an union to define the fields uses on adaptive and robust
mutexes (thus not allowing both usage at same time) and by using a
single linked-list for robust mutexes. Both decisions seemed to
follow what recent ports have done and make the resulting
pthread_mutex_t/mtx_t object smaller.
Also the static intialization macro for pthread_mutex_t is set to use
a macro __PTHREAD_MUTEX_INITIALIZER where the architecture can redefine
in its struct_mutex.h if it requires additional fields to be
initialized.
Checked with a build on affected abis.
Change-Id: I30a22c3e3497805fd6e52994c5925897cffcfe13
The new rwlock implementation added by cc25c8b4c1 (2.25) removed
support for lock-elision. This patch removes remaining the
arch-specific unused definitions.
Checked with a build against all affected ABIs.
Change-Id: I5dec8af50e3cd56d7351c52ceff4aa3771b53cd6
This patch new build tests to check for internal fields offsets for
internal pthread_rwlock_t definition. Althoug the '__data.__flags'
field layout should be preserved due static initializators, the patch
also adds tests for the futexes that may be used in a shared memory
(although using different libc version in such scenario is not really
supported).
Checked with a build against all affected ABIs.
Change-Id: Iccc103d557de13d17e4a3f59a0cad2f4a640c148
The offsets of pthread_mutex_t __data.__nusers, __data.__spins,
__data.elision, __data.list are not required to be constant over
the releases. Only the __data.__kind is used for static
initializers.
This patch also adds an additional size check for __data.__kind.
Checked with a build against affected ABIs.
Change-Id: I7a4e48cc91b4c4ada57e9a5d1b151fb702bfaa9f
It adds the missing Implies for armv7, armv6, armv6t2 after the
commit 1673ba87fe. Without the Implies a build with the
compiler targeting the aforementioned architecture does not select
the arch-specific optimization including the ifunc selectors.
I checked with a build against armv5, armv6, armv6t2, armv7, and
armv7-neon for both LE and BE. For armv6 and armv7 I also checked
that both sysdeps selection and the resulting implementation built
is the expected ones.
With only two exceptions (sys/types.h and sys/param.h, both of which
historically might have defined BYTE_ORDER) the public headers that
include <endian.h> only want to be able to test __BYTE_ORDER against
__*_ENDIAN.
This patch creates a new bits/endian.h that can be included by any
header that wants to be able to test __BYTE_ORDER and/or
__FLOAT_WORD_ORDER against the __*_ENDIAN constants, or needs
__LONG_LONG_PAIR. It only defines macros in the implementation
namespace.
The existing bits/endian.h (which could not be included independently
of endian.h, and only defines __BYTE_ORDER and maybe __FLOAT_WORD_ORDER)
is renamed to bits/endianness.h. I also took the opportunity to
canonicalize the form of this header, which we are stuck with having
one copy of per architecture. Since they are so short, this means git
doesn’t understand that they were renamed from existing headers, sigh.
endian.h itself is a nonstandard header and its only remaining use
from a standard header is guarded by __USE_MISC, so I dropped the
__USE_MISC conditionals from around all of the public-namespace things
it defines. (This means, an application that requests strict library
conformance but includes endian.h will still see the definition of
BYTE_ORDER.)
A few changes to specific bits/endian(ness).h variants deserve
mention:
- sysdeps/unix/sysv/linux/ia64/bits/endian.h is moved to
sysdeps/ia64/bits/endianness.h. If I remember correctly, ia64 did
have selectable endianness, but we have assembly code in
sysdeps/ia64 that assumes it’s little-endian, so there is no reason
to treat the ia64 endianness.h as linux-specific.
- The C-SKY port does not fully support big-endian mode, the compile
will error out if __CSKYBE__ is defined.
- The PowerPC port had extra logic in its bits/endian.h to detect a
broken compiler, which strikes me as unnecessary, so I removed it.
- The only files that defined __FLOAT_WORD_ORDER always defined it to
the same value as __BYTE_ORDER, so I removed those definitions.
The SH bits/endian(ness).h had comments inconsistent with the
actual setting of __FLOAT_WORD_ORDER, which I also removed.
- I *removed* copyright boilerplate from the few bits/endian(ness).h
headers that had it; these files record a single fact in a fashion
dictated by an external spec, so I do not think they are copyrightable.
As long as I was changing every copy of ieee754.h in the tree, I
noticed that only the MIPS variant includes float.h, because it uses
LDBL_MANT_DIG to decide among three different versions of
ieee854_long_double. This patch makes it not include float.h when
GCC’s intrinsic __LDBL_MANT_DIG__ is available.
* string/endian.h: Unconditionally define LITTLE_ENDIAN,
BIG_ENDIAN, PDP_ENDIAN, and BYTE_ORDER. Condition byteswapping
macros only on !__ASSEMBLER__. Move the definitions of
__BIG_ENDIAN, __LITTLE_ENDIAN, __PDP_ENDIAN, __FLOAT_WORD_ORDER,
and __LONG_LONG_PAIR to...
* string/bits/endian.h: ...this new file, which includes
the renamed header bits/endianness.h for the definition of
__BYTE_ORDER and possibly __FLOAT_WORD_ORDER.
* string/Makefile: Install bits/endianness.h.
* include/bits/endian.h: New wrapper.
* bits/endian.h: Rename to bits/endianness.h.
Add multiple-include guard. Rewrite the comment explaining what
the machine-specific variants of this file should do.
* sysdeps/unix/sysv/linux/ia64/bits/endian.h:
Move to sysdeps/ia64.
* sysdeps/aarch64/bits/endian.h
* sysdeps/alpha/bits/endian.h
* sysdeps/arm/bits/endian.h
* sysdeps/csky/bits/endian.h
* sysdeps/hppa/bits/endian.h
* sysdeps/ia64/bits/endian.h
* sysdeps/m68k/bits/endian.h
* sysdeps/microblaze/bits/endian.h
* sysdeps/mips/bits/endian.h
* sysdeps/nios2/bits/endian.h
* sysdeps/powerpc/bits/endian.h
* sysdeps/riscv/bits/endian.h
* sysdeps/s390/bits/endian.h
* sysdeps/sh/bits/endian.h
* sysdeps/sparc/bits/endian.h
* sysdeps/x86/bits/endian.h:
Rename to endianness.h; canonicalize form of file; remove
redundant definitions of __FLOAT_WORD_ORDER.
* sysdeps/powerpc/bits/endianness.h: Remove logic to check for
broken compilers.
* ctype/ctype.h
* sysdeps/aarch64/nptl/bits/pthreadtypes-arch.h
* sysdeps/arm/nptl/bits/pthreadtypes-arch.h
* sysdeps/csky/nptl/bits/pthreadtypes-arch.h
* sysdeps/ia64/ieee754.h
* sysdeps/ieee754/ieee754.h
* sysdeps/ieee754/ldbl-128/ieee754.h
* sysdeps/ieee754/ldbl-128ibm/ieee754.h
* sysdeps/m68k/nptl/bits/pthreadtypes-arch.h
* sysdeps/microblaze/nptl/bits/pthreadtypes-arch.h
* sysdeps/mips/ieee754/ieee754.h
* sysdeps/mips/nptl/bits/pthreadtypes-arch.h
* sysdeps/nios2/nptl/bits/pthreadtypes-arch.h
* sysdeps/nptl/pthread.h
* sysdeps/riscv/nptl/bits/pthreadtypes-arch.h
* sysdeps/sh/nptl/bits/pthreadtypes-arch.h
* sysdeps/sparc/sparc32/ieee754.h
* sysdeps/unix/sysv/linux/generic/bits/stat.h
* sysdeps/unix/sysv/linux/generic/bits/statfs.h
* sysdeps/unix/sysv/linux/sys/acct.h
* wctype/bits/wctype-wchar.h:
Include bits/endian.h, not endian.h.
* sysdeps/unix/sysv/linux/hppa/pthread.h: Don’t include endian.h.
* sysdeps/mips/ieee754/ieee754.h: Use __LDBL_MANT_DIG__
in ifdefs, instead of LDBL_MANT_DIG. Only include float.h
when __LDBL_MANT_DIG__ is not predefined, in which case
define __LDBL_MANT_DIG__ to equal LDBL_MANT_DIG.
The fix for BZ#18231 requires new symbols only for armeb. This patch
adds the required folder and files for both BE and LE abilist. No
semantic changes are expected.
Checked with check-abi for arm-linux-gnueabihf and armeb-linux-gnueabihf.
* sysdeps/arm/preconfigure.ac: Set machine based on endianness.
* sysdeps/arm/preconfigure: Regenerate.
* sysdeps/arm/be/Implies: New file.
* sysdeps/arm/be/armv6/Implies: Likewise.
* sysdeps/arm/be/armv6t2/Implies: Likewise.
* sysdeps/arm/be/armv7/Implies: Likewise.
* sysdeps/arm/le/Implies: Likewise.
* sysdeps/unix/sysv/linux/arm/be/Implies: Likewise.
* sysdeps/unix/sysv/linux/arm/le/Implies: Likewise.
* sysdeps/unix/sysv/linux/arm/*.abilist: Move to
sysdeps/unix/sysv/linux/arm/le/*.abilist.
* sysdeps/unix/sysv/linux/arm/be/l*.abilist: New files.
With the default "nor" constraint, current GCC will use the "o"
constraint for constants, after emitting the constant to memory. That
results in unparseable Systemtap probe notes such as "-4@.L1052".
Removing the "o" alternative and using "nr" instead avoids this.
Similar to the x86_64 build issues, glibc fails to build for armv7
with current mainline GCC because of aliases declared in the course of
defining IFUNCs, which copy their attributes from a header
declaration, ending up with fewer attributes than the (built-in)
string function they alias: the relevant attributes (nonnull, leaf)
are present on the header declaration, but elided therefrom when glibc
itself if being built (whatever the reasons are for disabling the
nonnull and leaf attributes in that case, and whether or not those
reasons are actually still valid). This patch fixes the issue
similarly to the x86_64 fix, by adding an addition __attribute_copy__
use (in this case, on the definition of arm_libc_ifunc_hidden_def).
Tested with build-many-glibcs.py build for armeb-linux-gnueabi-be8.
* sysdeps/arm/arm-ifunc.h [SHARED] (arm_libc_ifunc_hidden_def):
Use __attribute_copy__ to copy attributes from name.
Optimized exp and exp2 implementations using a lookup table for
fractional powers of 2. There are several variants, see e_exp_data.c,
they can be selected by modifying math_config.h allowing different
tradeoffs.
The default selection should be acceptable as generic libm code.
Worst case error is 0.509 ULP for exp and 0.507 ULP for exp2, on
aarch64 the rodata size is 2160 bytes, shared between exp and exp2.
On aarch64 .text + .rodata size decreased by 24912 bytes.
The non-nearest rounding error is less than 1 ULP even on targets
without efficient round implementation (although the error rate is
higher in that case). Targets with single instruction, rounding mode
independent, to nearest integer rounding and conversion can use them
by setting TOINT_INTRINSICS and adding the necessary code to their
math_private.h.
The __exp1 code uses the same algorithm, so the error bound of pow
increased a bit.
New double precision error handling code was added following the
style of the single precision error handling code.
Improvements on Cortex-A72 compared to current glibc master:
exp thruput: 1.61x in [-9.9 9.9]
exp latency: 1.53x in [-9.9 9.9]
exp thruput: 1.13x in [0.5 1]
exp latency: 1.30x in [0.5 1]
exp2 thruput: 2.03x in [-9.9 9.9]
exp2 latency: 1.64x in [-9.9 9.9]
For small (< 1) inputs the current exp code uses a separate algorithm
so the speed up there is less.
Was tested on
aarch64-linux-gnu (TOINT_INTRINSICS, fma contraction) and
arm-linux-gnueabihf (!TOINT_INTRINSICS, no fma contraction) and
x86_64-linux-gnu (!TOINT_INTRINSICS, no fma contraction) and
powerpc64le-linux-gnu (!TOINT_INTRINSICS, fma contraction) targets,
only non-nearest rounding ulp errors increase and they are within
acceptable bounds (ulp updates are in separate patches).
* NEWS: Mention exp and exp2 improvements.
* math/Makefile (libm-support): Remove t_exp.
(type-double-routines): Add math_err and e_exp_data.
* sysdeps/aarch64/libm-test-ulps: Update.
* sysdeps/arm/libm-test-ulps: Update.
* sysdeps/i386/fpu/e_exp_data.c: New file.
* sysdeps/i386/fpu/math_err.c: New file.
* sysdeps/i386/fpu/t_exp.c: Remove.
* sysdeps/ia64/fpu/e_exp_data.c: New file.
* sysdeps/ia64/fpu/math_err.c: New file.
* sysdeps/ia64/fpu/t_exp.c: Remove.
* sysdeps/ieee754/dbl-64/e_exp.c: Rewrite.
* sysdeps/ieee754/dbl-64/e_exp2.c: Rewrite.
* sysdeps/ieee754/dbl-64/e_exp_data.c: New file.
* sysdeps/ieee754/dbl-64/e_pow.c (__ieee754_pow): Update error bound.
* sysdeps/ieee754/dbl-64/eexp.tbl: Remove.
* sysdeps/ieee754/dbl-64/math_config.h: New file.
* sysdeps/ieee754/dbl-64/math_err.c: New file.
* sysdeps/ieee754/dbl-64/t_exp.c: Remove.
* sysdeps/ieee754/dbl-64/t_exp2.h: Remove.
* sysdeps/ieee754/dbl-64/uexp.h: Remove.
* sysdeps/ieee754/dbl-64/uexp.tbl: Remove.
* sysdeps/m68k/m680x0/fpu/e_exp_data.c: New file.
* sysdeps/m68k/m680x0/fpu/math_err.c: New file.
* sysdeps/m68k/m680x0/fpu/t_exp.c: Remove.
* sysdeps/powerpc/fpu/libm-test-ulps: Update.
* sysdeps/x86_64/fpu/libm-test-ulps: Update.
On some architectures, the parts of math_private.h relating to the
floating-point environment are in a separate file fenv_private.h
included from math_private.h. As this is purely an
architecture-specific convention used by several architectures,
however, all such architectures still need their own math_private.h,
even if it has nothing to do beyond #include <fenv_private.h> and
peculiarity of including the i386 file directly instead of having a
shared file in sysdeps/x86.
This patch makes the fenv_private.h name an architecture-independent
convention in glibc. The include of fenv_private.h from
math_private.h becomes architecture-independent (until callers are
updated to include fenv_private.h directly so the include from
math_private.h is no longer needed). Some architecture math_private.h
headers are removed if no longer needed, or renamed to fenv_private.h
if all they define belongs in that header; architecture fenv_private.h
headers now do require #include_next <fenv_private.h>. The i386
fenv_private.h file moves to sysdeps/x86/fpu/ to reflect how it is
actually shared with x86_64. The generic math_private.h gets a new
include of <stdbool.h>, as needed for bool in some prototypes in that
header (previously that was indirectly included via include/fenv.h,
which now only gets included too late in math_private.h, after those
prototypes).
Tested for x86_64 and x86, and tested with build-many-glibcs.py that
installed stripped shared libraries are unchanged by the patch.
* sysdeps/aarch64/fpu/fenv_private.h: New file. Based on ....
* sysdeps/aarch64/fpu/math_private.h: ... this file. All contents
moved to fenv_private.h except for ...
(TOINT_INTRINSICS): Kept in math_private.h.
(roundtoint): Likewise.
(converttoint): Likewise.
* sysdeps/arm/fenv_private.h: Change multiple-include guard to
[ARM_FENV_PRIVATE_H]. Include next <fenv_private.h>.
* sysdeps/arm/math_private.h: Remove.
* sysdeps/generic/fenv_private.h: New file. Contents moved from
....
* sysdeps/generic/math_private.h: ... this file. Include
<stdbool.h>. Do not include <fenv.h> or <get-rounding-mode.h>.
Include <fenv_private.h>. Remove functions and macros moved to
fenv_private.h.
* sysdeps/i386/fpu/math_private.h: Remove.
* sysdeps/mips/math_private.h: Move to ....
* sysdeps/mips/fpu/fenv_private.h: ... here. Change
multiple-include guard to [MIPS_FENV_PRIVATE_H]. Remove
[__mips_hard_float] conditional. Include next <fenv_private.h>.
* sysdeps/powerpc/fpu/fenv_private.h: Change multiple-include
guard to [POWERPC_FENV_PRIVATE_H]. Include next <fenv_private.h>.
* sysdeps/powerpc/fpu/math_private.h: Do not include
<fenv_private.h>.
* sysdeps/riscv/rvf/math_private.h: Move to ....
* sysdeps/riscv/rvf/fenv_private.h: ... here. Change
multiple-include guard to [RISCV_FENV_PRIVATE_H]. Include next
<fenv_private.h>.
* sysdeps/sparc/fpu/fenv_private.h: Change multiple-include guard
to [SPARC_FENV_PRIVATE_H]. Include next <fenv_private.h>.
* sysdeps/sparc/fpu/math_private.h: Remove.
* sysdeps/i386/fpu/fenv_private.h: Move to ....
* sysdeps/x86/fpu/fenv_private.h: ... here. Change
multiple-include guard to [X86_FENV_PRIVATE_H]. Include next
<fenv_private.h>.
* sysdeps/x86_64/fpu/math_private.h: Do not include
<sysdeps/i386/fpu/fenv_private.h>.
Continuing moving macros out of math-tests.h to smaller headers
following typo-proof conventions instead of using #ifndef, this patch
moves the EXCEPTION_ENABLE_SUPPORTED macro out to its own
math-tests-trap.h header.
Tested with build-many-glibcs.py.
* sysdeps/generic/math-tests-trap.h: New file.
* sysdeps/generic/math-tests.h: Include <math-tests-trap.h>.
(EXCEPTION_ENABLE_SUPPORTED): Do not define here.
* sysdeps/aarch64/math-tests.h: Remove file.
* sysdeps/arm/math-tests.h: Likewise.
* sysdeps/riscv/math-tests.h: Likewise.
* sysdeps/aarch64/math-tests-trap.h: New file.
* sysdeps/arm/math-tests-trap.h: Likewise.
* sysdeps/riscv/math-tests-trap.h: Likewise.
Continuing moving macros out of math-tests.h to smaller headers
following typo-proof conventions instead of using #ifndef, this patch
moves the EXCEPTION_TESTS_* macros for individual types out to their
own sysdeps header.
As with ROUNDING_TESTS_*, there is no need to define these macros if
FE_ALL_EXCEPT == 0 and the individual exception macros are undefined;
thus, math-tests-exceptions.h headers are only needed for soft-float
ARM and RISC-V, while the other cases that defined these macros do not
need to do so (and the associated math-tests.h headers are thus
removed without needing replacement by math-tests-exceptions.h
headers).
Tested with build-many-glibcs.py.
* sysdeps/generic/math-tests-exceptions.h: New file.
* sysdeps/generic/math-tests.h: Include <math-tests-exceptions.h>.
(EXCEPTION_TESTS_float): Do not define here.
(EXCEPTION_TESTS_double): Likewise.
(EXCEPTION_TESTS_long_double): Likewise.
(EXCEPTION_TESTS_float128): Likewise.
* sysdeps/arm/math-tests.h [__SOFTFP__] (EXCEPTION_TESTS_float):
Likewise.
[__SOFTFP__] (EXCEPTION_TESTS_double): Likewise.
[__SOFTFP__] (EXCEPTION_TESTS_long_double): Likewise.
* sysdeps/arm/nofpu/math-tests-exceptions.h: New file.
* sysdeps/m68k/coldfire/math-tests.h: Remove file.
* sysdeps/mips/math-tests.h: Likewise.
* sysdeps/nios2/math-tests.h: Likewise.
* sysdeps/riscv/math-tests.h [!__riscv_flen]
(EXCEPTION_TESTS_float): Do not define here.
[!__riscv_flen] (EXCEPTION_TESTS_double): Likewise.
[!__riscv_flen] (EXCEPTION_TESTS_long_double): Likewise.
* sysdeps/riscv/nofpu/math-tests-exceptions.h: New file.
Continuing moving macros out of math-tests.h to smaller headers
following typo-proof conventions instead of using #ifndef, this patch
moves the ROUNDING_TESTS_* macros for individual types out to their
own sysdeps header.
In the soft-float case where FE_TONEAREST is the only rounding mode
macro defined, there is no need to define ROUNDING_TESTS_*; it is only
necessary when rounding modes macros are defined that may not be
supported at runtime. Thus, the ROUNDING_TESTS_* definitions for some
configurations are just removed, not moved to new
math-tests-rounding.h headers; the only architectures needing
math-tests-rounding.h are those where the macros are defined in
bits/fenv.h because of the possibility of a soft-float compilation
using a hard-float glibc with the same ABI (i.e., ARM and RISC-V).
The test-*-vlen*.h headers, by using #undef, do not yet follow
typo-proof conventions (but they no longer implicitly rely on being
included before math-tests.h, and this area can always be cleaned up
further in future).
Tested with build-many-glibcs.py.
* sysdeps/generic/math-tests-rounding.h: New file.
* sysdeps/generic/math-tests.h: Include <math-tests-rounding.h>.
(ROUNDING_TESTS_float): Do not define here.
(ROUNDING_TESTS_double): Likewise.
(ROUNDING_TESTS_long_double): Likewise.
(ROUNDING_TESTS_float128): Likewise.
* math/test-double-vlen2.h: Include <math-tests-rounding.h>.
(ROUNDING_TESTS_double): Undefine before defining.
* math/test-double-vlen4.h: Include <math-tests-rounding.h>.
(ROUNDING_TESTS_double): Undefine before defining.
* math/test-double-vlen8.h: Include <math-tests-rounding.h>.
(ROUNDING_TESTS_double): Undefine before defining.
* math/test-float-vlen16.h: Include <math-tests-rounding.h>.
(ROUNDING_TESTS_float): Undefine before defining.
* math/test-float-vlen4.h: Include <math-tests-rounding.h>.
(ROUNDING_TESTS_float): Undefine before defining.
* math/test-float-vlen8.h: Include <math-tests-rounding.h>.
(ROUNDING_TESTS_float): Undefine before defining.
* sysdeps/arm/nofpu/math-tests-rounding.h: New file.
* sysdeps/arm/math-tests.h [__SOFTFP__] (ROUNDING_TESTS_float): Do
not define here.
[__SOFTFP__] (ROUNDING_TESTS_double): Likewise.
[__SOFTFP__] (ROUNDING_TESTS_long_double): Likewise.
* sysdeps/riscv/nofpu/math-tests-rounding.h: New file.
* sysdeps/riscv/math-tests.h [!__riscv_flen]
(ROUNDING_TESTS_float): Do not define here.
[!__riscv_flen] (ROUNDING_TESTS_double): Likewise.
[!__risv_flen] (ROUNDING_TESTS_long_double): Likewise.
* sysdeps/m68k/coldfire/math-tests.h [!__mcffpu__]
(ROUNDING_TESTS_float): Likewise.
[!__mcffpu__] (ROUNDING_TESTS_double): Likewise.
[!__mcffpu__] (ROUNDING_TESTS_long_double): Likewise.
* sysdeps/mips/math-tests.h [__mips_soft_float]
(ROUNDING_TESTS_float): Likewise.
[__mips_soft_float] (ROUNDING_TESTS_double): Likewise.
[__mips_soft_float] (ROUNDING_TESTS_long_double): Likewise.
* sysdeps/nios2/math-tests.h (ROUNDING_TESTS_float): Likewise.
(ROUNDING_TESTS_double): Likewise.
(ROUNDING_TESTS_long_double): Likewise.
_init and _fini are special functions provided by glibc for linker to
define DT_INIT and DT_FINI in executable and shared library. They
should never be put in dynamic symbol table. This patch marks them as
hidden to remove them from dynamic symbol table.
Tested with build-many-glibcs.py.
[BZ #23145]
* elf/Makefile (tests-special): Add $(objpfx)check-initfini.out.
($(all-built-dso:=.dynsym): New target.
(common-generated): Add $(all-built-dso:$(common-objpfx)%=%.dynsym).
($(objpfx)check-initfini.out): New target.
(generated): Add check-initfini.out.
* scripts/check-initfini.awk: New file.
* sysdeps/aarch64/crti.S (_init): Mark as hidden.
(_fini): Likewise.
* sysdeps/alpha/crti.S (_init): Mark as hidden.
(_fini): Likewise.
* sysdeps/arm/crti.S (_init): Mark as hidden.
(_fini): Likewise.
* sysdeps/hppa/crti.S (_init): Mark as hidden.
(_fini): Likewise.
* sysdeps/i386/crti.S (_init): Mark as hidden.
(_fini): Likewise.
* sysdeps/ia64/crti.S (_init): Mark as hidden.
(_fini): Likewise.
* sysdeps/m68k/crti.S (_init): Mark as hidden.
(_fini): Likewise.
* sysdeps/microblaze/crti.S (_init): Mark as hidden.
(_fini): Likewise.
* sysdeps/mips/mips32/crti.S (_init): Mark as hidden.
(_fini): Likewise.
* sysdeps/mips/mips64/n32/crti.S (_init): Mark as hidden.
(_fini): Likewise.
* sysdeps/mips/mips64/n64/crti.S (_init): Mark as hidden.
(_fini): Likewise.
* sysdeps/nios2/crti.S (_init): Mark as hidden.
(_fini): Likewise.
* sysdeps/powerpc/powerpc32/crti.S (_init): Mark as hidden.
(_fini): Likewise.
* sysdeps/powerpc/powerpc64/crti.S (_init): Mark as hidden.
(_fini): Likewise.
* sysdeps/s390/s390-32/crti.S (_init): Mark as hidden.
(_fini): Likewise.
* sysdeps/s390/s390-64/crti.S (_init): Mark as hidden.
(_fini): Likewise.
* sysdeps/sh/crti.S (_init): Mark as hidden.
(_fini): Likewise.
* sysdeps/sparc/crti.S (_init): Mark as hidden.
(_fini): Likewise.
* sysdeps/x86_64/crti.S (_init): Mark as hidden.
(_fini): Likewise.
This patch removes the ununsed ARM code path for armv6t2 memchr and
strlen and armv7 memch and strcmp. In all implementation, the ARM
code is not used in any possible build (unless glibc is explicit
build with the non-documented NO_THUMB compiler flag) and for armv7
the resulting code either produces wrong results (memchr) and throw
build error (strcmp).
Checked on arm-linux-gnueabihf built targeting both armv6 and
armv7.
* sysdeps/arm/armv6t2/memchr.S (memchr): Remove ARM code path.
* sysdeps/arm/armv6t2/strlen.S (memchr): Likewise.
* sysdeps/arm/armv7/multiarch/memchr_neon.S (memchr): Likewise.
* sysdeps/arm/armv7/strcmp.S (strcmp): Likewise.
Wrap symbol address run-time calculation into a macro and use it
throughout, replacing inline calculations.
There are a couple of variants, most of them different in a functionally
insignificant way. Most calculations are right following RESOLVE_MAP,
at which point either the map or the symbol returned can be checked for
validity as the macro sets either both or neither. In some places both
the symbol and the map has to be checked however.
My initial implementation therefore always checked both, however that
resulted in code larger by as much as 0.3%, as many places know from
elsewhere that no check is needed. I have decided the size growth was
unacceptable.
Having looked closer I realized that it's the map that is the culprit.
Therefore I have modified LOOKUP_VALUE_ADDRESS to accept an additional
boolean argument telling it to access the map without checking it for
validity. This in turn has brought quite nice results, with new code
actually being smaller for i686, and MIPS o32, n32 and little-endian n64
targets, unchanged in size for x86-64 and, unusually, marginally larger
for big-endian MIPS n64, as follows:
i686:
text data bss dec hex filename
152255 4052 192 156499 26353 ld-2.27.9000-base.so
152159 4052 192 156403 262f3 ld-2.27.9000-elf-symbol-value.so
MIPS/o32/el:
text data bss dec hex filename
142906 4396 260 147562 2406a ld-2.27.9000-base.so
142890 4396 260 147546 2405a ld-2.27.9000-elf-symbol-value.so
MIPS/n32/el:
text data bss dec hex filename
142267 4404 260 146931 23df3 ld-2.27.9000-base.so
142171 4404 260 146835 23d93 ld-2.27.9000-elf-symbol-value.so
MIPS/n64/el:
text data bss dec hex filename
149835 7376 408 157619 267b3 ld-2.27.9000-base.so
149787 7376 408 157571 26783 ld-2.27.9000-elf-symbol-value.so
MIPS/o32/eb:
text data bss dec hex filename
142870 4396 260 147526 24046 ld-2.27.9000-base.so
142854 4396 260 147510 24036 ld-2.27.9000-elf-symbol-value.so
MIPS/n32/eb:
text data bss dec hex filename
142019 4404 260 146683 23cfb ld-2.27.9000-base.so
141923 4404 260 146587 23c9b ld-2.27.9000-elf-symbol-value.so
MIPS/n64/eb:
text data bss dec hex filename
149763 7376 408 157547 2676b ld-2.27.9000-base.so
149779 7376 408 157563 2677b ld-2.27.9000-elf-symbol-value.so
x86-64:
text data bss dec hex filename
148462 6452 400 155314 25eb2 ld-2.27.9000-base.so
148462 6452 400 155314 25eb2 ld-2.27.9000-elf-symbol-value.so
[BZ #19818]
* sysdeps/generic/ldsodefs.h (LOOKUP_VALUE_ADDRESS): Add `set'
parameter.
(SYMBOL_ADDRESS): New macro.
[!ELF_FUNCTION_PTR_IS_SPECIAL] (DL_SYMBOL_ADDRESS): Use
SYMBOL_ADDRESS for symbol address calculation.
* elf/dl-runtime.c (_dl_fixup): Likewise.
(_dl_profile_fixup): Likewise.
* elf/dl-symaddr.c (_dl_symbol_address): Likewise.
* elf/rtld.c (dl_main): Likewise.
* sysdeps/aarch64/dl-machine.h (elf_machine_rela): Likewise.
* sysdeps/alpha/dl-machine.h (elf_machine_rela): Likewise.
* sysdeps/arm/dl-machine.h (elf_machine_rel): Likewise.
(elf_machine_rela): Likewise.
* sysdeps/hppa/dl-machine.h (elf_machine_rela): Likewise.
* sysdeps/hppa/dl-symaddr.c (_dl_symbol_address): Likewise.
* sysdeps/i386/dl-machine.h (elf_machine_rel): Likewise.
(elf_machine_rela): Likewise.
* sysdeps/ia64/dl-machine.h (elf_machine_rela): Likewise.
* sysdeps/m68k/dl-machine.h (elf_machine_rela): Likewise.
* sysdeps/microblaze/dl-machine.h (elf_machine_rela): Likewise.
* sysdeps/mips/dl-machine.h (ELF_MACHINE_BEFORE_RTLD_RELOC):
Likewise.
(elf_machine_reloc): Likewise.
(elf_machine_got_rel): Likewise.
* sysdeps/mips/dl-trampoline.c (__dl_runtime_resolve): Likewise.
* sysdeps/nios2/dl-machine.h (elf_machine_rela): Likewise.
* sysdeps/powerpc/powerpc32/dl-machine.h (elf_machine_rela):
Likewise.
* sysdeps/powerpc/powerpc64/dl-machine.h (elf_machine_rela):
Likewise.
* sysdeps/riscv/dl-machine.h (elf_machine_rela): Likewise.
* sysdeps/s390/s390-32/dl-machine.h (elf_machine_rela):
Likewise.
* sysdeps/s390/s390-64/dl-machine.h (elf_machine_rela):
Likewise.
* sysdeps/sh/dl-machine.h (elf_machine_rela): Likewise.
* sysdeps/sparc/sparc32/dl-machine.h (elf_machine_rela):
Likewise.
* sysdeps/sparc/sparc64/dl-machine.h (elf_machine_rela):
Likewise.
* sysdeps/tile/dl-machine.h (elf_machine_rela): Likewise.
* sysdeps/x86_64/dl-machine.h (elf_machine_rela): Likewise.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The general rule in glibc is that it's better for a macro to be always
defined, and tested with #if, than for it to be tested with #ifdef,
because the latter is prone to typos in the macro name as well as to
the header with the macro accidentally not being included in a file
testing it. (Testing with an "if" statement is even better, in those
cases where it's possible to do things that way, as it then means both
cases in the code get checked for syntax in glibc builds with either
value of the condition.)
math_private.h has several different groups of macros, meaning that
architectures wanting to override some of them need to define those
then include the generic version, which then defines macros if not
already defined. It's hard to avoid that arrangement completely, but
various cases can be improved by splitting out macros or groups of
macros into separate files.
This patch splits out the LDBL_CLASSIFY_COMPAT macro into a separate
ldbl-classify-compat.h header. This macro is tested with #ifdef; this
patch changes it to testing with #if, with a default definition to 0
in the generic header and then architecture-specific headers defining
it to 1.
Tested with build-many-glibcs.py that installed stripped shared
libraries are unchanged by the patch.
* sysdeps/generic/ldbl-classify-compat.h: New file.
* sysdeps/arm/ldbl-classify-compat.h: Likewise.
* sysdeps/m68k/coldfire/ldbl-classify-compat.h: Likewise.
* sysdeps/microblaze/ldbl-classify-compat.h: Likewise.
* sysdeps/mips/ldbl-classify-compat.h: Likewise.
* sysdeps/nios2/ldbl-classify-compat.h: Likewise.
* sysdeps/sh/ldbl-classify-compat.h: Likewise.
* sysdeps/ieee754/dbl-64/s_finite.c: Include
<ldbl-classify-compat.h>.
[LDBL_CLASSIFY_COMPAT]: Test value, not whether defined.
* sysdeps/ieee754/dbl-64/s_isinf.c: Include
<ldbl-classify-compat.h>.
[LDBL_CLASSIFY_COMPAT]: Test value, not whether defined.
* sysdeps/ieee754/dbl-64/s_isnan.c: Include
<ldbl-classify-compat.h>.
[LDBL_CLASSIFY_COMPAT]: Test value, not whether defined.
* sysdeps/ieee754/dbl-64/wordsize-64/s_finite.c: Include
<ldbl-classify-compat.h>.
[LDBL_CLASSIFY_COMPAT]: Test value, not whether defined.
* sysdeps/ieee754/dbl-64/wordsize-64/s_isinf.c: Include
<ldbl-classify-compat.h>.
[LDBL_CLASSIFY_COMPAT]: Test value, not whether defined.
* sysdeps/ieee754/dbl-64/wordsize-64/s_isnan.c: Include
<ldbl-classify-compat.h>.
[LDBL_CLASSIFY_COMPAT]: Test value, not whether defined.
* sysdeps/arm/math_private.h (LDBL_CLASSIFY_COMPAT): Remove macro.
* sysdeps/mips/math_private.h (LDBL_CLASSIFY_COMPAT): Likewise.
* sysdeps/m68k/coldfire/math_private.h: Remove file.
* sysdeps/microblaze/math_private.h: Likewise.
* sysdeps/nios2/math_private.h: Likewise.
* sysdeps/sh/math_private.h: Likewise.
The default sysdeps/ieee754 fma implementations rely on exceptions and
rounding modes to achieve correct results through internal use of
round-to-odd. Thus, glibc configurations without support for
exceptions and rounding modes instead need to use implementations of
fma based on soft-fp.
At present, this is achieved via having implementation files in
soft-fp/ that are #included by sysdeps files for each glibc
configuration that needs them. In general this means such a
configuration has its own s_fma.c and s_fmaf.c.
TS 18661-1 adds functions that do an operation (+ - * / sqrt fma) on
arguments wider than the return type, with a single rounding of the
infinite-precision result to that return type. These are also
naturally implemented using round-to-odd on platforms with hardware
support for rounding modes and exceptions but lacking hardware support
for these narrowing operations themselves. (Platforms that have
direct hardware support for such narrowing operations include at least
ia64, and Power ISA 2.07 or later, which I think means POWER8 or
later.)
So adding the remaining TS 18661-1 functions would mean at least six
narrowing function implementations (fadd fsub fmul fdiv ffma fsqrt),
with aliases for other types and further implementations in some
configurations, that need to be overridden for configurations lacking
hardware exceptions and rounding modes. Requiring all such
configurations (currently seven of them) to have their own source
files for all those functions seems undesirable.
Thus, this patch adds a directory sysdeps/ieee754/soft-fp to contain
libm function implementations based on soft-fp. This directory is
then used via Implies from all the configurations that need it, so no
more files need adding to every such configuration when adding more
functions with soft-fp implementations. A configuration can still
selectively #include a particular file from this directory if desired;
thus, the MIPS #include of the fmal implementation is retained, since
that's appropriate even for hard float (because long double is always
implementated in software for MIPS64, so the soft-fp implementation of
fmal is better than the ldbl-128 one).
This also provides additional motivation for my recent patch removing
--with-fp / --without-fp: previously there was no need for correct use
of --without-fp for no-FPU ARM or SH3, and now we have autodetection
nofpu/ sysdeps directories can be used by this patch for those
configurations without imposing any new requirements on how glibc is
configured.
(The mips64/*/fpu/s_fma.c files added by this patch are needed to keep
the dbl-64 version of fma for double, rather than the ldbl-128 one,
used in that case.)
Tested with build-many-glibcs.py that installed stripped shared
libraries are unchanged by this patch.
* soft-fp/fmadf4.c: Move to ....
* sysdeps/ieee754/soft-fp/s_fma.c: ... here.
* soft-fp/fmasf4.c: Move to ....
* sysdeps/ieee754/soft-fp/s_fmaf.c: ... here.
* soft-fp/fmatf4.c: Move to ....
* sysdeps/ieee754/soft-fp/s_fmal.c: ... here.
* sysdeps/ieee754/soft-fp/Makefile: New file.
* sysdeps/arm/preconfigure.ac: Define with_fp_cond.
* sysdeps/arm/preconfigure: Regenerated.
* sysdeps/arm/nofpu/Implies: New file.
* sysdeps/arm/s_fma.c: Remove file.
* sysdeps/arm/s_fmaf.c: Likewise.
* sysdeps/m68k/coldfire/nofpu/Implies: New file.
* sysdeps/m68k/coldfire/nofpu/s_fma.c: Remove file.
* sysdeps/m68k/coldfire/nofpu/s_fmaf.c: Likewise.
* sysdeps/microblaze/Implies: Add ieee754/soft-fp.
* sysdeps/microblaze/s_fma.c: Remove file.
* sysdeps/microblaze/s_fmaf.c: Likewise.
* sysdeps/mips/mips32/nofpu/Implies: New file.
* sysdeps/mips/mips64/n32/fpu/s_fma.c: Likewise.
* sysdeps/mips/mips64/n32/nofpu/Implies: Likewise.
* sysdeps/mips/mips64/n64/fpu/s_fma.c: Likewise.
* sysdeps/mips/mips64/n64/nofpu/Implies: Likewise.
* sysdeps/mips/ieee754/s_fma.c: Remove file.
* sysdeps/mips/ieee754/s_fmaf.c: Likewise.
* sysdeps/mips/ieee754/s_fmal.c: Update include for move of fmal
implementation.
* sysdeps/nios2/Implies: Add ieee754/soft-fp.
* sysdeps/nios2/s_fma.c: Remove file.
* sysdeps/nios2/s_fmaf.c: Likewise.
* sysdeps/sh/nofpu/Implies: New file.
* sysdeps/sh/s_fma.c: Remove file.
* sysdeps/sh/s_fmaf.c: Likewise.
* sysdeps/tile/Implies: Add ieee754/soft-fp.
* sysdeps/tile/s_fma.c: Remove file.
* sysdeps/tile/s_fmaf.c: Likewise.