It improve fortify checks for read, pread, pread64, readlink,
readlinkat, getcwd, getwd, confstr, getgroups, ttyname_r, getlogin_r,
gethostname, and getdomainname. The compile and runtime checks have
similar coverage as with GCC.
Checked on aarch64, armhf, x86_64, and i686.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
It improve fortify checks for realpath, ptsname_r, wctomb, mbstowcs,
and wcstombs. The runtime and compile checks have similar coverage as
with GCC.
Checked on aarch64, armhf, x86_64, and i686.
Tested-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
It improve fortify checks for strcpy, stpcpy, strncpy, stpncpy, strcat,
strncat, strlcpy, and strlcat. The runtime and compile checks have
similar coverage as with GCC.
Checked on aarch64, armhf, x86_64, and i686.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
It improve fortify checks for sprintf, vsprintf, vsnsprintf, fprintf,
dprintf, asprintf, __asprintf, obstack_printf, gets, fgets,
fgets_unlocked, fread, and fread_unlocked. The runtime checks have
similar support coverage as with GCC.
For function with variadic argument (sprintf, snprintf, fprintf, printf,
dprintf, asprintf, __asprintf, obstack_printf) the fortify wrapper calls
the va_arg version since clang does not support __va_arg_pack.
Checked on aarch64, armhf, x86_64, and i686.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
For instance, the read wrapper is currently expanded as:
extern __inline
__attribute__((__always_inline__))
__attribute__((__artificial__))
__attribute__((__warn_unused_result__))
ssize_t read (int __fd, void *__buf, size_t __nbytes)
{
return __glibc_safe_or_unknown_len (__nbytes,
sizeof (char),
__glibc_objsize0 (__buf))
? __read_alias (__fd, __buf, __nbytes)
: __glibc_unsafe_len (__nbytes,
sizeof (char),
__glibc_objsize0 (__buf))
? __read_chk_warn (__fd,
__buf,
__nbytes,
__builtin_object_size (__buf, 0))
: __read_chk (__fd,
__buf,
__nbytes,
__builtin_object_size (__buf, 0));
}
The wrapper relies on __builtin_object_size call lowers to a constant at
compile-time and many other operations in the wrapper depends on
having a single, known value for parameters. Because this is
impossible to have for function parameters, the wrapper depends heavily
on inlining to work and While this is an entirely viable approach on
GCC, it is not fully reliable on clang. This is because by the time llvm
gets to inlining and optimizing, there is a minimal reliable source and
type-level information available (more information on a more deep
explanation on how to fortify wrapper works on clang [1]).
To allow the wrapper to work reliably and with the same functionality as
with GCC, clang requires a different approach:
* __attribute__((diagnose_if(c, “str”, “warning”))) which is a function
level attribute; if the compiler can determine that 'c' is true at
compile-time, it will emit a warning with the text 'str1'. If it would
be better to emit an error, the wrapper can use "error" instead of
"warning".
* __attribute__((overloadable)) which is also a function-level attribute;
and it allows C++-style overloading to occur on C functions.
* __attribute__((pass_object_size(n))) which is a parameter-level
attribute; and it makes the compiler evaluate
__builtin_object_size(param, n) at each call site of the function
that has the parameter, and passes it in as a hidden parameter.
This attribute has two side-effects that are key to how FORTIFY works:
1. It can overload solely on pass_object_size (e.g. there are two
overloads of foo in
void foo(char * __attribute__((pass_object_size(0))) c);
void foo(char *);
(The one with pass_object_size attribute has precende over the
default one).
2. A function with at least one pass_object_size parameter can never
have its address taken (and overload resolution respects this).
Thus the read wrapper can be implemented as follows, without
hindering any fortify coverage compile and runtime:
extern __inline
__attribute__((__always_inline__))
__attribute__((__artificial__))
__attribute__((__overloadable__))
__attribute__((__warn_unused_result__))
ssize_t read (int __fd,
void *const __attribute__((pass_object_size (0))) __buf,
size_t __nbytes)
__attribute__((__diagnose_if__ ((((__builtin_object_size (__buf, 0)) != -1ULL
&& (__nbytes) > (__builtin_object_size (__buf, 0)) / (1))),
"read called with bigger length than size of the destination buffer",
"warning")))
{
return (__builtin_object_size (__buf, 0) == (size_t) -1)
? __read_alias (__fd,
__buf,
__nbytes)
: __read_chk (__fd,
__buf,
__nbytes,
__builtin_object_size (__buf, 0));
}
To avoid changing the current semantic for GCC, a set of macros is
defined to enable the clang required attributes, along with some changes
on internal macros to avoid the need to issue the symbol_chk symbols
(which are done through the __diagnose_if__ attribute for clang).
The read wrapper is simplified as:
__fortify_function __attribute_overloadable__ __wur
ssize_t read (int __fd,
__fortify_clang_overload_arg0 (void *, ,__buf),
size_t __nbytes)
__fortify_clang_warning_only_if_bos0_lt (__nbytes, __buf,
"read called with bigger length than "
"size of the destination buffer")
{
return __glibc_fortify (read, __nbytes, sizeof (char),
__glibc_objsize0 (__buf),
__fd, __buf, __nbytes);
}
There is no expected semantic or code change when using GCC.
Also, clang does not support __va_arg_pack, so variadic functions are
expanded to call va_arg implementations. The error function must not
have bodies (address takes are expanded to nonfortified calls), and
with the __fortify_function compiler might still create a body with the
C++ mangling name (due to the overload attribute). In this case, the
function is defined with __fortify_function_error_function macro
instead.
[1] https://docs.google.com/document/d/1DFfZDICTbL7RqS74wJVIJ-YnjQOj1SaoqfhbgddFYSM/edit
Checked on aarch64, armhf, x86_64, and i686.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
This includes a fix for big-endian in AdvSIMD log, some cosmetic
changes, and numerous small optimisations mainly around inlining and
using indexed variants of MLA intrinsics.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Starting with commit e57d8fc97b
"S390: Always use svc 0"
clone clobbers the call-saved register r7 in error case:
function or stack is NULL.
This patch restores the saved registers also in the error case.
Furthermore the existing test misc/tst-clone is extended to check
all error cases and that clone does not clobber registers in this
error case.
When glibc is built with ISA level 3 or higher by default, the resulting
glibc binaries won't run on SSE or FMA4 processors. Exclude SSE, AVX and
FMA4 variants in libm multiarch when ISA level 3 or higher is enabled by
default.
When glibc is built with ISA level 2 enabled by default, only keep SSE4.1
variant.
Fixes BZ 31335.
NB: elf/tst-valgrind-smoke test fails with ISA level 4, because valgrind
doesn't support AVX512 instructions:
https://bugs.kde.org/show_bug.cgi?id=383010
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Reflow and sort Makefile.
Code generation changes present due to link order changes.
No regressions on x86_64 and i686.
Tested with build-many-glibcs.py for x86_64-gnu.
Reflow and sort Makefile.
No code generation changes in non-test binary artifacts.
No regressions on x86_64 and i686.
Tested with build-many-glibcs.py for x86_64-gnu.
Add APX registers to STATE_SAVE_MASK so that APX registers are saved in
ld.so trampoline. This fixes BZ #31371.
Also update STATE_SAVE_OFFSET and STATE_SAVE_MASK for i386 which will
be used by i386 _dl_tlsdesc_dynamic.
Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
This patch adds more benchtests for rounding functions.
The double inputs are copied from trunc-inputs, the float inputs are copied from truncf-inputs. and the rintf is copied from rint-inputs.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Testing for `None`-ness with `==` operator is frowned upon and causes
warnings in at least "LGTM" python linter. Fix that.
Signed-off-by: Konstantin Kharlamov <Hi-Angel@yandex.ru>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The optimization is not faster than the generic algorithm,
using the bench-strstr the geometric mean running on a POWER10 machine
using gcc 13.1.1 is 482.47 while the default __strstr_ppc is 340.97
(which uses the generic implementation).
Also, there is no need to redirect the internal str*/mem* call
to optimized version, internal ifunc is supported and enabled
for internal calls (meaning that the generic implementation
will use any asm optimization if available).
Checked on powerpc64le-linux-gnu.
Reviewed-by: Peter Bergner <bergner@linux.ibm.com>
This patch adds some --disable-multi-arch variants for s390x.
As the used IFUNC variants and __GI symbols depend on the used
gcc -march=cpu-level, there are multiple new configurations.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The FSR version field is read-only and might be non-zero.
This allows math/test-fpucw* to correctly pass when the version is
non-zero.
Signed-off-by: Daniel Cederman <cederman@gaisler.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Commit ff026950e2 ("Add a C wrapper for
prctl [BZ #25896]") replaced the assembler wrapper with a C function.
However, on powerpc64le-linux-gnu, the C variadic function
implementation requires extra work in the caller to set up the
parameter save area. Calling a function that needs a parameter save
area without one (because the prototype used indicates the function is
not variadic) corrupts the caller's stack. The Linux manual pages
project documents prctl as a non-variadic function. This has resulted
in various projects over the years using non-variadic prototypes,
including the sanitizer libraries in LLVm and GCC (GCC PR 113728).
This commit switches back to the assembler implementation on most
targets and only keeps the C implementation for x86-64 x32.
Also add the __prctl_time64 alias from commit
b39ffab860 ("Linux: Add time64 alias for
prctl") to sysdeps/unix/sysv/linux/syscalls.list; it was not yet
present in commit ff026950e2.
This restores the old ABI on powerpc64le-linux-gnu, thus fixing
bug 29770.
Reviewed-By: Simon Chopin <simon.chopin@canonical.com>
Before this change, we incorrectly used the SSE2 variant in the
implementation, without checking that the system actually supports
SSE2.
Tested-by: Sam James <sam@gentoo.org>
'_' is used in Makefile variable names and many variables end with
"^# name". Relax sort-makefile-lines.py to allow '_' in name and
"^# name" as variable end. This fixes BZ #31385.
"number of arguments, from zero to five" is wrong, because on Linux maximal number
of arguments is 6, not 5. Also, maximal number of arguments is kernel-dependent,
so let's not include it here at all.
Moreover, "Each kind of system call has a definite number of arguments" is questionable.
Think about SYS_open on Linux, which takes 2 or 3 arguments. Or SYS_clone on Linux x86_64, which
takes 2 to 5 arguments. So I propose to fully remove this sentence.
Signed-off-by: Askar Safin <safinaskar@zohomail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
__builtin_ffs{,ll} basically on __builtin_ctz{,ll} in MIPS GCC compiler.
The hardware ctz instructions were available after MIPS{32,64} Release1. By using builtin ctz. It can also reduce code size of ffs/ffsll.
Checked on mips o32. mips64.
Signed-off-by: Junxian Zhu <zhujunxian@oss.cipunited.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Maxim Kuvyrkov <maxim.kuvyrkov@linaro.org>
For AMD Zen3+ architecture, the performance of the vectorized loop is
slightly better than ERMS.
Checked on x86_64-linux-gnu on Zen3.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
The REP MOVSB usage on memcpy/memmove does not show much performance
improvement on Zen3/Zen4 cores compared to the vectorized loops. Also,
as from BZ 30994, if the source is aligned and the destination is not
the performance can be 20x slower.
The performance difference is noticeable with small buffer sizes, closer
to the lower bounds limits when memcpy/memmove starts to use ERMS. The
performance of REP MOVSB is similar to vectorized instruction on the
size limit (the L2 cache). Also, there is no drawback to multiple cores
sharing the cache.
Checked on x86_64-linux-gnu on Zen3.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Some shadow stack test scripts use the '==' operator with the 'test'
command to validate exit codes resulting in the following error:
sysdeps/x86_64/tst-shstk-legacy-1e.sh: 31: test: 139: unexpected operator
The '==' operator is invalid for the 'test' command, use '-eq' like the
previous call to 'test'.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Change "\(" and "\)" to "\\(" and "\\)" in test_printers_common.py. This
fixes the test warning:
.../scripts/test_printers_common.py:101: SyntaxWarning: invalid escape sequence '\('
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Linux 6.7 adds a constant SOL_VSOCK (recall that various constants in
include/linux/socket.h are in fact part of the kernel-userspace API
despite that not being a uapi header). Add it to glibc's
bits/socket.h.
Tested for x86_64.