__pthread_attr_copy can fail and does not initialize the attribute
structure in that case.
If __pthread_attr_copy is never called and there is no allocated
attribute, pthread_attr_destroy should not be called, otherwise
there is a null pointer dereference in rt/tst-mqueue6.
Fixes commit 42d359350510506b87101cf77202fefcbfc790cb
("Use __pthread_attr_copy in mq_notify (bug 27896)").
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
(cherry picked from commit 217b6dc298156bdb0d6aea9ea93e7e394a5ff091)
Make a deep copy of the pthread attribute object to remove a potential
use-after-free issue.
(cherry picked from commit 42d359350510506b87101cf77202fefcbfc790cb)
- Add a newline to the end of error messages in transfer().
- Fixed the name of support_subprocess_init().
(cherry picked from commit 95c68080a3ded882789b1629f872c3ad531efda0)
Pass environ to posix_spawn so that the child process can inherit
environment of the test.
(cherry picked from commit e958490f8c74e660bd93c128b3bea746e268f3f6)
When parse_tunables tries to erase a tunable marked as SXID_ERASE for
setuid programs, it ends up setting the envvar string iterator
incorrectly, because of which it may parse the next tunable
incorrectly. Given that currently the implementation allows malformed
and unrecognized tunables pass through, it may even allow SXID_ERASE
tunables to go through.
This change revamps the SXID_ERASE implementation so that:
- Only valid tunables are written back to the tunestr string, because
of which children of SXID programs will only inherit a clean list of
identified tunables that are not SXID_ERASE.
- Unrecognized tunables get scrubbed off from the environment and
subsequently from the child environment.
- This has the side-effect that a tunable that is not identified by
the setxid binary, will not be passed on to a non-setxid child even
if the child could have identified that tunable. This may break
applications that expect this behaviour but expecting such tunables
to cross the SXID boundary is wrong.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit 2ed18c5b534d9e92fc006202a5af0df6b72e7aca)
Instead of passing GLIBC_TUNABLES via the environment, pass the
environment variable from parent to child. This allows us to test
multiple variables to ensure better coverage.
The test list currently only includes the case that's already being
tested. More tests will be added later.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit 061fe3f8add46a89b7453e87eabb9c4695005ced)
Use the support_capture_subprogram_self_sgid to spawn an sgid child.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit ca335281068a1ed549a75ee64f90a8310755956f)
Add a new function support_capture_subprogram_self_sgid that spawns an
sgid child of the running program with its own image and returns the
exit code of the child process. This functionality is used by at
least three tests in the testsuite at the moment, so it makes sense to
consolidate.
There is also a new function support_subprogram_wait which should
provide simple system() like functionality that does not set up file
actions. This is useful in cases where only the return code of the
spawned subprocess is interesting.
This patch also ports tst-secure-getenv to this new function. A
subsequent patch will port other tests. This also brings an important
change to tst-secure-getenv behaviour. Now instead of succeeding, the
test fails as UNSUPPORTED if it is unable to spawn a setgid child,
which is how it should have been in the first place.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit 716a3bdc41b2b4b864dc64475015ba51e35e1273)
The arch13 memmove variant is currently selected by the ifunc selector
if the Miscellaneous-Instruction-Extensions Facility 3 facility bit
is present, but the function is also using vector instructions.
If the vector support is not present, one is receiving an operation
exception.
Therefore this patch also checks for vector support in the ifunc
selector and in ifunc-impl-list.c.
Just to be sure, the configure check is now also testing an arch13
vector instruction and an arch13 Miscellaneous-Instruction-Extensions
Facility 3 instruction.
(cherry picked from commit 7759be2593b689cb1eafc0f52ee7f59c639e5d2f)
A not so recent kernel change[1] changed how the trampoline
`__kernel_sigtramp_rt64` is used to call signal handlers.
This was exposed on the test misc/tst-sigcontext-get_pc
Before kernel 5.9, the kernel set LR to the trampoline address and
jumped directly to the signal handler, and at the end the signal
handler, as any other function, would `blr` to the address set. In
other words, the trampoline was executed just at the end of the signal
handler and the only thing it did was call sigreturn. But since
kernel 5.9 the kernel set CTRL to the signal handler and calls to the
trampoline code, the trampoline then `bctrl` to the address in CTRL,
setting the LR to the next instruction in the middle of the
trampoline, when the signal handler returns, the rest of the
trampoline code executes the same code as before.
Here is the full trampoline code as of kernel 5.11.0-rc5 for
reference:
V_FUNCTION_BEGIN(__kernel_sigtramp_rt64)
.Lsigrt_start:
bctrl /* call the handler */
addi r1, r1, __SIGNAL_FRAMESIZE
li r0,__NR_rt_sigreturn
sc
.Lsigrt_end:
V_FUNCTION_END(__kernel_sigtramp_rt64)
This new behavior breaks how `backtrace()` uses to detect the
trampoline frame to correctly reconstruct the stack frame when it is
called from inside a signal handling.
This workaround rely on the fact that the trampoline code is at very
least two (maybe 3?) instructions in size (as it is in the 32 bits
version, only on `li` and `sc`), so it is safe to check the return
address be in the range __kernel_sigtramp_rt64 .. + 4.
[1] subject: powerpc/64/signal: Balance return predictor stack in signal trampoline
commit: 0138ba5783ae0dcc799ad401a1e8ac8333790df9
url: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0138ba5783ae0dcc799ad401a1e8ac8333790df9
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit 5ee506ed35a2c9184bcb1fb5e79b6cceb9bb0dd1)
In commit 745664bd798ec8fd50438605948eea594179fba1 a use-after-free
was fixed, but this led to an occasional double-free. This patch
tracks the "live" allocation better.
Tested manually by a third party.
Related: RHBZ 1927877
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit dca565886b5e8bd7966e15f0ca42ee5cff686673)
The conversion loop to the internal encoding does not follow
the interface contract that __GCONV_FULL_OUTPUT is only returned
after the internal wchar_t buffer has been filled completely. This
is enforced by the first of the two asserts in iconv/skeleton.c:
/* We must run out of output buffer space in this
rerun. */
assert (outbuf == outerr);
assert (nstatus == __GCONV_FULL_OUTPUT);
This commit solves this issue by queuing a second wide character
which cannot be written immediately in the state variable, like
other converters already do (e.g., BIG5-HKSCS or TSCII).
Reported-by: Tavis Ormandy <taviso@gmail.com>
(cherry picked from commit 7d88c6142c6efc160c0ee5e4f85cde382c072888)
A bti c was missing from rcrt1.o which made all -static-pie
binaries fail at program startup on BTI enabled systems.
Fixes bug 27068.
(cherry picked from commit d4136903a29baabeec8987b53081def8b4a49826)
As noted in <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97264>, the cast
in the call to the read_int function is an aliasing violation. Change the
type of local variable f to a pointer to unsigned, which allows to
eliminate most casts while only adding three new ones.
(cherry picked from commit c0e9ddf59e73e21afe15fca4e94cf7b4b7359bf2)
Re-mmap executable segments if possible instead of using mprotect
to add PROT_BTI. This allows using BTI protection with security
policies that prevent mprotect with PROT_EXEC.
If the fd of the ELF module is not available because it was kernel
mapped then mprotect is used and failures are ignored. To protect
the main executable even when mprotect is filtered the linux kernel
will have to be changed to add PROT_BTI to it.
The delayed failure reporting is mainly needed because currently
_dl_process_gnu_properties does not propagate failures such that
the required cleanups happen. Using the link_map_machine struct for
error propagation is not ideal, but this seemed to be the least
intrusive solution.
Fixes bug 26831.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit cd543b5eb3642d76e365a131ce676f31fe3f1dd4)
To handle GNU property notes on aarch64 some segments need to
be mmaped again, so the fd of the loaded ELF module is needed.
When the fd is not available (kernel loaded modules), then -1
is passed.
The fd is passed to both _dl_process_pt_gnu_property and
_dl_process_pt_note for consistency. Target specific note
processing functions are updated accordingly.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit c00452d7757a300931ee186d043c43b48eeb0875)
Program headers are processed in two pass: after the first pass
load segments are mmapped so in the second pass target specific
note processing logic can access the notes.
The second pass is moved later so various link_map fields are
set up that may be useful for note processing such as l_phdr.
The second pass should be before the fd is closed so that is
available.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit 38a3836011f3fe3290a94ab136dcb5f3c5c9f4e2)
elf: Fix dl-load.c
Rebasing broke commit 38a3836011f3fe3290a94ab136dcb5f3c5c9f4e2
it was supposed to move code.
(cherry picked from commit 751acde7ec335506b54e94ed6f2c998f6c0a22c6)
Handle unaligned executable load segments (the bfd linker is not
expected to produce such binaries, but other linkers may).
Computing the mapping bounds follows _dl_map_object_from_fd more
closely now.
Fixes bug 26988.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit 8b8f616e6a594b91d0afb152384bf2a9f72b7288)
The _dl_open_check and _rtld_main_check hooks are not called on the
dependencies of a loaded module, so BTI protection was missed on
every module other than the main executable and directly dlopened
libraries.
The fix just iterates over dependencies to enable BTI.
Fixes bug 26926.
(cherry picked from commit 72739c79f61989a76b7dd719f34fcfb7b8eadde9)
Calling an IFUNC function defined in unrelocated executable also leads to
segfault. Issue a fatal error message when calling IFUNC function defined
in the unrelocated executable from a shared library.
On x86, ifuncmain6pie failed with:
[hjl@gnu-cfl-2 build-i686-linux]$ ./elf/ifuncmain6pie --direct
./elf/ifuncmain6pie: IFUNC symbol 'foo' referenced in '/export/build/gnu/tools-build/glibc-32bit/build-i686-linux/elf/ifuncmod6.so' is defined in the executable and creates an unsatisfiable circular dependency.
[hjl@gnu-cfl-2 build-i686-linux]$ readelf -rW elf/ifuncmod6.so | grep foo
00003ff4 00000706 R_386_GLOB_DAT 0000400c foo_ptr
00003ff8 00000406 R_386_GLOB_DAT 00000000 foo
0000400c 00000401 R_386_32 00000000 foo
[hjl@gnu-cfl-2 build-i686-linux]$
Remove non-JUMP_SLOT relocations against foo in ifuncmod6.so, which
trigger the circular IFUNC dependency, and build ifuncmain6pie with
-Wl,-z,lazy.
(cherry picked from commits 6ea5b57afa5cdc9ce367d2b69a2cebfb273e4617
and 7137d682ebfcb6db5dfc5f39724718699922f06c)
Update dl_cet_check() to set header.feature_1 in TCB when both IBT and
SHSTK are always on.
(cherry picked from commit 2ef23b520597f4ea1790a669b83e608f24f4cf12)
When copying with "rep movsb", if the distance between source and
destination is N*4GB + [1..63] with N >= 0, performance may be very
slow. This patch updates memmove-vec-unaligned-erms.S for AVX and
AVX512 versions with the distance in RCX:
cmpl $63, %ecx
// Don't use "rep movsb" if ECX <= 63
jbe L(Don't use rep movsb")
Use "rep movsb"
Benchtests data with bench-memcpy, bench-memcpy-large, bench-memcpy-random
and bench-memcpy-walk on Skylake, Ice Lake and Tiger Lake show that its
performance impact is within noise range as "rep movsb" is only used for
data size >= 4KB.
(cherry picked from commit 3ec5d83d2a237d39e7fd6ef7a0bc8ac4c171a4a5)
The byte 0xfe as input to the EUC-KR conversion denotes a user-defined
area and is not allowed. The from_euc_kr function used to skip two bytes
when told to skip over the unknown designation, potentially running over
the buffer end.
(cherry picked from commit ee7a3144c9922808181009b7b3e50e852fb4999b)
This new variable allows various subsystems in glibc to run all or
some of their tests with MALLOC_CHECK_=3. This patch adds
infrastructure support for this variable as well as an implementation
in malloc/Makefile to allow running some of the tests with
MALLOC_CHECK_=3.
At present some tests in malloc/ have been excluded from the mcheck
tests either because they're specifically testing MALLOC_CHECK_ or
they are failing in master even without the Memory Tagging patches
that prompted this work. Some tests were reviewed and found to need
specific error points that MALLOC_CHECK_ defeats by terminating early
but a thorough review of all tests is needed to bring them into mcheck
coverage.
Backported from 4f969166ce4ab535fa798dcbaa5de4c4e05773ec.
The IBM1364, IBM1371, IBM1388, IBM1390 and IBM1399 character sets
share converter logic (iconvdata/ibm1364.c) which would reject
redundant shift sequences when processing input in these character
sets. This led to a hang in the iconv program (CVE-2020-27618).
This commit adjusts the converter to ignore redundant shift sequences
and adds test cases for iconv_prog hangs that would be triggered upon
their rejection. This brings the implementation in line with other
converters that also ignore redundant shift sequences (e.g. IBM930
etc., fixed in commit 692de4b3960d).
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
(cherry picked from commit 9a99c682144bdbd40792ebf822fe9264e0376fb5)
The commit 605f38177db (sh: Split BE/LE abilist) did not take in
consideration the SH4 fpu support.
Checked with a build for sh4-linux-gnu and manually checked that
the implementations at sysdeps/sh/sh4/fpu/ are selected.
John Paul Adrian Glaubitz also confirmed it fixes the build issues
he encontered.
(cherry-picked from 9ff2674ef82eccd5ae5dfa6bb733c0e3613764c6)
__attribute__((__aligned__)) selects an alignment that depends on
the micro-architecture selected by GCC flags. Enabling vector
extensions may increase the allignment. This is a problem when
building glibc as a collection of ELF multilibs with different
GCC flags because ld.so and libc.so/libpthread.so/&c may end up
with a different layout of struct pthread because of the
changing offset of its struct _Unwind_Exception field.
Tested-By: Matheus Castanho <msc@linux.ibm.com>
(cherry picked from commit 30af7c7fa13e17d82c3f1f91536384715844f432)
When switching name servers, response processing by two server
threads clobbers the global test state. (There is still some
risk that this test is negatively impact by packet drops and
packet reordering, but this applies to many of the resolver tests
and is difficult to avoid.)
Fixes commit f1f00c072138af90ae6da180f260111f09afe7a3 ("resolv:
Handle transaction ID collisions in parallel queries (bug 26600)").
(cherry picked from commit b8b53b338f6da91e86d115a39da860cefac736ad)
If the transaction IDs are equal, the old check attributed both
responses to the first query, not recognizing the second response.
This fixes bug 26600.
(cherry picked from commit f1f00c072138af90ae6da180f260111f09afe7a3)
The macro is not used anymore, so remove it and warning-nop.c.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
(cherry-picked from 34aec973e15a81926198f4b71ff99081dff87a92)
Non-gcc compilers (clang and possibly other compilers that do not
masquerade as gcc 5.0 or later) are unable to use
__warn_memset_zero_len since the symbol is no longer available on
glibc built with gcc 5.0 or later. While it was likely an oversight
that caused this omission, the fact that it wasn't noticed until
recently (when clang closed the gap on _FORTIFY_SUPPORT) that the
symbol was missing.
Given that both gcc and clang are capable of doing this check in the
compiler, drop all remaining signs of __warn_memset_zero_len from
glibc so that no more objects are built with this symbol in future.
(cherry-picked from dc274b141666766b8ef70992d887e3c0c5e41bed)
This adds CFI directives which communicate that the stack ends
with this function.
Fixes bug 26853.
(cherry picked from commit 5edf3d9fd6efe06fda37b2a460e60690a90457a4)
The variant PCS support was ineffective because in the common case
linkmap->l_mach.plt == 0 but then the symbol table flags were ignored
and normal lazy binding was used instead of resolving the relocs early.
(This was a misunderstanding about how GOT[1] is setup by the linker.)
In practice this mainly affects SVE calls when the vector length is
more than 128 bits, then the top bits of the argument registers get
clobbered during lazy binding.
Fixes bug 26798.
(cherry picked from commit 558251bd8785760ad40fcbfeaaee5d27fa5b0fe4)
Modifying the shareable cache '__x86_shared_cache_size', which is a
factor in computing the non-temporal threshold parameter
'__x86_shared_non_temporal_threshold' to optimize memcpy for AMD Zen
architectures.
In the existing implementation, the shareable cache is computed as 'L3
per thread, L2 per core'. Recomputing this shareable cache as 'L3 per
CCX(Core-Complex)' has brought in performance gains.
As per the large bench variant results, this patch also addresses the
regression problem on AMD Zen architectures.
Backport of commit 59803e81f96b479c17f583b31eac44b57591a1bf upstream,
with the fix from cb3a749a22a55645dc6a52659eea765300623f98 ("x86:
Restore processing of cache size tunables in init_cacheinfo") applied.
Reviewed-by: Premachandra Mallappa <premachandra.mallappa@amd.com>
Co-Authored-by: Florian Weimer <fweimer@redhat.com>
The __x86_shared_non_temporal_threshold determines when memcpy on x86
uses non_temporal stores to avoid pushing other data out of the last
level cache.
This patch proposes to revert the calculation change made by H.J. Lu's
patch of June 2, 2017.
H.J. Lu's patch selected a threshold suitable for a single thread
getting maximum performance. It was tuned using the single threaded
large memcpy micro benchmark on an 8 core processor. The last change
changes the threshold from using 3/4 of one thread's share of the
cache to using 3/4 of the entire cache of a multi-threaded system
before switching to non-temporal stores. Multi-threaded systems with
more than a few threads are server-class and typically have many
active threads. If one thread consumes 3/4 of the available cache for
all threads, it will cause other active threads to have data removed
from the cache. Two examples show the range of the effect. John
McCalpin's widely parallel Stream benchmark, which runs in parallel
and fetches data sequentially, saw a 20% slowdown with this patch on
an internal system test of 128 threads. This regression was discovered
when comparing OL8 performance to OL7. An example that compares
normal stores to non-temporal stores may be found at
https://vgatherps.github.io/2018-09-02-nontemporal/. A simple test
shows performance loss of 400 to 500% due to a failure to use
nontemporal stores. These performance losses are most likely to occur
when the system load is heaviest and good performance is critical.
The tunable x86_non_temporal_threshold can be used to override the
default for the knowledgable user who really wants maximum cache
allocation to a single thread in a multi-threaded system.
The manual entry for the tunable has been expanded to provide
more information about its purpose.
modified: sysdeps/x86/cacheinfo.c
modified: manual/tunables.texi
(cherry picked from commit d3c57027470b78dba79c6d931e4e409b1fecfc80)
Both commands are Linux extensions where the third argument is either
a 'struct shminfo' (IPC_INFO) or a 'struct shm_info' (SHM_INFO) instead
of 'struct shmid_ds'. And their information does not contain any time
related fields, so there is no need to extra conversion for __IPC_TIME64.
The regression testcase checks for Linux specifix SysV ipc message
control extension. For SHM_INFO it tries to match the values against the
tunable /proc values and for MSG_STAT/MSG_STAT_ANY it check if the create\
shared memory is within the global list returned by the kernel.
Checked on x86_64-linux-gnu and on i686-linux-gnu (Linux v5.4 and on
Linux v4.15).
(cherry picked from commit a49d7fd4f764e97ccaf922e433046590ae52fce9)
Both commands are Linux extensions where the third argument is a
'struct msginfo' instead of 'struct msqid_ds' and its information
does not contain any time related fields (so there is no need to
extra conversion for __IPC_TIME64.
The regression testcase checks for Linux specifix SysV ipc message
control extension. For IPC_INFO/MSG_INFO it tries to match the values
against the tunable /proc values and for MSG_STAT/MSG_STAT_ANY it
check if the create message queue is within the global list returned
by the kernel.
Checked on x86_64-linux-gnu and on i686-linux-gnu (Linux v5.4 and on
Linux v4.15).
(cherry picked from commit 20a00dbefca5695cccaa44846a482db8ccdd85ab)
Handle SEM_STAT_ANY the same way as SEM_STAT so that the buffer argument
of SEM_STAT_ANY is properly passed to the kernel and back.
The regression testcase checks for Linux specifix SysV ipc message
control extension. For IPC_INFO/SEM_INFO it tries to match the values
against the tunable /proc values and for SEM_STAT/SEM_STAT_ANY it
check if the create message queue is within the global list returned
by the kernel.
Checked on x86_64-linux-gnu and on i686-linux-gnu (Linux v5.4 and on
Linux v4.15).
Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit 574500a108be1d2a6a0dc97a075c9e0a98371aba)
Add CPU detection of Neoverse N2 and Neoverse V1, and select __memcpy_simd as
the memcpy/memmove ifunc.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit e11ed9d2b4558eeacff81557dc9557001af42a6b)
On some microarchitectures performance of the backwards memmove improves if
the stores use STR with decreasing addresses. So change the memmove loop
in memcpy_advsimd.S to use 2x STR rather than STP.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit bd394d131c10c9ec22c6424197b79410042eed99)
The RELEASE macro was accidentaly set to "release" instead of
the expected "stable" by the release manager. This is a mistake
that leads to the build using "-g -O1" instead of "-g -O2" if
configure was executed with "CFLAGS=" (CFLAGS set but empty).
It returns the string of the error constant, not its description (as
strerrordesc_np). To handle the Hurd error mapping, the ERR_MAP was
removed from errlist.h to errlist.c.
Also, the testcase test-strerr (added on 325081b9eb2) was not added
on the check build neither it builds correctly. This patch also
changed it to decouple from errlist.h, the expected return values
are added explicitly for both both strerrorname_np and strerrordesc_np
directly.
Checked on x86_64-linux-gnu and i686-linux-gnu. I also run a make
check for i686-gnu.
(cherry picked from commit cef95fdc2e4002ee6357d8d40ef73c8d875720e3)
Commit 91927b7c7643 (Rewrite iconv option parsing [BZ #19519]) did not
handle cases where the output codeset for translations (via the `gettext'
family of functions) might have a caller specified encoding suffix such as
TRANSLIT or IGNORE. This led to a regression where translations did not
work when the codeset had a suffix.
This commit fixes the above issue by parsing any suffixes passed to
__dcigettext and adds two new test-cases to intl/tst-codeset.c to
verify correct behaviour. The iconv-internal function __gconv_create_spec
and the static iconv-internal function gconv_destroy_spec are now visible
internally within glibc and used in intl/dcigettext.c.
(cherry picked from commit 7d4ec75e111291851620c6aa2c4460647b7fd50d)
A typo in commit 107e6a3c2212ba7a3a4ec7cae8d82d73f7c95d0b causes the
FMA4 code path to be taken on systems that support FMA, even if they do
not support FMA4. Fix this to detect FMA4.
(cherry picked from commit 23af890b3f04e80da783ba64e6b6d94822e01d54)