glibc/stdlib/cxa_thread_atexit_impl.c

167 lines
6.8 KiB
C
Raw Permalink Normal View History

/* Register destructors for C++ TLS variables declared with thread_local.
Copyright (C) 2013-2024 Free Software Foundation, Inc.
This file is part of the GNU C Library.
The GNU C Library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.
The GNU C Library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public
License along with the GNU C Library; if not, see
Prefer https to http for gnu.org and fsf.org URLs Also, change sources.redhat.com to sourceware.org. This patch was automatically generated by running the following shell script, which uses GNU sed, and which avoids modifying files imported from upstream: sed -ri ' s,(http|ftp)(://(.*\.)?(gnu|fsf|sourceware)\.org($|[^.]|\.[^a-z])),https\2,g s,(http|ftp)(://(.*\.)?)sources\.redhat\.com($|[^.]|\.[^a-z]),https\2sourceware.org\4,g ' \ $(find $(git ls-files) -prune -type f \ ! -name '*.po' \ ! -name 'ChangeLog*' \ ! -path COPYING ! -path COPYING.LIB \ ! -path manual/fdl-1.3.texi ! -path manual/lgpl-2.1.texi \ ! -path manual/texinfo.tex ! -path scripts/config.guess \ ! -path scripts/config.sub ! -path scripts/install-sh \ ! -path scripts/mkinstalldirs ! -path scripts/move-if-change \ ! -path INSTALL ! -path locale/programs/charmap-kw.h \ ! -path po/libc.pot ! -path sysdeps/gnu/errlist.c \ ! '(' -name configure \ -execdir test -f configure.ac -o -f configure.in ';' ')' \ ! '(' -name preconfigure \ -execdir test -f preconfigure.ac ';' ')' \ -print) and then by running 'make dist-prepare' to regenerate files built from the altered files, and then executing the following to cleanup: chmod a+x sysdeps/unix/sysv/linux/riscv/configure # Omit irrelevant whitespace and comment-only changes, # perhaps from a slightly-different Autoconf version. git checkout -f \ sysdeps/csky/configure \ sysdeps/hppa/configure \ sysdeps/riscv/configure \ sysdeps/unix/sysv/linux/csky/configure # Omit changes that caused a pre-commit check to fail like this: # remote: *** error: sysdeps/powerpc/powerpc64/ppc-mcount.S: trailing lines git checkout -f \ sysdeps/powerpc/powerpc64/ppc-mcount.S \ sysdeps/unix/sysv/linux/s390/s390-64/syscall.S # Omit change that caused a pre-commit check to fail like this: # remote: *** error: sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S: last line does not end in newline git checkout -f sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S
2019-09-07 13:40:42 +08:00
<https://www.gnu.org/licenses/>. */
Also use l_tls_dtor_count to decide on object unload (BZ #18657) When an TLS destructor is registered, we set the DF_1_NODELETE flag to signal that the object should not be destroyed. We then clear the DF_1_NODELETE flag when all destructors are called, which is wrong - the flag could have been set by other means too. This patch replaces this use of the flag by using l_tls_dtor_count directly to determine whether it is safe to unload the object. This change has the added advantage of eliminating the lock taking when calling the destructors, which could result in a deadlock. The patch also fixes the test case tst-tls-atexit - it was making an invalid dlclose call, which would just return an error silently. I have also added a detailed note on concurrency which also aims to justify why I chose the semantics I chose for accesses to l_tls_dtor_count. Thanks to Torvald for his help in getting me started on this and (literally) teaching my how to approach the problem. Change verified on x86_64; the test suite does not show any regressions due to the patch. ChangeLog: [BZ #18657] * elf/dl-close.c (_dl_close_worker): Don't unload DSO if there are pending TLS destructor calls. * include/link.h (struct link_map): Add concurrency note for L_TLS_DTOR_COUNT. * stdlib/cxa_thread_atexit_impl.c (__cxa_thread_atexit_impl): Don't touch the link map flag. Atomically increment l_tls_dtor_count. (__call_tls_dtors): Atomically decrement l_tls_dtor_count. Avoid taking the load lock and don't touch the link map flag. * stdlib/tst-tls-atexit-nodelete.c: New test case. * stdlib/Makefile (tests): Use it. * stdlib/tst-tls-atexit.c (do_test): dlopen tst-tls-atexit-lib.so again before dlclose. Add conditionals to allow tst-tls-atexit-nodelete test case to use it.
2015-07-23 13:46:18 +08:00
/* CONCURRENCY NOTES:
This documents concurrency for the non-POD TLS destructor registration,
calling and destruction. The functions __cxa_thread_atexit_impl,
_dl_close_worker and __call_tls_dtors are the three main routines that may
run concurrently and access shared data. The shared data in all possible
combinations of all three functions are the link map list, a link map for a
DSO and the link map member l_tls_dtor_count.
__cxa_thread_atexit_impl acquires the dl_load_lock before accessing any
shared state and hence multiple of its instances can safely execute
concurrently.
Also use l_tls_dtor_count to decide on object unload (BZ #18657) When an TLS destructor is registered, we set the DF_1_NODELETE flag to signal that the object should not be destroyed. We then clear the DF_1_NODELETE flag when all destructors are called, which is wrong - the flag could have been set by other means too. This patch replaces this use of the flag by using l_tls_dtor_count directly to determine whether it is safe to unload the object. This change has the added advantage of eliminating the lock taking when calling the destructors, which could result in a deadlock. The patch also fixes the test case tst-tls-atexit - it was making an invalid dlclose call, which would just return an error silently. I have also added a detailed note on concurrency which also aims to justify why I chose the semantics I chose for accesses to l_tls_dtor_count. Thanks to Torvald for his help in getting me started on this and (literally) teaching my how to approach the problem. Change verified on x86_64; the test suite does not show any regressions due to the patch. ChangeLog: [BZ #18657] * elf/dl-close.c (_dl_close_worker): Don't unload DSO if there are pending TLS destructor calls. * include/link.h (struct link_map): Add concurrency note for L_TLS_DTOR_COUNT. * stdlib/cxa_thread_atexit_impl.c (__cxa_thread_atexit_impl): Don't touch the link map flag. Atomically increment l_tls_dtor_count. (__call_tls_dtors): Atomically decrement l_tls_dtor_count. Avoid taking the load lock and don't touch the link map flag. * stdlib/tst-tls-atexit-nodelete.c: New test case. * stdlib/Makefile (tests): Use it. * stdlib/tst-tls-atexit.c (do_test): dlopen tst-tls-atexit-lib.so again before dlclose. Add conditionals to allow tst-tls-atexit-nodelete test case to use it.
2015-07-23 13:46:18 +08:00
_dl_close_worker acquires the dl_load_lock before accessing any shared state
as well and hence can concurrently execute multiple of its own instances as
Also use l_tls_dtor_count to decide on object unload (BZ #18657) When an TLS destructor is registered, we set the DF_1_NODELETE flag to signal that the object should not be destroyed. We then clear the DF_1_NODELETE flag when all destructors are called, which is wrong - the flag could have been set by other means too. This patch replaces this use of the flag by using l_tls_dtor_count directly to determine whether it is safe to unload the object. This change has the added advantage of eliminating the lock taking when calling the destructors, which could result in a deadlock. The patch also fixes the test case tst-tls-atexit - it was making an invalid dlclose call, which would just return an error silently. I have also added a detailed note on concurrency which also aims to justify why I chose the semantics I chose for accesses to l_tls_dtor_count. Thanks to Torvald for his help in getting me started on this and (literally) teaching my how to approach the problem. Change verified on x86_64; the test suite does not show any regressions due to the patch. ChangeLog: [BZ #18657] * elf/dl-close.c (_dl_close_worker): Don't unload DSO if there are pending TLS destructor calls. * include/link.h (struct link_map): Add concurrency note for L_TLS_DTOR_COUNT. * stdlib/cxa_thread_atexit_impl.c (__cxa_thread_atexit_impl): Don't touch the link map flag. Atomically increment l_tls_dtor_count. (__call_tls_dtors): Atomically decrement l_tls_dtor_count. Avoid taking the load lock and don't touch the link map flag. * stdlib/tst-tls-atexit-nodelete.c: New test case. * stdlib/Makefile (tests): Use it. * stdlib/tst-tls-atexit.c (do_test): dlopen tst-tls-atexit-lib.so again before dlclose. Add conditionals to allow tst-tls-atexit-nodelete test case to use it.
2015-07-23 13:46:18 +08:00
well as those of __cxa_thread_atexit_impl safely. Not all accesses to
l_tls_dtor_count are protected by the dl_load_lock, so we need to
synchronize using atomics.
Also use l_tls_dtor_count to decide on object unload (BZ #18657) When an TLS destructor is registered, we set the DF_1_NODELETE flag to signal that the object should not be destroyed. We then clear the DF_1_NODELETE flag when all destructors are called, which is wrong - the flag could have been set by other means too. This patch replaces this use of the flag by using l_tls_dtor_count directly to determine whether it is safe to unload the object. This change has the added advantage of eliminating the lock taking when calling the destructors, which could result in a deadlock. The patch also fixes the test case tst-tls-atexit - it was making an invalid dlclose call, which would just return an error silently. I have also added a detailed note on concurrency which also aims to justify why I chose the semantics I chose for accesses to l_tls_dtor_count. Thanks to Torvald for his help in getting me started on this and (literally) teaching my how to approach the problem. Change verified on x86_64; the test suite does not show any regressions due to the patch. ChangeLog: [BZ #18657] * elf/dl-close.c (_dl_close_worker): Don't unload DSO if there are pending TLS destructor calls. * include/link.h (struct link_map): Add concurrency note for L_TLS_DTOR_COUNT. * stdlib/cxa_thread_atexit_impl.c (__cxa_thread_atexit_impl): Don't touch the link map flag. Atomically increment l_tls_dtor_count. (__call_tls_dtors): Atomically decrement l_tls_dtor_count. Avoid taking the load lock and don't touch the link map flag. * stdlib/tst-tls-atexit-nodelete.c: New test case. * stdlib/Makefile (tests): Use it. * stdlib/tst-tls-atexit.c (do_test): dlopen tst-tls-atexit-lib.so again before dlclose. Add conditionals to allow tst-tls-atexit-nodelete test case to use it.
2015-07-23 13:46:18 +08:00
__call_tls_dtors accesses the l_tls_dtor_count without taking the lock; it
decrements the value by one. It does not need the big lock because it does
not access any other shared state except for the current DSO link map and
its member l_tls_dtor_count.
Correspondingly, _dl_close_worker loads l_tls_dtor_count and if it is zero,
unloads the DSO, thus deallocating the current link map. This is the goal
of maintaining l_tls_dtor_count - to unload the DSO and free resources if
there are no pending destructors to be called.
We want to eliminate the inconsistent state where the DSO is unloaded in
_dl_close_worker before it is used in __call_tls_dtors. This could happen
if __call_tls_dtors uses the link map after it sets l_tls_dtor_count to 0,
since _dl_close_worker will conclude from the 0 l_tls_dtor_count value that
it is safe to unload the DSO. Hence, to ensure that this does not happen,
the following conditions must be met:
1. In _dl_close_worker, the l_tls_dtor_count load happens before the DSO is
unloaded and its link map is freed
Also use l_tls_dtor_count to decide on object unload (BZ #18657) When an TLS destructor is registered, we set the DF_1_NODELETE flag to signal that the object should not be destroyed. We then clear the DF_1_NODELETE flag when all destructors are called, which is wrong - the flag could have been set by other means too. This patch replaces this use of the flag by using l_tls_dtor_count directly to determine whether it is safe to unload the object. This change has the added advantage of eliminating the lock taking when calling the destructors, which could result in a deadlock. The patch also fixes the test case tst-tls-atexit - it was making an invalid dlclose call, which would just return an error silently. I have also added a detailed note on concurrency which also aims to justify why I chose the semantics I chose for accesses to l_tls_dtor_count. Thanks to Torvald for his help in getting me started on this and (literally) teaching my how to approach the problem. Change verified on x86_64; the test suite does not show any regressions due to the patch. ChangeLog: [BZ #18657] * elf/dl-close.c (_dl_close_worker): Don't unload DSO if there are pending TLS destructor calls. * include/link.h (struct link_map): Add concurrency note for L_TLS_DTOR_COUNT. * stdlib/cxa_thread_atexit_impl.c (__cxa_thread_atexit_impl): Don't touch the link map flag. Atomically increment l_tls_dtor_count. (__call_tls_dtors): Atomically decrement l_tls_dtor_count. Avoid taking the load lock and don't touch the link map flag. * stdlib/tst-tls-atexit-nodelete.c: New test case. * stdlib/Makefile (tests): Use it. * stdlib/tst-tls-atexit.c (do_test): dlopen tst-tls-atexit-lib.so again before dlclose. Add conditionals to allow tst-tls-atexit-nodelete test case to use it.
2015-07-23 13:46:18 +08:00
2. The link map dereference in __call_tls_dtors happens before the
l_tls_dtor_count dereference.
To ensure this, the l_tls_dtor_count decrement in __call_tls_dtors should
have release semantics and the load in _dl_close_worker should have acquire
semantics.
Concurrent executions of __call_tls_dtors should only ensure that the value
is accessed atomically; no reordering constraints need to be considered.
Likewise for the increment of l_tls_dtor_count in __cxa_thread_atexit_impl.
There is still a possibility on concurrent execution of _dl_close_worker and
__call_tls_dtors where _dl_close_worker reads the value of l_tls_dtor_count
as 1, __call_tls_dtors decrements the value of l_tls_dtor_count but
_dl_close_worker does not unload the DSO, having read the old value. This
is not very different from a case where __call_tls_dtors is called after
_dl_close_worker on the DSO and hence is an accepted execution. */
#include <stdio.h>
#include <stdlib.h>
#include <ldsodefs.h>
#include <pointer_guard.h>
typedef void (*dtor_func) (void *);
struct dtor_list
{
dtor_func func;
void *obj;
struct link_map *map;
struct dtor_list *next;
};
static __thread struct dtor_list *tls_dtor_list;
static __thread void *dso_symbol_cache;
static __thread struct link_map *lm_cache;
/* Register a destructor for TLS variables declared with the 'thread_local'
keyword. This function is only called from code generated by the C++
compiler. FUNC is the destructor function and OBJ is the object to be
passed to the destructor. DSO_SYMBOL is the __dso_handle symbol that each
DSO has at a unique address in its map, added from crtbegin.o during the
linking phase. */
int
__cxa_thread_atexit_impl (dtor_func func, void *obj, void *dso_symbol)
{
PTR_MANGLE (func);
/* Prepend. */
struct dtor_list *new = calloc (1, sizeof (struct dtor_list));
if (__glibc_unlikely (new == NULL))
__libc_fatal ("Fatal glibc error: failed to register TLS destructor: "
"out of memory\n");
new->func = func;
new->obj = obj;
new->next = tls_dtor_list;
tls_dtor_list = new;
Also use l_tls_dtor_count to decide on object unload (BZ #18657) When an TLS destructor is registered, we set the DF_1_NODELETE flag to signal that the object should not be destroyed. We then clear the DF_1_NODELETE flag when all destructors are called, which is wrong - the flag could have been set by other means too. This patch replaces this use of the flag by using l_tls_dtor_count directly to determine whether it is safe to unload the object. This change has the added advantage of eliminating the lock taking when calling the destructors, which could result in a deadlock. The patch also fixes the test case tst-tls-atexit - it was making an invalid dlclose call, which would just return an error silently. I have also added a detailed note on concurrency which also aims to justify why I chose the semantics I chose for accesses to l_tls_dtor_count. Thanks to Torvald for his help in getting me started on this and (literally) teaching my how to approach the problem. Change verified on x86_64; the test suite does not show any regressions due to the patch. ChangeLog: [BZ #18657] * elf/dl-close.c (_dl_close_worker): Don't unload DSO if there are pending TLS destructor calls. * include/link.h (struct link_map): Add concurrency note for L_TLS_DTOR_COUNT. * stdlib/cxa_thread_atexit_impl.c (__cxa_thread_atexit_impl): Don't touch the link map flag. Atomically increment l_tls_dtor_count. (__call_tls_dtors): Atomically decrement l_tls_dtor_count. Avoid taking the load lock and don't touch the link map flag. * stdlib/tst-tls-atexit-nodelete.c: New test case. * stdlib/Makefile (tests): Use it. * stdlib/tst-tls-atexit.c (do_test): dlopen tst-tls-atexit-lib.so again before dlclose. Add conditionals to allow tst-tls-atexit-nodelete test case to use it.
2015-07-23 13:46:18 +08:00
/* We have to acquire the big lock to prevent a racing dlclose from pulling
our DSO from underneath us while we're setting up our destructor. */
__rtld_lock_lock_recursive (GL(dl_load_lock));
Also use l_tls_dtor_count to decide on object unload (BZ #18657) When an TLS destructor is registered, we set the DF_1_NODELETE flag to signal that the object should not be destroyed. We then clear the DF_1_NODELETE flag when all destructors are called, which is wrong - the flag could have been set by other means too. This patch replaces this use of the flag by using l_tls_dtor_count directly to determine whether it is safe to unload the object. This change has the added advantage of eliminating the lock taking when calling the destructors, which could result in a deadlock. The patch also fixes the test case tst-tls-atexit - it was making an invalid dlclose call, which would just return an error silently. I have also added a detailed note on concurrency which also aims to justify why I chose the semantics I chose for accesses to l_tls_dtor_count. Thanks to Torvald for his help in getting me started on this and (literally) teaching my how to approach the problem. Change verified on x86_64; the test suite does not show any regressions due to the patch. ChangeLog: [BZ #18657] * elf/dl-close.c (_dl_close_worker): Don't unload DSO if there are pending TLS destructor calls. * include/link.h (struct link_map): Add concurrency note for L_TLS_DTOR_COUNT. * stdlib/cxa_thread_atexit_impl.c (__cxa_thread_atexit_impl): Don't touch the link map flag. Atomically increment l_tls_dtor_count. (__call_tls_dtors): Atomically decrement l_tls_dtor_count. Avoid taking the load lock and don't touch the link map flag. * stdlib/tst-tls-atexit-nodelete.c: New test case. * stdlib/Makefile (tests): Use it. * stdlib/tst-tls-atexit.c (do_test): dlopen tst-tls-atexit-lib.so again before dlclose. Add conditionals to allow tst-tls-atexit-nodelete test case to use it.
2015-07-23 13:46:18 +08:00
/* See if we already encountered the DSO. */
if (__glibc_unlikely (dso_symbol_cache != dso_symbol))
{
ElfW(Addr) caller = (ElfW(Addr)) dso_symbol;
struct link_map *l = _dl_find_dso_for_object (caller);
/* If the address is not recognized the call comes from the main
program (we hope). */
lm_cache = l ? l : GL(dl_ns)[LM_ID_BASE]._ns_loaded;
}
Also use l_tls_dtor_count to decide on object unload (BZ #18657) When an TLS destructor is registered, we set the DF_1_NODELETE flag to signal that the object should not be destroyed. We then clear the DF_1_NODELETE flag when all destructors are called, which is wrong - the flag could have been set by other means too. This patch replaces this use of the flag by using l_tls_dtor_count directly to determine whether it is safe to unload the object. This change has the added advantage of eliminating the lock taking when calling the destructors, which could result in a deadlock. The patch also fixes the test case tst-tls-atexit - it was making an invalid dlclose call, which would just return an error silently. I have also added a detailed note on concurrency which also aims to justify why I chose the semantics I chose for accesses to l_tls_dtor_count. Thanks to Torvald for his help in getting me started on this and (literally) teaching my how to approach the problem. Change verified on x86_64; the test suite does not show any regressions due to the patch. ChangeLog: [BZ #18657] * elf/dl-close.c (_dl_close_worker): Don't unload DSO if there are pending TLS destructor calls. * include/link.h (struct link_map): Add concurrency note for L_TLS_DTOR_COUNT. * stdlib/cxa_thread_atexit_impl.c (__cxa_thread_atexit_impl): Don't touch the link map flag. Atomically increment l_tls_dtor_count. (__call_tls_dtors): Atomically decrement l_tls_dtor_count. Avoid taking the load lock and don't touch the link map flag. * stdlib/tst-tls-atexit-nodelete.c: New test case. * stdlib/Makefile (tests): Use it. * stdlib/tst-tls-atexit.c (do_test): dlopen tst-tls-atexit-lib.so again before dlclose. Add conditionals to allow tst-tls-atexit-nodelete test case to use it.
2015-07-23 13:46:18 +08:00
/* This increment may only be concurrently observed either by the decrement
in __call_tls_dtors since the other l_tls_dtor_count access in
_dl_close_worker is protected by the dl_load_lock. The execution in
Also use l_tls_dtor_count to decide on object unload (BZ #18657) When an TLS destructor is registered, we set the DF_1_NODELETE flag to signal that the object should not be destroyed. We then clear the DF_1_NODELETE flag when all destructors are called, which is wrong - the flag could have been set by other means too. This patch replaces this use of the flag by using l_tls_dtor_count directly to determine whether it is safe to unload the object. This change has the added advantage of eliminating the lock taking when calling the destructors, which could result in a deadlock. The patch also fixes the test case tst-tls-atexit - it was making an invalid dlclose call, which would just return an error silently. I have also added a detailed note on concurrency which also aims to justify why I chose the semantics I chose for accesses to l_tls_dtor_count. Thanks to Torvald for his help in getting me started on this and (literally) teaching my how to approach the problem. Change verified on x86_64; the test suite does not show any regressions due to the patch. ChangeLog: [BZ #18657] * elf/dl-close.c (_dl_close_worker): Don't unload DSO if there are pending TLS destructor calls. * include/link.h (struct link_map): Add concurrency note for L_TLS_DTOR_COUNT. * stdlib/cxa_thread_atexit_impl.c (__cxa_thread_atexit_impl): Don't touch the link map flag. Atomically increment l_tls_dtor_count. (__call_tls_dtors): Atomically decrement l_tls_dtor_count. Avoid taking the load lock and don't touch the link map flag. * stdlib/tst-tls-atexit-nodelete.c: New test case. * stdlib/Makefile (tests): Use it. * stdlib/tst-tls-atexit.c (do_test): dlopen tst-tls-atexit-lib.so again before dlclose. Add conditionals to allow tst-tls-atexit-nodelete test case to use it.
2015-07-23 13:46:18 +08:00
__call_tls_dtors does not really depend on this value beyond the fact that
it should be atomic, so Relaxed MO should be sufficient. */
atomic_fetch_add_relaxed (&lm_cache->l_tls_dtor_count, 1);
__rtld_lock_unlock_recursive (GL(dl_load_lock));
Also use l_tls_dtor_count to decide on object unload (BZ #18657) When an TLS destructor is registered, we set the DF_1_NODELETE flag to signal that the object should not be destroyed. We then clear the DF_1_NODELETE flag when all destructors are called, which is wrong - the flag could have been set by other means too. This patch replaces this use of the flag by using l_tls_dtor_count directly to determine whether it is safe to unload the object. This change has the added advantage of eliminating the lock taking when calling the destructors, which could result in a deadlock. The patch also fixes the test case tst-tls-atexit - it was making an invalid dlclose call, which would just return an error silently. I have also added a detailed note on concurrency which also aims to justify why I chose the semantics I chose for accesses to l_tls_dtor_count. Thanks to Torvald for his help in getting me started on this and (literally) teaching my how to approach the problem. Change verified on x86_64; the test suite does not show any regressions due to the patch. ChangeLog: [BZ #18657] * elf/dl-close.c (_dl_close_worker): Don't unload DSO if there are pending TLS destructor calls. * include/link.h (struct link_map): Add concurrency note for L_TLS_DTOR_COUNT. * stdlib/cxa_thread_atexit_impl.c (__cxa_thread_atexit_impl): Don't touch the link map flag. Atomically increment l_tls_dtor_count. (__call_tls_dtors): Atomically decrement l_tls_dtor_count. Avoid taking the load lock and don't touch the link map flag. * stdlib/tst-tls-atexit-nodelete.c: New test case. * stdlib/Makefile (tests): Use it. * stdlib/tst-tls-atexit.c (do_test): dlopen tst-tls-atexit-lib.so again before dlclose. Add conditionals to allow tst-tls-atexit-nodelete test case to use it.
2015-07-23 13:46:18 +08:00
new->map = lm_cache;
return 0;
}
/* Call the destructors. This is called either when a thread returns from the
initial function or when the process exits via the exit function. */
void
__call_tls_dtors (void)
{
while (tls_dtor_list)
{
struct dtor_list *cur = tls_dtor_list;
dtor_func func = cur->func;
PTR_DEMANGLE (func);
Also use l_tls_dtor_count to decide on object unload (BZ #18657) When an TLS destructor is registered, we set the DF_1_NODELETE flag to signal that the object should not be destroyed. We then clear the DF_1_NODELETE flag when all destructors are called, which is wrong - the flag could have been set by other means too. This patch replaces this use of the flag by using l_tls_dtor_count directly to determine whether it is safe to unload the object. This change has the added advantage of eliminating the lock taking when calling the destructors, which could result in a deadlock. The patch also fixes the test case tst-tls-atexit - it was making an invalid dlclose call, which would just return an error silently. I have also added a detailed note on concurrency which also aims to justify why I chose the semantics I chose for accesses to l_tls_dtor_count. Thanks to Torvald for his help in getting me started on this and (literally) teaching my how to approach the problem. Change verified on x86_64; the test suite does not show any regressions due to the patch. ChangeLog: [BZ #18657] * elf/dl-close.c (_dl_close_worker): Don't unload DSO if there are pending TLS destructor calls. * include/link.h (struct link_map): Add concurrency note for L_TLS_DTOR_COUNT. * stdlib/cxa_thread_atexit_impl.c (__cxa_thread_atexit_impl): Don't touch the link map flag. Atomically increment l_tls_dtor_count. (__call_tls_dtors): Atomically decrement l_tls_dtor_count. Avoid taking the load lock and don't touch the link map flag. * stdlib/tst-tls-atexit-nodelete.c: New test case. * stdlib/Makefile (tests): Use it. * stdlib/tst-tls-atexit.c (do_test): dlopen tst-tls-atexit-lib.so again before dlclose. Add conditionals to allow tst-tls-atexit-nodelete test case to use it.
2015-07-23 13:46:18 +08:00
tls_dtor_list = tls_dtor_list->next;
func (cur->obj);
Also use l_tls_dtor_count to decide on object unload (BZ #18657) When an TLS destructor is registered, we set the DF_1_NODELETE flag to signal that the object should not be destroyed. We then clear the DF_1_NODELETE flag when all destructors are called, which is wrong - the flag could have been set by other means too. This patch replaces this use of the flag by using l_tls_dtor_count directly to determine whether it is safe to unload the object. This change has the added advantage of eliminating the lock taking when calling the destructors, which could result in a deadlock. The patch also fixes the test case tst-tls-atexit - it was making an invalid dlclose call, which would just return an error silently. I have also added a detailed note on concurrency which also aims to justify why I chose the semantics I chose for accesses to l_tls_dtor_count. Thanks to Torvald for his help in getting me started on this and (literally) teaching my how to approach the problem. Change verified on x86_64; the test suite does not show any regressions due to the patch. ChangeLog: [BZ #18657] * elf/dl-close.c (_dl_close_worker): Don't unload DSO if there are pending TLS destructor calls. * include/link.h (struct link_map): Add concurrency note for L_TLS_DTOR_COUNT. * stdlib/cxa_thread_atexit_impl.c (__cxa_thread_atexit_impl): Don't touch the link map flag. Atomically increment l_tls_dtor_count. (__call_tls_dtors): Atomically decrement l_tls_dtor_count. Avoid taking the load lock and don't touch the link map flag. * stdlib/tst-tls-atexit-nodelete.c: New test case. * stdlib/Makefile (tests): Use it. * stdlib/tst-tls-atexit.c (do_test): dlopen tst-tls-atexit-lib.so again before dlclose. Add conditionals to allow tst-tls-atexit-nodelete test case to use it.
2015-07-23 13:46:18 +08:00
/* Ensure that the MAP dereference happens before
l_tls_dtor_count decrement. That way, we protect this access from a
potential DSO unload in _dl_close_worker, which happens when
l_tls_dtor_count is 0. See CONCURRENCY NOTES for more detail. */
atomic_fetch_add_release (&cur->map->l_tls_dtor_count, -1);
free (cur);
}
}
libc_hidden_def (__call_tls_dtors)