The __register_frame/__deregister_frame functions are used to register
unwinding frames from JITed code in a sorted list. That list itself
is protected by object_mutex, which leads to terrible performance
in multi-threaded code and is somewhat expensive even if single-threaded.
There was already a fast-path that avoided taking the mutex if no
frame was registered at all.
This commit eliminates both the mutex and the sorted list from
the atomic fast path, and replaces it with a btree that uses
optimistic lock coupling during lookup. This allows for fully parallel
unwinding and is essential to scale exception handling to large
core counts.
libgcc/ChangeLog:
* unwind-dw2-fde.c (release_registered_frames): Cleanup at shutdown.
(__register_frame_info_table_bases): Use btree in atomic fast path.
(__deregister_frame_info_bases): Likewise.
(_Unwind_Find_FDE): Likewise.
(base_from_object): Make parameter const.
(classify_object_over_fdes): Add query-only mode.
(get_pc_range): Compute PC range for lookup.
* unwind-dw2-fde.h (last_fde): Make parameter const.
* unwind-dw2-btree.h: New file.
This adds checks for _GLIBCXX_HOSTED to a number of headers which are
not currently installed for freestanding, but need to be for P1642R11
support. For example, <iterator> needs to be installed for C++23
freestanding mode, but without stream iterators and streambuf iterators.
Similarly, <memory> needs to be installed, but without std::allocator
and std::shared_ptr. This change disables the non-freestanding parts of
those headers.
libstdc++-v3/ChangeLog:
PR libstdc++/106953
* include/backward/auto_ptr.h [!_GLIBCXX_HOSTED]: Do not define
shared_ptr members.
* include/bits/alloc_traits.h [!_GLIBCXX_HOSTED]: Do not declare
std::allocator_traits<std::allocator<T>> specializations for
freestanding.
* include/bits/memoryfwd.h [!_GLIBCXX_HOSTED] (allocator): Do
not declare for freestanding.
* include/bits/stl_algo.h [!_GLIBCXX_HOSTED] (stable_partition):
Do not define for freestanding.
[!_GLIBCXX_HOSTED] (merge, stable_sort): Do not use temporary
buffers for freestanding.
* include/bits/stl_algobase.h [!_GLIBCXX_HOSTED]: Do not declare
streambuf iterators and overloaded algorithms using them.
* include/bits/stl_uninitialized.h [!_GLIBCXX_HOSTED]: Do not
define specialized overloads for std::allocator.
* include/bits/unique_ptr.h [!_GLIBCXX_HOSTED] (make_unique)
(make_unique_for_overwrite, operator<<): Do not define for
freestanding.
* include/c_global/cstdlib [!_GLIBCXX_HOSTED] (_Exit): Declare.
Use _GLIBCXX_NOTHROW instead of throw().
* include/debug/assertions.h [!_GLIBCXX_HOSTED]: Ignore
_GLIBCXX_DEBUG for freestanding.
* include/debug/debug.h [!_GLIBCXX_DEBUG]: Likewise.
* include/std/bit [!_GLIBCXX_HOSTED]: Do not use the custom
__int_traits if <ext/numeric_traits.h> is available.
* include/std/functional [!_GLIBCXX_HOSTED]: Do not include
headers that aren't valid for freestanding.
(boyer_moore_searcher, boyer_moore_horspool_searcher): Do not
define for freestanding.
* include/std/iterator [!_GLIBCXX_HOSTED]: Do not include
headers that aren't valid for freestanding.
* include/std/memory [!_GLIBCXX_HOSTED]: Likewise.
* include/std/ranges [!_GLIBCXX_HOSTED] (istream_view): Do not
define for freestanding.
(views::__detail::__is_basic_string_view) [!_GLIBCXX_HOSTED]:
Do not define partial specialization for freestanding.
The __alloc_swap and __shrink_to_fit_aux helpers are not specific to
std::allocator, so don't belong in <bits/allocator.h>. This also
simplifies enabling <memory> for freestanding, as now we can just omit
the whole of <bits/allocator.h> for freestanding.
libstdc++-v3/ChangeLog:
* include/bits/alloc_traits.h (__alloc_swap)
(__shrink_to_fit_aux): Move here, from ...
* include/bits/allocator.h: ... here.
* include/ext/alloc_traits.h: Do not include allocator.h.
This adds required headers to a few internal headers that currently
assume their deps will be included first. It's more robust to make them
include their own dependencies, so that later refactoring or reuse of
those headers in new contexts doesn't break.
libstdc++-v3/ChangeLog:
* include/bits/stl_algo.h: Include <bits/stl_algobase.h>.
* include/bits/stl_tempbuf.h: Include headers for __try and
__catch macros, std::pair, and __gnu_cxx::__numeric_traits.
* include/bits/stream_iterator.h: Include <iosfwd> and headers
for std::addressof and std::iterator.
* include/bits/streambuf_iterator.h: Include header for
std::iterator.
* include/std/iterator: Do not include <iosfwd>.
This test was written assuming that std::atomic_ref clears its target's
padding on construction, but that could introduce data races. Change the
test to store a value after construction and check that its padding is
cleared by the store.
libstdc++-v3/ChangeLog:
* testsuite/29_atomics/atomic_ref/compare_exchange_padding.cc:
Store value with non-zero padding bits after construction.
This patch permits accessing 'mutable' members of local objects during
constexpr evaluation, while continuing to reject it for global objects
(as in the last line of cpp0x/constexpr-mutable1.C). To distinguish
between the two cases, it looks like it suffices to just check
CONSTRUCTOR_MUTABLE_POSION in cxx_eval_component_reference before
deciding to reject a DECL_MUTABLE_P member access.
PR c++/92505
gcc/cp/ChangeLog:
* constexpr.cc (cxx_eval_component_reference): Check non_constant_p
sooner. In C++14 or later, reject a DECL_MUTABLE_P member access
only if CONSTRUCTOR_MUTABLE_POISION is also set.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/constexpr-mutable3.C: New test.
* g++.dg/cpp1y/constexpr-mutable1.C: New test.
The tr1/5_numerical_facilities/random/variate_generator/37986.cc test
fails for strict -std=c++98 mode because _Adaptor(const _Engine&) is
ill-formed in C++98 when _Engine is a reference type.
Rather than attempt to make the _Adaptor handle references and pointers,
just strip references and pointers from the _Engine type before we adapt
it. That removes the need for the _Adaptor<_Engine*> partial
specialization and avoids the reference-to-reference problem for c++98
mode.
While looking into this I noticed that the TR1 spec requires the
variate_generator<E,D>::engine_value_type to be the underlying engine
type, whereas we make it the _Adaptor<E> type that wraps the engine.
libstdc++-v3/ChangeLog:
* include/tr1/random.h (__detail::_Adaptor::_BEngine): Remove.
(__detail::_Adaptor::_M_g): Make public.
(__detail::_Adaptor<_Engine*, _Dist>): Remove partial
specialization.
(variate_generate::_Value): New helper to simplify handling of
_Engine* and _Engine& template arguments.
(variate_generate::engine_value_type): Define to underlying
engine type, not adapted type.
(variate_generate::engine()): Return underlying engine instead
of adaptor.
* testsuite/tr1/5_numerical_facilities/random/variate_generator/37986.cc:
Fix comment.
* testsuite/tr1/5_numerical_facilities/random/variate_generator/requirements/typedefs.cc:
Check member typedefs have the correct types.
This has to be valid as C++98/C++03.
libstdc++-v3/ChangeLog:
* include/debug/formatter.h [_GLIBCXX_DEBUG_BACKTRACE]
(_Error_formatter): Use 0 as null pointer constant.
This class template and partial specialization were added 15 years ago
to optimize allocator equality comparisons in std::list. I think it's
safe to assume that GCC is now capable of optimizing an inline
operator!= that just returns false at least as well as an inline member
function that just returns false.
libstdc++-v3/ChangeLog:
* include/bits/allocator.h (__alloc_neq): Remove.
* include/bits/stl_list.h (list::_M_check_equal_allocators):
Compare allocators directly, without __alloc_neq.
Remove the bogus -D__allocator_base=std::__new_allocator macro
definition for Doxygen, because that's an alias template for C++11 and
later, not a macro.
Fix the @cond/@endcond pair that span the end of an @addtogroup group.
Add another @endcond inside the group, and another @cond after it.
libstdc++-v3/ChangeLog:
* doc/doxygen/user.cfg.in (PREDEFINED): Remove __allocator_base.
* include/bits/allocator.h: Fix nesting of Doxygen commands.
this-f names a member function, which isn't an addressable lvalue. Give a
helpful error instead of crashing. The first hunk makes the error range
cover the whole expression.
PR c++/106858
gcc/cp/ChangeLog:
* parser.cc (cp_parser_omp_var_list_no_open): Pass the
initial token location down.
* semantics.cc (finish_omp_clauses): Check
invalid_nonstatic_memfn_p.
* typeck.cc (invalid_nonstatic_memfn_p): Handle null TREE_TYPE.
gcc/testsuite/ChangeLog:
* g++.dg/gomp/map-3.C: New test.
For ifloor/lfloor/iceil/lceil/irint/lrint/iround/lround when size of
in_mode is not equal out_mode, vectorizer doesn't go to internal fn
way,still left that part in the ix86_builtin_vectorized_function.
Remove others builtins and add corresponding expanders.
gcc/ChangeLog:
PR target/106910
* config/i386/i386-builtins.cc
(ix86_builtin_vectorized_function): Modernized with
corresponding expanders.
* config/i386/sse.md (lrint<mode><sseintvecmodelower>2): New
expander.
(floor<mode>2): Ditto.
(lfloor<mode><sseintvecmodelower>2): Ditto.
(ceil<mode>2): Ditto.
(lceil<mode><sseintvecmodelower>2): Ditto.
(btrunc<mode>2): Ditto.
(lround<mode><sseintvecmodelower>2): Ditto.
(exp2<mode>2): Ditto.
Previously <memory> included <bits/stl_algobase.h> so that std::copy,
std::fill etc. could be used by <bits/stl_uninitialized.h>. But that
includes it explicitly now, so that it can be compiled as a header unit.
There's no need to include it in <memory>, where its purpose isn't
obvious.
libstdc++-v3/ChangeLog:
* include/std/memory: Do not include <bits/stl_algobase.h>.
gcc/fortran/ChangeLog:
PR fortran/104314
* resolve.cc (deferred_op_assign): Do not try to generate temporary
for deferred character length assignment if types do not agree.
gcc/testsuite/ChangeLog:
PR fortran/104314
* gfortran.dg/pr104314.f90: New test.
Co-authored-by: Steven G. Kargl <kargl@gcc.gnu.org>
C2x has changed the rules for defining INFINITY in <float.h> so it is
no longer defined when float does not support infinities, instead of
being defined to an expression that overflows at translation time.
Thus, make the definition conditional on __FLT_HAS_INFINITY__ (this is
already inside a C2x-conditional part of <float.h>, because previous C
standard versions only had this macro in <math.h>).
Bootstrapped with no regressions for x86_64-pc-linux-gnu. Also did a
spot test of the case of no infinities supported by building cc1 for
vax-netbsdelf and testing compiling the new c2x-float-11.c test with
it.
gcc/
* ginclude/float.h (INFINITY): Define only if
[__FLT_HAS_INFINITY__].
gcc/testsuite/
* gcc.dg/c2x-float-2.c: Require inff effective-target.
* gcc.dg/c2x-float-11.c: New test.
Do not use the __tsan_mutex_not_static flag for annotation functions
where it's not a valid flag. Also use the try_lock and try_lock_failed
flags to more precisely annotate the CAS loop used to acquire a lock.
libstdc++-v3/ChangeLog:
* include/bits/shared_ptr_atomic.h (_GLIBCXX_TSAN_MUTEX_PRE_LOCK):
Replace with ...
(_GLIBCXX_TSAN_MUTEX_TRY_LOCK): ... this, add try_lock flag.
(_GLIBCXX_TSAN_MUTEX_TRY_LOCK_FAILED): New macro using
try_lock_failed flag
(_GLIBCXX_TSAN_MUTEX_POST_LOCK): Rename to ...
(_GLIBCXX_TSAN_MUTEX_LOCKED): ... this.
(_GLIBCXX_TSAN_MUTEX_PRE_UNLOCK): Remove invalid flag.
(_GLIBCXX_TSAN_MUTEX_POST_UNLOCK): Remove invalid flag.
(_Sp_atomic::_Atomic_count::lock): Use new macros.
Remove expressions for symbols in std::__detail::__8 namespace, they are obsolete since
version namespace applies only at std:: level, not at sub-levels.
libstdc++-v3/ChangeLog:
* config/abi/pre/gnu-versioned-namespace.ver: Remove obsolete std::__detail::__8
symbols.
PRE implicitely keeps virtual operands at the blocks incoming version
but the explicit updating point during PHI translation fails to trigger
when there are no PHIs at all in a block. Later lazy updating then
fails because of a too lose block check. A similar issues plagues
reference invalidation when checking the ANTIC_OUT to ANTIC_IN
translation. The following fixes both and makes the lazy updating
work.
The diagnostic testcase unfortunately requires boost so the
testcase is the one I reduced for a missed optimization in PRE.
The testcase fails with -m32 on x86_64 because we optimize too
much before PRE which causes PRE to not trigger so we fail to
eliminate a full redundancy. I'm going to open a separate bug
for this. Hopefully the !lp64 selector is good enough.
PR tree-optimization/106922
* tree-ssa-pre.cc (translate_vuse_through_block): Only
keep the VUSE if its def dominates PHIBLOCK.
(prune_clobbered_mems): Rewrite logic so we check whether
a value dies in a block when the VUSE def doesn't dominate it.
* g++.dg/tree-ssa/pr106922.C: New testcase.
All frontends replicate this, so move it.
gcc/
* tree.cc (build_common_tree_nodes): Initialize void_list_node
here.
gcc/ada/
* gcc-interface/trans.cc (gigi): Do not initialize void_list_node.
gcc/c-family/
* c-common.h (build_void_list_node): Remove.
* c-common.cc (c_common_nodes_and_builtins): Do not initialize
void_list_node.
gcc/c/
* c-decl.cc (build_void_list_node): Remove.
gcc/cp/
* decl.cc (cxx_init_decl_processing): Inline last
build_void_list_node call.
(build_void_list_node): Remove.
gcc/d/
* d-builtins.cc (d_build_c_type_nodes): Do not initialize
void_list_node.
gcc/fortran/
* f95-lang.cc (gfc_init_decl_processing): Do not initialize
void_list_node.
gcc/go/
* go-lang.cc (go_langhook_init): Do not initialize
void_list_node.
gcc/jit/
* dummy-frontend.cc (jit_langhook_init): Do not initialize
void_list_node.
gcc/lto/
* lto-lang.cc (lto_build_c_type_nodes): Do not initialize
void_list_node.
The expected scan dump output for this test will change after the
following patch is committed:
https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601558.html
But for now, this patch reverts to the old expected pattern so the test
passes. I will apply as obvious.
2022-09-15 Julian Brown <julian@codesourcery.com>
gcc/testsuite/
* c-c++-common/gomp/target-50.c: Modify scan pattern.
These testsuite hunks got left attached to the wrong patch in the series
I just posted. I will apply as obvious.
2022-09-15 Julian Brown <julian@codesourcery.com>
gcc/testsuite/
* c-c++-common/goacc/mdc-2.c: Update expected errors.
* g++.dg/goacc/mdc.C: Likewise.
Hi,
Test cases are updated/added, and code is refined as the comments in the
review for previous version:
https://gcc.gnu.org/pipermail/gcc-patches/2022-September/600768.html
As mentioned in PR106550, since pli could support 34bits immediate, we could
use less instructions(3insn would be ok) to build 64bits constant with pli.
For example, for constant 0x020805006106003, we could generate it with:
asm code1:
pli 9,101736451 (0x6106003)
sldi 9,9,32
paddi 9,9, 2130000 (0x0208050)
or asm code2:
pli 10, 2130000
pli 9, 101736451
rldimi 9, 10, 32, 0
The asm code2 would be better.
This patch generates the asm code2 in split1 pass, this patch also supports
to generate asm code1 when splitter is only after RA.
This patch pass boostrap and regtest on ppc64. P10 testing is running.
Thanks for any comments!
BR,
Jeff(Jiufu)
PR target/106550
gcc/ChangeLog:
* config/rs6000/rs6000.cc (rs6000_emit_set_long_const): Use pli.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/pr106550.c: New test.
* gcc.target/powerpc/pr106550_1.c: New test.
This adds annotations to std::atomic<shared_ptr<T>> to enable TSan to
understand the custom locking. Without this, TSan reports data races for
accesses to the _M_ptr member, even though those are correctly
synchronized using atomic operations on the tagged pointer.
libstdc++-v3/ChangeLog:
* include/bits/shared_ptr_atomic.h (_GLIBCXX_TSAN_MUTEX_DESTROY)
(_GLIBCXX_TSAN_MUTEX_PRE_LOCK, _GLIBCXX_TSAN_MUTEX_POST_LOCK)
(_GLIBCXX_TSAN_MUTEX_PRE_UNLOCK, _GLIBCXX_TSAN_MUTEX_POST_UNLOCK)
(_GLIBCXX_TSAN_MUTEX_PRE_SIGNAL, _GLIBCXX_TSAN_MUTEX_POST_SIGNAL):
Define macros for TSan annotation functions.
(_Sp_atomic::_Atomic_count): Add annotations.
This is needed for std::nothrow and the nothrow operator new overload,
so should be included explicitly.
libstdc++-v3/ChangeLog:
* include/bits/stl_tempbuf.h: Include <new>.
Without this assertion, the shared state is made ready, but contains
neither a value nor an exception. Add an assertion to prevent users from
accessing a value that was never initialized in the shared state.
libstdc++-v3/ChangeLog:
* include/std/future
(_State_baseV2::__setter(exception_ptr&, promise&)): Add
assertion for LWG 2276 precondition.
* testsuite/30_threads/promise/members/set_exception_neg.cc:
New test.
To display (o-,i-)stringstreams in the common case, we just print the
underlying stringbuf, without the many ios_base members. In the
unconventional case that the underlying streambuf was redirected, we
report the redirected target.
Signed-off-by: Philipp Fent <fent@in.tum.de>
libstdc++-v3/ChangeLog:
* python/libstdcxx/v6/printers.py (access_streambuf_ptrs):
New helper function.
(StdStringBufPrinter, StdStringStreamPrinter): New printers.
(build_libstdcxx_dictionary): Register stringstream printers.
* testsuite/libstdc++-prettyprinters/debug.cc: Check string
streams.
* testsuite/libstdc++-prettyprinters/simple.cc: Likewise.
* testsuite/libstdc++-prettyprinters/simple11.cc: Likewise.
Every time there's equality at play, we must be careful that any
equality with zero matches both -0.0 and +0.0 when honoring signed
zeros.
We were doing this correctly for the == and != op1_range operators
(albeit inefficiently), but aren't doing it at all when building >=
and <=. This fixes the oversight.
There is change in functionality here for the build_* functions.
This is the last "simple" patch I submit before overhauling NAN and
sign tracking. And that will likely be after Cauldron because it will need
further testing (lapack, ppc64le, -ffinite-math-only, etc).
Regstrapped on x86-64 Linux, plus I ran selftests for
-ffinite-math-only.
gcc/ChangeLog:
* range-op-float.cc (frange_add_zeros): New.
(build_le): Call frange_add_zeros.
(build_ge): Same.
(foperator_equal::op1_range): Same.
(foperator_not_equal::op1_range): Same.
The build_<relop> helper functions in range-op-float.cc take the
actual value from the operand's endpoint, but this value could be
deduced from the operand itself therefore cleaning up the call site.
This also reduces the potential of mistakenly passing the wrong bound.
No functional changes.
Regstrapped on x86-64 Linux, plus I ran selftests for
-ffinite-math-only.
gcc/ChangeLog:
* range-op-float.cc (build_le): Accept frange instead of number.
(build_lt): Same.
(build_ge): Same.
(build_gt): Same.
(foperator_lt::op1_range): Pass full range to build_*.
(foperator_lt::op2_range): Same.
(foperator_le::op1_range): Same.
(foperator_le::op2_range): Same.
(foperator_gt::op1_range): Same.
(foperator_gt::op2_range): Same.
(foperator_ge::op1_range): Same.
(foperator_ge::op2_range): Same.
This patch cleans up the frange::set() code by passing all things NAN
to frange::set_nan().
No functional changes.
Regstrapped on x86-64 Linux, plus I ran selftests for
-ffinite-math-only.
gcc/ChangeLog:
* value-range.cc (frange::set): Use set_nan.
* value-range.h (frange::set_nan): Inline code originally in
set().
set_* has a very specific meaning for irange's and friends. Methods
prefixed with set_* are setters clobbering the existing range. As
such, the current set_nan() method is confusing in that it's not
actually setting a range to a NAN, but twiddling the NAN flags for an
existing frange.
This patch replaces set_nan() with an update_nan() to set the flag,
and clear_nan() to clear it. This makes the code clearer, and though
the confusing tristate is still there, it will be removed in upcoming
patches.
Also, there is now an actual set_nan() method to set the range to a
NAN. This replaces two out of class functions doing the same thing.
In future patches I will also add the ability to create a NAN with a
specific sign, but doing so now would be confusing because we're not
tracking NAN signs.
We should also submit set_signbit to the same fate, but it's about to
get removed.
No functional changes.
Regstrapped on x86-64 Linux, plus I ran selftests for
-ffinite-math-only.
gcc/ChangeLog:
* range-op-float.cc (frange_set_nan): Remove.
(build_lt): Use set_nan, update_nan, clear_nan.
(build_gt): Same.
(foperator_equal::op1_range): Same.
(foperator_not_equal::op1_range): Same.
(foperator_lt::op1_range): Same.
(foperator_lt::op2_range): Same.
(foperator_le::op1_range): Same.
(foperator_le::op2_range): Same.
(foperator_gt::op1_range): Same.
(foperator_gt::op2_range): Same.
(foperator_ge::op1_range): Same.
(foperator_ge::op2_range): Same.
(foperator_unordered::op1_range): Same.
(foperator_ordered::op1_range): Same.
* value-query.cc (range_query::get_tree_range): Same.
* value-range.cc (frange::set_nan): Same.
(frange::update_nan): Same.
(frange::union_): Same.
(frange::intersect): Same.
(range_tests_nan): Same.
(range_tests_signed_zeros): Same.
(range_tests_signbit): Same.
(range_tests_floats): Same.
* value-range.h (class frange): Add update_nan and clear_nan.
(frange::set_nan): New.
Following are a series of cleanups to the frange code in preparation
for a much more invasive patch rewriting the NAN and sign tracking
bits. Please be patient, as I'm trying to break everything up into
small chunks instead of dropping a mondo patch removing the NAN and
sign tristate handling.
No functional changes.
Regstrapped on x86-64 Linux, plus I ran selftests for
-ffinite-math-only.
gcc/ChangeLog:
* value-query.cc (range_query::get_tree_range): Remove check for overflow.
* value-range-pretty-print.cc (vrange_printer::visit): Move read
of type until after undefined_p is checked.
* value-range.cc (frange::set): Remove asserts for REAL_CST.
(frange::contains_p): Tidy up.
(range_tests_nan): Add comment.
* value-range.h (frange::type): Check for undefined_p.
(frange::set_undefined): Remove set of endpoints.
This patch adjusts OpenMP/OpenACC clause list handling in a couple of
places, in preparation for the following mapping-clause expansion rework
patch. Firstly mapping groups are removed as a whole in the C and C++
front-ends when an error is detected, which avoids leaving "badly-formed"
mapping clause groups in the list.
Secondly, reindexing of the omp_mapping_group hashmap (during
omp_build_struct_sibling_lists) has been reimplemented, fixing some
tricky corner-cases where mapping groups are removed from a list at the
same time as it is being reordered.
Thirdly, code to check if a different clause on the same directive maps
the whole of a struct that we have a component mapping for (for example)
has been outlined, removing a bit of code duplication.
2022-09-13 Julian Brown <julian@codesourcery.com>
gcc/
* gimplify.cc (omp_group_last): Allow GOMP_MAP_ATTACH_DETACH after
GOMP_MAP_STRUCT (for reindexing).
(omp_gather_mapping_groups): Reimplement using...
(omp_gather_mapping_groups_1): This new function. Stop processing at
GATHER_SENTINEL.
(omp_group_base): Allow GOMP_MAP_TO_PSET without any following node.
(omp_index_mapping_groups): Reimplement using...
(omp_index_mapping_groups_1): This new function. Handle
REINDEX_SENTINEL.
(omp_reindex_mapping_groups, omp_mapped_by_containing_struct): New
functions.
(omp_tsort_mapping_groups_1): Adjust handling of base group being the
same as current group. Use omp_mapped_by_containing_struct.
(omp_build_struct_sibling_lists): Use omp_mapped_by_containing_struct
and omp_reindex_mapping_groups. Robustify group deletion for reordered
lists.
(gimplify_scan_omp_clauses): Update calls to
omp_build_struct_sibling_lists.
gcc/c/
* c-typeck.cc (c_finish_omp_clauses): Remove whole mapping node group
on error.
gcc/cp/
* semantics.cc (finish_omp_clauses): Likewise.
This patch refactors struct sibling-list processing in gimplify.cc, and
adjusts some related mapping-clause processing in the Fortran FE and
omp-low.cc accordingly.
2022-09-13 Julian Brown <julian@codesourcery.com>
gcc/fortran/
* trans-openmp.cc (gfc_trans_omp_clauses): Don't create
GOMP_MAP_TO_PSET mappings for class metadata, nor GOMP_MAP_POINTER
mappings for POINTER_TYPE_P decls.
gcc/
* gimplify.cc (gimplify_omp_var_data): Remove GOVD_MAP_HAS_ATTACHMENTS.
(GOMP_FIRSTPRIVATE_IMPLICIT): Renumber.
(insert_struct_comp_map): Refactor function into...
(build_omp_struct_comp_nodes): This new function. Remove list handling
and improve self-documentation.
(extract_base_bit_offset): Remove BASE_REF, OFFSETP parameters. Move
code to strip outer parts of address out of function, but strip no-op
conversions.
(omp_mapping_group): Add DELETED field for use during reindexing.
(omp_strip_components_and_deref, omp_strip_indirections): New functions.
(omp_group_last, omp_group_base): Add GOMP_MAP_STRUCT handling.
(omp_gather_mapping_groups): Initialise DELETED field for new groups.
(omp_index_mapping_groups): Notice DELETED groups when (re)indexing.
(omp_siblist_insert_node_after, omp_siblist_move_node_after,
omp_siblist_move_nodes_after, omp_siblist_move_concat_nodes_after): New
helper functions.
(omp_accumulate_sibling_list): New function to build up GOMP_MAP_STRUCT
node groups for sibling lists. Outlined from gimplify_scan_omp_clauses.
(omp_build_struct_sibling_lists): New function.
(gimplify_scan_omp_clauses): Remove struct_map_to_clause,
struct_seen_clause, struct_deref_set. Call
omp_build_struct_sibling_lists as pre-pass instead of handling sibling
lists in the function's main processing loop.
(gimplify_adjust_omp_clauses_1): Remove GOVD_MAP_HAS_ATTACHMENTS
handling, unused now.
* omp-low.cc (scan_sharing_clauses): Handle pointer-type indirect
struct references, and references to pointers to structs also.
gcc/testsuite/
* g++.dg/goacc/member-array-acc.C: New test.
* g++.dg/gomp/member-array-omp.C: New test.
* g++.dg/gomp/target-3.C: Update expected output.
* g++.dg/gomp/target-lambda-1.C: Likewise.
* g++.dg/gomp/target-this-2.C: Likewise.
* c-c++-common/goacc/deep-copy-arrayofstruct.c: Move test from here.
* c-c++-common/gomp/target-50.c: New test.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/deep-copy-15.c: New test.
* testsuite/libgomp.oacc-c-c++-common/deep-copy-16.c: New test.
* testsuite/libgomp.oacc-c++/deep-copy-17.C: New test.
* testsuite/libgomp.oacc-c-c++-common/deep-copy-arrayofstruct.c: Move
test to here, make "run" test.
After inlining and IPA transforms we run fixup_cfg to fixup CFG
effects in other functions. But that fails to clean abnormal
edges from non-pure/const calls which might no longer be necessary
when ->calls_setjmp is false. The following ensures this happens
and refactors things so we call EH/abnormal cleanup only on the
last stmt in a block.
PR tree-optimization/106938
* tree-cfg.cc (execute_fixup_cfg): Purge dead abnormal
edges for all last stmts in a block. Do EH cleanup
only on the last stmt in a block.
* gcc.dg/pr106938.c: New testcase.
This assert was put here to make sure that the legacy
get_value_range() wasn't being called on stuff that legacy couldn't
handle (floats, etc), because the result would ultimately be copied
into a value_range_equiv.
In this case, simplify_casted_cond() is calling it on an offset_type
which is neither an integer nor a pointer. However, range_of_expr
happily punted on it, and then the fallthru code set the range to
VARYING. As value_range_equiv can store VARYING types of anything
(including types it can't handle), this is fine.
The easiest thing to do is remove the assert. If someone from the non
legacy world tries to get a non integer/pointer range here, it's going
to blow up anyhow because the temporary in get_value_range is
int_range_max.
PR tree-optimization/106936
gcc/ChangeLog:
* value-query.cc (range_query::get_value_range): Remove assert.
gcc/testsuite/ChangeLog:
* g++.dg/tree-ssa/pr106936.C: New test.
This patch reimplements the omp_target_reorder_clauses function in
anticipation of supporting "deeper" struct mappings (that is, with
several structure dereference operators, or similar).
The idea is that in place of the (possibly quadratic) algorithm in
omp_target_reorder_clauses that greedily moves clauses containing
addresses that are subexpressions of other addresses before those other
addresses, we employ a topological sort algorithm to calculate a proper
order for map clauses. This should run in linear time, and hopefully
handles degenerate cases where multiple "levels" of indirect accesses
are present on a given directive.
The new method also takes care to keep clause groups together, addressing
the concerns raised in:
https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570501.html
To figure out if some given clause depends on a base pointer in another
clause, we strip off the outer layers of the address expression, and check
(via a tree_operand_hash hash table we have built) if the result is a
"base pointer" as defined in OpenMP 5.0 (1.2.6 Data Terminology). There
are some subtleties involved, however:
- We must treat MEM_REF with zero offset the same as INDIRECT_REF.
This should probably be fixed in the front ends instead so we always
use a canonical form (probably INDIRECT_REF). The following patch
shows one instance of the problem, but there may be others:
https://gcc.gnu.org/pipermail/gcc-patches/2021-May/571382.html
- Mapping a whole struct implies mapping each of that struct's
elements, which may be base pointers. Because those base pointers
aren't necessarily explicitly referenced in the directive in question,
we treat the whole-struct mapping as a dependency instead.
2022-09-13 Julian Brown <julian@codesourcery.com>
gcc/
* gimplify.cc (is_or_contains_p, omp_target_reorder_clauses): Delete
functions.
(omp_tsort_mark): Add enum.
(omp_mapping_group): Add struct.
(debug_mapping_group, omp_get_base_pointer, omp_get_attachment,
omp_group_last, omp_gather_mapping_groups, omp_group_base,
omp_index_mapping_groups, omp_containing_struct,
omp_tsort_mapping_groups_1, omp_tsort_mapping_groups,
omp_segregate_mapping_groups, omp_reorder_mapping_groups): New
functions.
(gimplify_scan_omp_clauses): Call above functions instead of
omp_target_reorder_clauses, unless we've seen an error.
* omp-low.cc (scan_sharing_clauses): Avoid strict test if we haven't
sorted mapping groups.
gcc/testsuite/
* g++.dg/gomp/target-lambda-1.C: Adjust expected output.
* g++.dg/gomp/target-this-3.C: Likewise.
* g++.dg/gomp/target-this-4.C: Likewise.
My change to match.pd (that added the two simplifications this patch
touches) results in more |/^/& assignments with pointer arguments,
but since r12-1608 we reject pointer operands for BIT_NOT_EXPR.
Disallowing them for BIT_NOT_EXPR and allowing for BIT_{IOR,XOR,AND}_EXPR
leads to a match.pd maintainance nightmare (see one of the patches in the
PR), so either we want to allow pointer operand on BIT_NOT_EXPR (but then
we run into issues e.g. with the ranger which expects it can emulate
BIT_NOT_EXPR ~X as - 1 - X which doesn't work for pointers which don't
support MINUS_EXPR), or the following patch disallows pointer arguments
for all of BIT_{IOR,XOR,AND}_EXPR with the exception of BIT_AND_EXPR
with INTEGER_CST last operand (for simpler pointer realignment).
I had to tweak one reassoc optimization and the two match.pd
simplifications.
2022-09-14 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/106878
* tree-cfg.cc (verify_gimple_assign_binary): Disallow pointer,
reference or OFFSET_TYPE BIT_IOR_EXPR, BIT_XOR_EXPR or, unless
the second argument is INTEGER_CST, BIT_AND_EXPR.
* match.pd ((type) X op CST -> (type) (X op ((type-x) CST)),
(type) (((type2) X) op Y) -> (X op (type) Y)): Punt for
POINTER_TYPE_P or OFFSET_TYPE.
* tree-ssa-reassoc.cc (optimize_range_tests_cmp_bitwise): For
pointers cast them to pointer sized integers first.
* gcc.c-torture/compile/pr106878.c: New test.
The following avoids creating BIT_FIELD_REF of bitfields in
update-address-taken. The patch doesn't implement punning to
a full precision integer type but leaves a comment according to
that.
PR tree-optimization/106934
* tree-ssa.cc (non_rewritable_mem_ref_base): Avoid BIT_FIELD_REFs
of bitfields.
(maybe_rewrite_mem_ref_base): Likewise.
* gfortran.dg/pr106934.f90: New testcase.