mirror/gcc

mirror of git://gcc.gnu.org/git/gcc.git synced 2025-03-20 03:20:25 +08:00

Author	SHA1	Message	Date
Jakub Jelinek	3d6bb83202	phiopt: Improve value_replacement maybe equal phires range handling My previous patch added throwing away of SSA_NAME_RANGE_INFO of phires when we have phires = x != carg ? x : oarg, but that could throw away useful range info, all we need is merge phires current range info with the carg constant which can newly appear there (and the optimization proved the single user doesn't care about that). 2022-12-23 Jakub Jelinek <jakub@redhat.com> Aldy Hernandez <aldyh@redhat.com> * tree-ssa-phiopt.cc (value_replacement): Instead of resetting phires range info, union it with carg.	2022-12-23 16:19:08 +01:00
Jakub Jelinek	fd1b0aefda	tree-ssa-dom: can_infer_simple_equiv fixes [PR108068] As reported in the PR, tree-ssa-dom.cc uses real_zerop call to find if a floating point constant is zero and it shouldn't try to infer equivalences from comparison against it if signed zeros are honored. This doesn't work at all for decimal types, because real_zerop always returns false for them (one can have different representations of decimal zero beyond -0/+0), and it doesn't work for vector compares either, as real_zerop checks if all elements are zero, while we need to avoid inferring equivalences from comparison against vector constants which have at least one zero element in it (if signed zeros are honored). Furthermore, as mentioned by Joseph, for decimal types many other values aren't singleton. So, this patch stops inferring anything if element mode is decimal, and otherwise uses instead of real_zerop a new function, real_maybe_zerop, which will work even for decimal types and for complex or vector will return true if any element is or might be zero (so it returns true for anything but constants for now). 2022-12-23 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/108068 * tree.h (real_maybe_zerop): Declare. * tree.cc (real_maybe_zerop): Define. * tree-ssa-dom.cc (record_edge_info): Use it instead of real_zerop or TREE_CODE (op1) == SSA_NAME || real_zerop. Always set can_infer_simple_equiv to false for decimal floating point types. * gcc.dg/dfp/pr108068.c: New test.	2022-12-23 16:12:21 +01:00
Patrick Palka	bd1fc4a219	c++: template friend with variadic constraints [PR107853] When instantiating a constrained hidden template friend, we substitute into its template-head requirements in tsubst_friend_function. For this substitution we use the template's full argument vector whose outer levels correspond to the instantiated class's arguments and innermost level corresponds to the template's own level-lowered generic arguments. But for A<int>::f here, for which the relevant argument vector is {{int}, {Us...}}, the substitution into (C<Ts, Us> && ...) triggers the assert in use_pack_expansion_extra_args_p since one argument is a pack expansion and the other isn't. And for A<int, int>::f, for which the relevant argument vector is {{int, int}, {Us...}}, the use_pack_expansion_extra_args_p assert would also trigger but we first get a bogus "mismatched argument pack lengths" error from tsubst_pack_expansion. Sidestepping the question of whether tsubst_pack_expansion should be able to handle such substitutions, it seems we can work around this by using only the instantiated class's arguments and not also the template friend's own generic arguments, which is consistent with how we normally substitute into the signature of a member template. PR c++/107853 gcc/cp/ChangeLog: * constraint.cc (maybe_substitute_reqs_for): Substitute into the template-head requirements of a template friend using only its outer arguments via outer_template_args. * cp-tree.h (outer_template_args): Declare. * pt.cc (outer_template_args): Define, factored out and generalized from ... (ctor_deduction_guides_for): ... here. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/concepts-friend12.C: New test. * g++.dg/cpp2a/concepts-friend13.C: New test.	2022-12-23 09:18:37 -05:00
Jonathan Wakely	b358521b60	libstdc++: Fix Darwin bootstrap error in src/c++20/tzdb.cc Mach-O requires weak symbols to have a definition, so add a default implementation of __gnu_cxx::zoneinfo_dir_override. libstdc++-v3/ChangeLog: * src/c++20/tzdb.cc [__APPLE__] (zoneinfo_dir_override): Add definition.	2022-12-23 13:45:30 +00:00
Julian Brown	1e7d2b2d22	Fortran: Typo/unicode-o fixes This patch fixes a minor typo in dump output and a stray unicode character in a comment. 2022-06-01 Julian Brown <julian@codesourcery.com> gcc/fortran/ * dump-parse-tree.cc (show_attr): Fix OMP-UDR-ARTIFICIAL-VAR typo. * trans-openmp.cc (gfc_trans_omp_array_section): Replace stray unicode m-dash character with hyphen.	2022-12-23 10:50:33 +00:00
Roger Sayle	0b2c1369d0	PR target/107548: Handle vec_select in STV on x86. This patch enhances x86's STV pass to handle VEC_SELECT during general scalar chain conversion, performing SImode scalar extraction from V4SI and DImode scalar extraction from V2DI in vector registers. The motivating test case from bugzilla is: typedef unsigned int v4si __attribute__((vector_size(16))); unsigned int f (v4si a, v4si b) { a[0] += b[0]; return a[0] + a[1]; } currently with -O2 -march=znver2 this generates: vpextrd $1, %xmm0, %edx vmovd %xmm0, %eax addl %edx, %eax vmovd %xmm1, %edx addl %edx, %eax ret which performs three transfers from the vector unit to the scalar unit, and performs the two additions there. With this patch, we now generate: vmovdqa %xmm0, %xmm2 vpshufd $85, %xmm0, %xmm0 vpaddd %xmm0, %xmm2, %xmm0 vpaddd %xmm1, %xmm0, %xmm0 vmovd %xmm0, %eax ret which performs the two additions in the vector unit, and then transfers the result to the scalar unit. Technically the (cheap) movdqa isn't needed with better register allocation (or this could be cleaned up during peephole2), but even so this transform is still a win. 2022-12-23 Roger Sayle <roger@nextmovesoftware.com> gcc/ChangeLog PR target/107548 * config/i386/i386-features.cc (scalar_chain::add_insn): The operands of a VEC_SELECT don't need to added to the scalar chain. (general_scalar_chain::compute_convert_gain) <case VEC_SELECT>: Provide gains for performing STV on a VEC_SELECT. (general_scalar_chain::convert_insn): Convert VEC_SELECT to pshufd, psrldq or no-op. (general_scalar_to_vector_candidate_p): Handle VEC_SELECT of a single element from a vector register to a scalar register. gcc/testsuite/ChangeLog PR target/107548 * gcc.target/i386/pr107548-1.c: New test V4SI case. * gcc.target/i386/pr107548-2.c: New test V2DI case.	2022-12-23 09:58:13 +00:00
Roger Sayle	24a7980d0f	PR target/106933: Limit TImode STV to SSA-like def-use chains on x86. With many thanks to H.J. for doing all the hard work, this patch resolves two P1 regressions; PR target/106933 and PR target/106959. Although superficially similar, the i386 backend's two scalar-to-vector (STV) passes perform their transformations in importantly different ways. The original pass converting SImode and DImode operations to V4SImode or V2DImode operations is "soft", allowing values to be maintained in both integer and vector hard registers. The newer pass converting TImode operations to V1TImode is "hard" (all or nothing) that converts all uses of a pseudo to vector form. To implement this it invokes powerful ju-ju calling SET_MODE on a reg_rtx, which due to RTL sharing, often updates this pseudo's mode everywhere in the RTL chain. Hence, TImode STV can only be performed when all uses of a pseudo are convertible to V1TImode form. To ensure this the STV passes currently use data-flow analysis to inspect all DEFs and USEs in a chain. This works fine for chains that are in the usual single assignment form, but the occurrence of uninitialized variables, or multiple assignments that split a pseudo's usage into several independent chains (lifetimes) can lead to situations where some but not all of a pseudo's occurrences need to be updated. This is safe for the SImode/DImode pass, but leads to the above bugs during the TImode pass. My one minor tweak to HJ's patch from comment #4 of bugzilla PR106959 is to only perform the new single_def_chain_p check for TImode STV; it turns out that STV of SImode/DImode min/max operates safely on multiple-def chains, and prohibiting this leads to testsuite regressions. We don't (yet) support V1TImode min/max, so this idiom isn't an issue during the TImode STV pass. 
For the record, the two alternate possible fixes are (i) make the TImode STV pass "soft", by eliminating use of SET_MODE, instead using replace_rtx with a new pseudo, or (ii) merging "chains" so that multiple DFA chains/lifetimes are considered a single STV chain. 2022-12-23 H.J. Lu <hjl.tools@gmail.com> Roger Sayle <roger@nextmovesoftware.com> gcc/ChangeLog PR target/106933 PR target/106959 * config/i386/i386-features.cc (single_def_chain_p): New predicate function to check that a pseudo's use-def chain is in SSA form. (timode_scalar_to_vector_candidate_p): Check that TImode regs that are SET_DEST or SET_SRC of an insn match/are single_def_chain_p. gcc/testsuite/ChangeLog PR target/106933 PR target/106959 * gcc.target/i386/pr106933-1.c: New test case. * gcc.target/i386/pr106933-2.c: Likewise. * gcc.target/i386/pr106959-1.c: Likewise. * gcc.target/i386/pr106959-2.c: Likewise. * gcc.target/i386/pr106959-3.c: Likewise.	2022-12-23 09:50:18 +00:00
Jonathan Wakely	db3c5831f8	libstdc++: Remove problematic static_assert from src/c++20/tzdb.cc This assertion fails for cris-elf where sizeof(datetime) is only 7, due to lower alignment requirements. The assertion was used while I was writing the code to check that the objects were as compact as I wanted, but it doesn't need to be kept now. libstdc++-v3/ChangeLog: * src/c++20/tzdb.cc: Remove static_assert.	2022-12-23 09:21:47 +00:00
Iain Sandoe	a846817739	c++, driver: Fix -static-libstdc++ for targets without Bstatic/dynamic. The current implementation for swapping between the static and shared c++ runtimes relies on the static linker supporting Bstatic/dynamic which is not available for every target (Darwin's linker does not support this). Specs substitution (%s) is an alternative solution for this (which is what Darwin uses for Fortran, D and Objective-C). However, specs substitution requires that the '-static-libstdc++' be preserved in the driver's command line. The patch here arranges for this to be done when the configuration determines that linker support for Bstatic/dynamic is missing. Signed-off-by: Iain Sandoe <iain@sandoe.co.uk> gcc/cp/ChangeLog: * g++spec.cc (lang_specific_driver): Preserve -static-libstdc++ in the driver command line for targets without -Bstatic/dynamic support in their static linker.	2022-12-23 08:53:17 +00:00
Ju-Zhe Zhong	16eb1f43ab	RISC-V: Fix vle constraints gcc/ChangeLog: * config/riscv/vector.md: Fix constraints. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/vle-constraint-1.c: New test.	2022-12-23 13:42:04 +08:00
Ju-Zhe Zhong	a143c3f7a6	RISC-V: Support vle.v/vse.v intrinsics gcc/ChangeLog: * config/riscv/riscv-protos.h (get_avl_type_rtx): New function. * config/riscv/riscv-v.cc (get_avl_type_rtx): Ditto. * config/riscv/riscv-vector-builtins-bases.cc (class loadstore): New class. (BASE): Ditto. * config/riscv/riscv-vector-builtins-bases.h: Ditto. * config/riscv/riscv-vector-builtins-functions.def (vle): Ditto. (vse): Ditto. * config/riscv/riscv-vector-builtins-shapes.cc (build_one): Ditto. (struct loadstore_def): Ditto. (SHAPE): Ditto. * config/riscv/riscv-vector-builtins-shapes.h: Ditto. * config/riscv/riscv-vector-builtins-types.def (DEF_RVV_U_OPS): New macro. (DEF_RVV_F_OPS): Ditto. (vuint8mf8_t): Add corresponding mask type. (vuint8mf4_t): Ditto. (vuint8mf2_t): Ditto. (vuint8m1_t): Ditto. (vuint8m2_t): Ditto. (vuint8m4_t): Ditto. (vuint8m8_t): Ditto. (vuint16mf4_t): Ditto. (vuint16mf2_t): Ditto. (vuint16m1_t): Ditto. (vuint16m2_t): Ditto. (vuint16m4_t): Ditto. (vuint16m8_t): Ditto. (vuint32mf2_t): Ditto. (vuint32m1_t): Ditto. (vuint32m2_t): Ditto. (vuint32m4_t): Ditto. (vuint32m8_t): Ditto. (vuint64m1_t): Ditto. (vuint64m2_t): Ditto. (vuint64m4_t): Ditto. (vuint64m8_t): Ditto. (vfloat32mf2_t): Ditto. (vfloat32m1_t): Ditto. (vfloat32m2_t): Ditto. (vfloat32m4_t): Ditto. (vfloat32m8_t): Ditto. (vfloat64m1_t): Ditto. (vfloat64m2_t): Ditto. (vfloat64m4_t): Ditto. (vfloat64m8_t): Ditto. * config/riscv/riscv-vector-builtins.cc (DEF_RVV_TYPE): Adjust for new macro. (DEF_RVV_I_OPS): Ditto. (DEF_RVV_U_OPS): New macro. (DEF_RVV_F_OPS): New macro. (use_real_mask_p): New function. (use_real_merge_p): Ditto. (get_tail_policy_for_pred): Ditto. (get_mask_policy_for_pred): Ditto. (function_builder::apply_predication): Ditto. (function_builder::append_base_name): Ditto. (function_builder::append_sew): Ditto. (function_expander::add_vundef_operand): Ditto. (function_expander::add_mem_operand): Ditto. (function_expander::use_contiguous_load_insn): Ditto. 
(function_expander::use_contiguous_store_insn): Ditto. * config/riscv/riscv-vector-builtins.def (DEF_RVV_TYPE): Adjust for adding mask type. (vbool64_t): Ditto. (vbool32_t): Ditto. (vbool16_t): Ditto. (vbool8_t): Ditto. (vbool4_t): Ditto. (vbool2_t): Ditto. (vbool1_t): Ditto. (vint8mf8_t): Ditto. (vint8mf4_t): Ditto. (vint8mf2_t): Ditto. (vint8m1_t): Ditto. (vint8m2_t): Ditto. (vint8m4_t): Ditto. (vint8m8_t): Ditto. (vint16mf4_t): Ditto. (vint16mf2_t): Ditto. (vint16m1_t): Ditto. (vint16m2_t): Ditto. (vint16m4_t): Ditto. (vint16m8_t): Ditto. (vint32mf2_t): Ditto. (vint32m1_t): Ditto. (vint32m2_t): Ditto. (vint32m4_t): Ditto. (vint32m8_t): Ditto. (vint64m1_t): Ditto. (vint64m2_t): Ditto. (vint64m4_t): Ditto. (vint64m8_t): Ditto. (vfloat32mf2_t): Ditto. (vfloat32m1_t): Ditto. (vfloat32m2_t): Ditto. (vfloat32m4_t): Ditto. (vfloat32m8_t): Ditto. (vfloat64m1_t): Ditto. (vfloat64m4_t): Ditto. * config/riscv/riscv-vector-builtins.h (function_expander::add_output_operand): New function. (function_expander::add_all_one_mask_operand): Ditto. (function_expander::add_fixed_operand): Ditto. (function_expander::vector_mode): Ditto. (function_base::apply_vl_p): Ditto. (function_base::can_be_overloaded_p): Ditto. * config/riscv/riscv-vsetvl.cc (get_vl): Remove restrict of supporting AVL is not VLMAX. * config/riscv/t-riscv: Add include file.	2022-12-23 13:41:34 +08:00
Ju-Zhe Zhong	55d65ad4fd	RISC-V: Update vsetvl/vsetvlmax intrinsics to the latest api name. gcc/ChangeLog: * config/riscv/riscv-vector-builtins-shapes.cc (struct vsetvl_def): Add "__riscv_" prefix. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/vsetvl-1.c: Add "__riscv_" prefix.	2022-12-23 13:41:26 +08:00
Ju-Zhe Zhong	b47b33c799	RISC-V: Remove side effects of vsetvl pattern in RTL. gcc/ChangeLog: * config/riscv/riscv-vector-builtins-bases.cc: Change it to no side effects. * config/riscv/vector.md (@vsetvl<mode>_no_side_effects): New pattern.	2022-12-23 13:41:23 +08:00
Ju-Zhe Zhong	37fd10fd3e	RISC-V: Remove side effects of vsetvl/vsetvlmax intrinsics in properties gcc/ChangeLog: * config/riscv/riscv-vector-builtins-bases.cc: Remove side effects.	2022-12-23 13:41:20 +08:00
Ju-Zhe Zhong	9374f766a7	RISC-V: Fix incorrect annotation gcc/ChangeLog: * config/riscv/riscv-vsetvl.cc (anticipatable_occurrence_p): Fix incorrect annotations. (available_occurrence_p): Ditto. (backward_propagate_worthwhile_p): Ditto. (can_backward_propagate_p): Ditto.	2022-12-23 13:41:11 +08:00
Ju-Zhe Zhong	85112fbbfd	RISC-V: Fix multi-line condition format gcc/ChangeLog: * config/riscv/riscv-vsetvl.cc (vlmax_avl_insn_p): Fix multi-line conditional. (vsetvl_insn_p): Ditto. (same_bb_and_before_p): Ditto. (same_bb_and_after_or_equal_p): Ditto.	2022-12-23 13:40:56 +08:00
Steve Kargl	7e76cd9695	Remove not needed assert macro which fails. PR fortran/106731 gcc/fortran/ChangeLog: * trans-array.cc (gfc_trans_auto_array_allocation): Remove gcc_assert (!TREE_STATIC()). gcc/testsuite/ChangeLog: * gfortran.dg/pr106731.f90: New test.	2022-12-22 21:19:39 -08:00
Arsen Arsenović	8ec5fcb6fc	libstdc++: Improve output of default contract violation handler [PR107792] Make the output more readable. Don't output anything unless verbose termination is enabled at configure-time. The testsuite change was almost entirely mechanical. Save for two files which had very short matches, these changes were produced by two seds and a Perl script, for the more involved cases. The latter will be added in a subsequent commit. The former are as follows: sed -E -i "/dg-output/s/default std::handle_contract_violation called: \ (\S+) (\S+) (\S+(<[A-Za-z0-9, ])?>?)\ /contract violation in function \3 at \1:\2: /" .C sed -i '/dg-output/s/ / /g' Whichever files remained failing after the above changes were checked-out, re-ran, with output extracted, and ran through dg-out-generator.pl. Co-Authored-By: Jonathan Wakely <jwakely@redhat.com> libstdc++-v3/ChangeLog: PR libstdc++/107792 PR libstdc++/107778 src/experimental/contract.cc (handle_contract_violation): Make output more readable. gcc/testsuite/ChangeLog: * g++.dg/contracts/contracts-access1.C: Convert to new default violation handler. * g++.dg/contracts/contracts-assume2.C: Ditto. * g++.dg/contracts/contracts-config1.C: Ditto. * g++.dg/contracts/contracts-constexpr1.C: Ditto. * g++.dg/contracts/contracts-ctor-dtor1.C: Ditto. * g++.dg/contracts/contracts-deduced2.C: Ditto. * g++.dg/contracts/contracts-friend1.C: Ditto. * g++.dg/contracts/contracts-multiline1.C: Ditto. * g++.dg/contracts/contracts-post3.C: Ditto. * g++.dg/contracts/contracts-pre10.C: Ditto. * g++.dg/contracts/contracts-pre2.C: Ditto. * g++.dg/contracts/contracts-pre2a2.C: Ditto. * g++.dg/contracts/contracts-pre3.C: Ditto. * g++.dg/contracts/contracts-pre4.C: Ditto. * g++.dg/contracts/contracts-pre5.C: Ditto. * g++.dg/contracts/contracts-pre7.C: Ditto. * g++.dg/contracts/contracts-pre9.C: Ditto. * g++.dg/contracts/contracts-redecl3.C: Ditto. * g++.dg/contracts/contracts-redecl4.C: Ditto. * g++.dg/contracts/contracts-redecl6.C: Ditto. 
* g++.dg/contracts/contracts-redecl7.C: Ditto. * g++.dg/contracts/contracts-tmpl-spec1.C: Ditto. * g++.dg/contracts/contracts-tmpl-spec2.C: Ditto. * g++.dg/contracts/contracts-tmpl-spec3.C: Ditto. * g++.dg/contracts/contracts10.C: Ditto. * g++.dg/contracts/contracts14.C: Ditto. * g++.dg/contracts/contracts15.C: Ditto. * g++.dg/contracts/contracts16.C: Ditto. * g++.dg/contracts/contracts17.C: Ditto. * g++.dg/contracts/contracts19.C: Ditto. * g++.dg/contracts/contracts25.C: Ditto. * g++.dg/contracts/contracts3.C: Ditto. * g++.dg/contracts/contracts35.C: Ditto. * g++.dg/contracts/contracts5.C: Ditto. * g++.dg/contracts/contracts7.C: Ditto. * g++.dg/contracts/contracts9.C: Ditto.	2022-12-22 19:48:40 -05:00
Arsen Arsenović	e70380f454	contrib: Add dg-out-generator.pl This script is a helper used to generate dg-output lines from an existing program output conveniently. It takes care of escaping Tcl and ARE stuff. contrib/ChangeLog: * dg-out-generator.pl: New file.	2022-12-22 19:44:07 -05:00
GCC Administrator	40b8ac12df	Daily bump.	2022-12-23 00:17:16 +00:00
Jason Merrill	23be9d78f4	testsuite: don't declare printf in coro.h mingw stdio.h plays horrible games with extern "C++", but it also seems sloppy for coro.h to declare printf in testcases that will also include standard headers. gcc/testsuite/ChangeLog: * g++.dg/coroutines/coro.h: #include <stdio.h> instead of declaring puts/printf. * g++.dg/coroutines/torture/mid-suspend-destruction-0.C: #include <stdio.h>. * g++.dg/coroutines/pr95599.C: Use PRINT instead of puts. * g++.dg/coroutines/torture/call-00-co-aw-arg.C: * g++.dg/coroutines/torture/call-01-multiple-co-aw.C: * g++.dg/coroutines/torture/call-02-temp-co-aw.C: * g++.dg/coroutines/torture/call-03-temp-ref-co-aw.C: * g++.dg/coroutines/torture/co-await-00-trivial.C: * g++.dg/coroutines/torture/co-await-01-with-value.C: * g++.dg/coroutines/torture/co-await-02-xform.C: * g++.dg/coroutines/torture/co-await-03-rhs-op.C: * g++.dg/coroutines/torture/co-await-04-control-flow.C: * g++.dg/coroutines/torture/co-await-05-loop.C: * g++.dg/coroutines/torture/co-await-06-ovl.C: * g++.dg/coroutines/torture/co-await-07-tmpl.C: * g++.dg/coroutines/torture/co-await-08-cascade.C: * g++.dg/coroutines/torture/co-await-09-pair.C: * g++.dg/coroutines/torture/co-await-11-forwarding.C: * g++.dg/coroutines/torture/co-await-12-operator-2.C: * g++.dg/coroutines/torture/co-await-13-return-ref.C: * g++.dg/coroutines/torture/co-await-14-return-ref-to-auto.C: * g++.dg/coroutines/torture/pr95003.C: Likewise.	2022-12-22 18:38:37 -05:00
Jonathan Wakely	ee4af2ed0b	libstdc++: Avoid recursion in __nothrow_wait_cv::wait [PR105730] The commit r12-5877-g9e18a25331fa25 removed the incorrect noexcept-specifier from std::condition_variable::wait and gave the new symbol version @@GLIBCXX_3.4.30. It also redefined the original symbol std::condition_variable::wait(unique_lock<mutex>&)@GLIBCXX_3.4.11 as an alias for a new symbol, __gnu_cxx::__nothrow_wait_cv::wait, which still has the incorrect noexcept guarantee. That __nothrow_wait_cv::wait is just a wrapper around the real condition_variable::wait which adds noexcept and so terminates on a __forced_unwind exception. This doesn't work on uclibc, possibly due to a dynamic linker bug. When __nothrow_wait_cv::wait calls the condition_variable::wait function it binds to the alias symbol, which means it just calls itself recursively until the stack overflows. This change avoids the possibility of a recursive call by changing the __nothrow_wait_cv::wait function so that instead of calling condition_variable::wait it re-implements it. This requires accessing the private _M_cond member of condition_variable, so we need to use the trick of instantiating a template with the member-pointer of the private member. libstdc++-v3/ChangeLog: PR libstdc++/105730 * src/c++11/compatibility-condvar.cc (__nothrow_wait_cv::wait): Access private data member of base class and call its wait member.	2022-12-22 23:34:27 +00:00
Jonathan Wakely	f99b94865f	libstdc++: Add std::format support to <chrono> This adds the operator<< overloads and std::formatter specializations required by C++20 so that <chrono> types can be written to ostreams and printed with std::format. libstdc++-v3/ChangeLog: * include/Makefile.am: Add new header. * include/Makefile.in: Regenerate. * include/std/chrono (operator<<): Move to new header. (nonexistent_local_time::_M_make_what_str): Define correctly. (ambiguous_local_time::_M_make_what_str): Likewise. * include/bits/chrono_io.h: New file. * src/c++20/tzdb.cc (operator<<(ostream&, const Rule&)): Use new ostream output for month and weekday types. * testsuite/20_util/duration/io.cc: Test std::format support. * testsuite/std/time/exceptions.cc: Check what() strings. * testsuite/std/time/syn_c++20.cc: Uncomment local_time_format. * testsuite/std/time/time_zone/get_info_local.cc: Enable check for formatted output of local_info objects. * testsuite/std/time/clock/file/io.cc: New test. * testsuite/std/time/clock/gps/io.cc: New test. * testsuite/std/time/clock/system/io.cc: New test. * testsuite/std/time/clock/tai/io.cc: New test. * testsuite/std/time/clock/utc/io.cc: New test. * testsuite/std/time/day/io.cc: New test. * testsuite/std/time/format.cc: New test. * testsuite/std/time/hh_mm_ss/io.cc: New test. * testsuite/std/time/month/io.cc: New test. * testsuite/std/time/weekday/io.cc: New test. * testsuite/std/time/year/io.cc: New test. * testsuite/std/time/year_month_day/io.cc: New test.	2022-12-22 23:34:27 +00:00
Jonathan Wakely	9247402a29	libstdc++: Add helper function in <format> Add a new __format::__write_padded_as_spec helper to remove duplicated code in formatter specializations. libstdc++-v3/ChangeLog: * include/std/format (__format::__write_padded_as_spec): New function. (__format::__formatter_str, __format::__formatter_int::format) (formatter<const void*, charT>): Use it.	2022-12-22 23:34:26 +00:00
Jonathan Wakely	d33a250f70	libstdc++: Add GDB printers for <chrono> types libstdc++-v3/ChangeLog: * python/libstdcxx/v6/printers.py (StdChronoDurationPrinter) (StdChronoTimePointPrinter, StdChronoZonedTimePrinter) (StdChronoCalendarPrinter, StdChronoTimeZonePrinter) (StdChronoLeapSecondPrinter, StdChronoTzdbPrinter) (StdChronoTimeZoneRulePrinter): New printers.	2022-12-22 23:34:26 +00:00
Jonathan Wakely	9fc61d45fa	libstdc++: Implement C++20 time zone support in <chrono> This is the largest missing piece of C++20 support. Only the cxx11 ABI is supported, due to the use of std::string in the API for time zones. For the old gcc4 ABI, utc_clock and leap seconds are supported, but only using a hardcoded list of leap seconds, no up-to-date tzdb::leap_seconds information is available, and no time zones or zoned_time conversions. The implementation currently depends on a tzdata.zi file being provided by the OS or the user. The expected location is /usr/share/zoneinfo but that can be changed using --with-libstdcxx-zoneinfo-dir=PATH. On targets that support it there is also a weak symbol that users can override in their own program (which also helps with testing): extern "C++" const char* __gnu_cxx::zoneinfo_dir_override(); If no file is found, a fallback tzdb object will be created which only contains the "Etc/UTC" and "Etc/GMT" time zones. A leapseconds file is also expected in the same directory, but if that isn't present then a hardcoded list of leapseconds is used, which is correct at least as far as 2023-06-28 (and it currently looks like no leap second will be inserted for a few years). The tzdata.zi and leapseconds files from https://www.iana.org/time-zones are in the public domain, so shipping copies of them with GCC would be an option. However, the tzdata.zi file will rapidly become outdated, so users should really provide it themselves (or convince their OS vendor to do so). It would also be possible to implement an alternative parser for the compiled tzdata files (one per time zone) under /usr/share/zoneinfo. Those files are present on more operating systems, but do not contain all the information present in tzdata.zi. Specifically, the "links" are not present, so that e.g. "UTC" and "Universal" are distinct time zones, rather than both being links to the canonical "Etc/UTC" zone. 
For some platforms those files are hard links to the same file, but there's no indication which zone is the canonical name and which is a link. Other platforms just store them in different inodes anyway. I do not plan to add such an alternative parser for the compiled files. That would need to be contributed by maintainers or users of targets that require it, if making tzdata.zi available is not an option. The library ABI would not need to change for a new tzdb implementation, because everything in tzdb_list, tzdb and time_zone is implemented as a pimpl (except for the shared_ptr links between nodes, described below). That means the new exported symbols added by this commit should be stable even if the implementation is completely rewritten. The information from tzdata.zi is parsed and stored in data structures that closely model the info in the file. This is a space-efficient representation that uses less memory than storing every transition for every time zone. It also avoids spending time expanding that information into time zone transitions that might never be needed by the program. When a conversion to/from a local time to UTC is requested the information will be processed to determine the time zone transitions close to the time being converted. There is a bug in some time zone transitions. When generating a sys_info object immediately after one that was previously generated, we need to find the previous rule that was in effect and note its offset and letters. This is so that the start time and abbreviation of the new sys_info will be correct. This only affects time zones that use a format like "C%sT" where the LETTERS replacing %s are non-empty for standard time, e.g. "Asia/Shanghai" which uses "CST" for standard time and "CDT" for daylight time. The tzdb_list structure maintains a linked list of tzdb nodes using shared_ptr links. This allows the iterators into the list to share ownership with the list itself. 
This offers a non-portable solution to a lifetime issue in the API. Because tzdb objects can be erased from the list using tzdb_list::erase_after, separate modules/libraries in a large program cannot guarantee that any const tzdb& or const time_zone* remains valid indefinitely. Holding onto a tzdb_list::const_iterator will extend the tzdb object's lifetime, even if it's erased from the list. An alternative design would be for the list iterator to hold a weak_ptr. This would allow users to test whether the tzdb still exists when the iterator is dereferenced, which is better than just having a dangling raw pointer. That doesn't actually extend the tzdb's lifetime though, and every use of it would need to be preceded by checking the weak_ptr. Using shared_ptr adds a little bit of overhead but allows users to solve the lifetime issue if they rely on the libstdc++-specific iterator property. libstdc++-v3/ChangeLog: * acinclude.m4 (GLIBCXX_ZONEINFO_DIR): New macro. * config.h.in: Regenerate. * config/abi/pre/gnu.ver: Export new symbols. * configure: Regenerate. * configure.ac (GLIBCXX_ZONEINFO_DIR): Use new macro. * include/std/chrono (utc_clock::from_sys): Correct handling of leap seconds. (nonexistent_local_time::_M_make_what_str): Define. (ambiguous_local_time::_M_make_what_str): Define. (__throw_bad_local_time): Define new function. (time_zone, tzdb_list, tzdb): Implement all members. (remote_version, zoned_time, get_leap_second_info): Define. * include/std/version: Add comment for __cpp_lib_chrono. * src/c++20/Makefile.am: Add new file. * src/c++20/Makefile.in: Regenerate. * src/c++20/tzdb.cc: New file. * testsuite/lib/libstdc++.exp: Define effective target tzdb. * testsuite/std/time/clock/file/members.cc: Check file_time alias and file_clock::now() member. * testsuite/std/time/clock/gps/1.cc: Likewise for gps_clock. * testsuite/std/time/clock/tai/1.cc: Likewise for tai_clock. * testsuite/std/time/syn_c++20.cc: Uncomment everything except parse. 
* testsuite/std/time/clock/utc/leap_second_info.cc: New test. * testsuite/std/time/exceptions.cc: New test. * testsuite/std/time/time_zone/get_info_local.cc: New test. * testsuite/std/time/time_zone/get_info_sys.cc: New test. * testsuite/std/time/time_zone/requirements.cc: New test. * testsuite/std/time/tzdb/1.cc: New test. * testsuite/std/time/tzdb/leap_seconds.cc: New test. * testsuite/std/time/tzdb_list/1.cc: New test. * testsuite/std/time/tzdb_list/requirements.cc: New test. * testsuite/std/time/zoned_time/1.cc: New test. * testsuite/std/time/zoned_time/custom.cc: New test. * testsuite/std/time/zoned_time/deduction.cc: New test. * testsuite/std/time/zoned_time/req_neg.cc: New test. * testsuite/std/time/zoned_time/requirements.cc: New test. * testsuite/std/time/zoned_traits.cc: New test.	2022-12-22 23:34:20 +00:00
Ian Lance Taylor	907c84cb1d	compiler: remove unused fields This avoids clang warnings: gcc/go/gofrontend/escape.cc:1290:17: warning: private field 'fn_' is not used [-Wunused-private-field] gcc/go/gofrontend/escape.cc:3478:19: warning: private field 'context_' is not used [-Wunused-private-field] gcc/go/gofrontend/lex.h:564:15: warning: private field 'input_file_name_' is not used [-Wunused-private-field] gcc/go/gofrontend/types.cc:5788:20: warning: private field 'call_' is not used [-Wunused-private-field] gcc/go/gofrontend/wb.cc:206:9: warning: private field 'gogo_' is not used [-Wunused-private-field] Patch by Martin Liška. Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/458975	2022-12-22 14:26:14 -08:00
Harald Anlauf	794af0d00b	Fortran: check for invalid uses of statement functions arguments [PR69604] gcc/fortran/ChangeLog: PR fortran/69604 * match.cc (chk_stmt_fcn_body): New function. Check for invalid uses of statement functions arguments. (gfc_match_st_function): Use above. gcc/testsuite/ChangeLog: PR fortran/69604 * gfortran.dg/statement_function_4.f90: New test.	2022-12-22 22:03:31 +01:00
Andrew Carlotti	74544bdadc	docs: Fix peephole paragraph ordering The documentation for the DONE and FAIL macros was incorrectly inserted between example code and a remark attached to that example. gcc/ChangeLog: * doc/md.texi: Move example code remark next to its code block.	2022-12-22 17:04:57 +00:00
Andrew Carlotti	27afe64c19	docs: Fix inconsistent example predicate name It is unclear why the example C function was renamed to `commutative_integer_operator` as part of ec8e098d in 2004, while the text and the example md were both left as `commutative_operator`. The latter name appears to be more accurate, so revert the 2004 change. gcc/ChangeLog: * doc/md.texi: Fix inconsistent example name.	2022-12-22 17:04:48 +00:00
Andrew Carlotti	e48864e51d	docs: Link to correct section for constraint modifiers gcc/ChangeLog: * doc/md.texi: Fix incorrect pxref.	2022-12-22 16:55:01 +00:00
Richard Biener	b97c33fbd2	bootstrap/106482 - document minimal GCC version There's no explicit mention of what GCC compiler supports C++11 and the cross compiler build requirement mentions GCC 4.8 but not GCC 4.8.3 which is the earliest known version to not run into C++11 implementation bugs. The following adds explicit wording. PR bootstrap/106482 * doc/install.texi (ISO C++11 Compiler): Document GCC version known to work.	2022-12-22 16:01:08 +01:00
Richard Biener	d4a320f1ee	testsuite/107809 - fix vect-recurr testcases This adds a missing effective target check for the permute recurrence vectorization requires. PR testsuite/107809 * gcc.dg/vect/vect-recurr-1.c: Require vect_perm. * gcc.dg/vect/vect-recurr-2.c: Likewise. * gcc.dg/vect/vect-recurr-3.c: Likewise. * gcc.dg/vect/vect-recurr-4.c: Likewise. * gcc.dg/vect/vect-recurr-5.c: Likewise. * gcc.dg/vect/vect-recurr-6.c: Likewise.	2022-12-22 14:22:06 +01:00
Jakub Jelinek	5c17adfb5d	phiopt: Drop SSA_NAME_RANGE_INFO in maybe equal case [PR108166] The following place in value_replacement is after proving that x == cst1 ? cst2 : x phi result is only used in a comparison with constant which doesn't care if it compares cst1 or cst2 and replaces it with x. The testcase is miscompiled because we have after the replacement incorrect range info for the phi result, we would need to effectively union the phi result range with cst1 (oarg in the code) because previously that constant might be missing in the range, but newly it can appear (we've just verified that the single use stmt of the phi result doesn't care about that value in particular). The following patch just resets the info, bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Aldy/Andrew, how would one instead union the SSA_NAME_RANGE_INFO with some INTEGER_CST and store it back into SSA_NAME_RANGE_INFO (including adjusting non-zero bits and the like)? 2022-12-22 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/108166 * tree-ssa-phiopt.cc (value_replacement): For the maybe_equal_p case turned into equal_p reset SSA_NAME_RANGE_INFO of phi result. * g++.dg/torture/pr108166.C: New test.	2022-12-22 12:52:48 +01:00
Jakub Jelinek	0cb5d7cdba	cse: Fix up CSE const_anchor handling [PR108193] The following testcase ICEs on aarch64, because insert_const_anchor inserts invalid CONST_INT into the CSE tables - 0x80000000 for SImode. The second hunk of the patch fixes that, the first one is to avoid triggering undefined behavior at compile time during compute_const_anchors computations - performing those additions and subtractions in HOST_WIDE_INT means it can overflow for certain constants. 2022-12-22 Jakub Jelinek <jakub@redhat.com> PR rtl-optimization/108193 * cse.cc (compute_const_anchors): Change n type to unsigned HOST_WIDE_INT, adjust comparison against it to avoid warnings. Formatting fix. (insert_const_anchor): Use gen_int_mode instead of GEN_INT. * gfortran.dg/pr108193.f90: New test.	2022-12-22 12:44:13 +01:00
Richard Biener	7b2cf50414	tree-optimization/107451 - SLP load vectorization issue When vectorizing SLP loads with permutations we can access excess elements when the load vector type is bigger than the group size and the vectorization factor covers less groups than necessary to fill it. Since we know the code will only access up to group_size * VF elements in the unpermuted vector we can simply fill the rest of the vector with whatever we want. For simplicity this patch chooses to repeat the last group. PR tree-optimization/107451 * tree-vect-stmts.cc (vectorizable_load): Avoid loading SLP group members from group numbers in excess of the vectorization factor. * gcc.dg/torture/pr107451.c: New testcase.	2022-12-22 12:21:06 +01:00
Jakub Jelinek	5b30e9bc21	aarch64: Fix plugin header install The r13-2943-g11a113d501ff64 made aarch64.h include aarch64-option-extensions.def, but that file isn't installed for building plugins. On Wed, Dec 21, 2022 at 09:56:33AM +0000, Richard Sandiford wrote: > Should this (and aarch64-fusion-pairs.def and aarch64-tuning-flags.def) > be in TM_H instead? The first two OPTIONS_H_EXTRA entries seem to be > for aarch64-opt.h (included via aarch64.opt). > > I guess TM_H should also have aarch64-arches.def, since it's included > for aarch64_feature. gcc/Makefile.in has TM_H = $(GTM_H) insn-flags.h $(OPTIONS_H) and OPTIONS_H = options.h flag-types.h $(OPTIONS_H_EXTRA) which means that adding something into TM_H when it is already in OPTIONS_H_EXTRA is unnecessary. It is true that aarch64-fusion-pairs.def (included by aarch64-protos.h) and aarch64-tuning-flags.def (ditto) and aarch64-option-extensions.def (included by aarch64.h) aren't needed for options.h, so I think the right patch would be the following. 2022-12-22 Jakub Jelinek <jakub@redhat.com> * config/aarch64/t-aarch64 (TM_H): Don't add aarch64-cores.def, add aarch64-fusion-pairs.def, aarch64-tuning-flags.def and aarch64-option-extensions.def. (OPTIONS_H_EXTRA): Don't add aarch64-fusion-pairs.def nor aarch64-tuning-flags.def.	2022-12-22 11:17:57 +01:00
Jonathan Wakely	d2d3826cd4	libstdc++: Define and use variable templates in <chrono> This defines a variable template for the internal __is_duration helper trait, defines a new __is_time_point_v variable template (to be used in a subsequent commit), and adds explicit specializations of the standard chrono::treat_as_floating_point trait for common types. A fast path is added to chrono::duration_cast for the no-op case where no conversion is needed. Finally, some SFINAE constraints are simplified by using the __enable_if_t alias, or by using variable templates. libstdc++-v3/ChangeLog: * include/bits/chrono.h (__is_duration_v, __is_time_point_v): New variable templates. (duration_cast): Add simplified definition for noconv case. (treat_as_floating_point_v): Add explicit specializations. (duration::operator%=, floor, ceil, round): Simplify SFINAE constraints.	2022-12-22 10:14:52 +00:00
Jonathan Wakely	ec8f914f57	libstdc++: Add [[nodiscard]] in <chrono> libstdc++-v3/ChangeLog: * include/std/chrono: Use nodiscard attribute.	2022-12-22 10:14:52 +00:00
Jan Hubicka	eef81eefcd	Zen4 tuning part 2 Adds tunes needed for zen4 microarchitecture. I added two new knobs. TARGET_AVX512_SPLIT_REGS which is used to specify that internally 512 vectors are split to 256 vectors. This affects vectorization costs and reassociation width. It probably should also affect RTX costs however I doubt it is very useful since RTL optimizers are usually not judging between 256 and 512 vectors. I also added X86_TUNE_AVOID_256FMA_CHAINS. Since fma has improved in zen4 this flag may not be a win except for very specific benchmarks. I am still doing some more detailed testing here. Otherwise I disabled gathers on zen4 for 2 parts and 4 parts. We can open code them and since the latencies have only increased since zen3, open coding is better than the actual instruction. This shows in 4 tsvc benchmarks. I ended up setting AVX256_OPTIMAL. This is a compromise. There are some tsvc benchmarks that increase noticeably (up to 250%) however there are also a few regressions. Most of these can be solved by increasing vec_perm cost in the vectorizer. However this does not cure the roughly 14% regression on x264 that is quite important. Here we produce vectorized loops for avx512 that probably would be faster if the loops in question had high enough iteration count. We hit this problem with avx256 too: since the loop iterates few times, only prologues/epilogues are used. Adding another round of prologue/epilogue code does not make it better. Finally I enabled avx stores for constant sized memcpy and memset. I am not sure why this is an opt-in feature. I think for most hardware this is a win. gcc/ChangeLog: 2022-12-22 Jan Hubicka <hubicka@ucw.cz> * config/i386/i386-expand.cc (ix86_expand_set_or_cpymem): Add TARGET_AVX512_SPLIT_REGS. * config/i386/i386-options.cc (ix86_option_override_internal): Honor X86_TUNE_AVOID_256FMA_CHAINS. * config/i386/i386.cc (ix86_vec_cost): Honor TARGET_AVX512_SPLIT_REGS. (ix86_reassociation_width): Likewise. 
* config/i386/i386.h (TARGET_AVX512_SPLIT_REGS): New tune. * config/i386/x86-tune.def (X86_TUNE_USE_GATHER_2PARTS): Disable for znver4. (X86_TUNE_USE_GATHER_4PARTS): Likewise. (X86_TUNE_AVOID_256FMA_CHAINS): Set for znver4. (X86_TUNE_AVOID_512FMA_CHAINS): New tune; set for znver4. (X86_TUNE_AVX256_OPTIMAL): Add znver4. (X86_TUNE_AVX512_SPLIT_REGS): New tune. (X86_TUNE_AVX256_MOVE_BY_PIECES): Add znver1-3. (X86_TUNE_AVX256_STORE_BY_PIECES): Add znver1-3. (X86_TUNE_AVX512_MOVE_BY_PIECES): Add znver4. (X86_TUNE_AVX512_STORE_BY_PIECES): Add znver4.	2022-12-22 10:55:46 +01:00
Richard Biener	924033e39b	Compare DECL_NOT_FLEXARRAY for LTO tree merging This was missing. gcc/lto/ * lto-common.cc (compare_tree_sccs_1): Compare DECL_NOT_FLEXARRAY.	2022-12-22 09:43:33 +01:00
Jan Hubicka	bbe04bade0	Update znver4 costs Update cost of znver4 mostly based on data measured by Agner Fog. Compared to previous generations x87 became a bit slower which is probably not a big deal (and we have minimal benchmarking coverage for it). One interesting improvement is the reduction of FMA cost. I also updated costs of AVX256 loads/stores based on latencies (not throughput which is twice of avx256). Overall AVX512 vectorization seems to noticeably improve some of the TSVC benchmarks but since internally 512 vectors are split to 256 vectors it is somewhat risky and does not win in SPEC scores (mostly by regressing benchmarks with loops that have a small trip count like x264 and exchange), so for now I am going to set AVX256_OPTIMAL tune but I am still playing with it. We improved since ZNVER1 on choosing vectorization size and also have vectorized prologues/epilogues so it may be possible to make avx512 a small win overall. 2022-12-22 Jan Hubicka <hubicka@ucw.cz> * config/i386/x86-tune-costs.h (znver4_cost): Update costs of FP and SSE moves, division multiplication, gathers, L2 cache size, and more complex FP instructions.	2022-12-22 02:16:24 +01:00
GCC Administrator	de282a2012	Daily bump.	2022-12-22 00:17:29 +00:00
Jonathan Yong	37d8312f56	testsuite: Fix pr55569.c excess errors on LLP64 This fixes the following on LLP64 mingw-w64 target: Excess errors: gcc/testsuite/gcc.c-torture/compile/pr55569.c:13:12: warning: overflow in conversion from 'long long unsigned int' to 'long int' changes value from '4611686018427387903' to '-1' [-Woverflow] gcc/testsuite/gcc.c-torture/compile/pr55569.c:13:34: warning: iteration 2147483647 invokes undefined behavior [-Waggressive-loop-optimizations] gcc/testsuite/ChangeLog: * gcc.c-torture/compile/pr55569.c: fix excess errors. Signed-off-by: Jonathan Yong <10walls@gmail.com>	2022-12-21 23:15:48 +00:00
Andrew Pinski	193fccaa5c	Fix PR 105532: match.pd patterns calling tree_nonzero_bits with vector types Even though this PR was reported with an ubsan issue, the problem is that tree_nonzero_bits is being called with an expression which is a vector type. This fixes three patterns I noticed which do that, and adds a testcase for one of the patterns. OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions gcc/ChangeLog: PR tree-optimization/105532 * match.pd (~(X >> Y) -> ~X >> Y): Check if it is an integral type before calling tree_nonzero_bits. (popcount(X) + popcount(Y)): Likewise. (popcount(X&C1)): Likewise. gcc/testsuite/ChangeLog: * gcc.c-torture/compile/vector-shift-1.c: New test.	2022-12-21 18:32:26 +00:00
Andrew Pinski	91e0d22025	[PATCH] Use toplevel configure for GMP and MPFR for gdb [Sync'ed from the binutils-gdb repo] This patch uses the toplevel configure parts for GMP/MPFR for gdb. The only thing is that gdb now requires MPFR for building. Before it was a recommended but not required library. Also this allows building of GMP and MPFR with the toplevel directory just like how it is done for GCC. We now error out in the toplevel configure if the version of GMP or MPFR is wrong. OK after GDB 13 branches? Build gdb 3 ways: with GMP and MPFR in the toplevel (static library used at that point for both) With only MPFR in the toplevel (GMP distro library used and MPFR built from source) With neither GMP nor MPFR in the toplevel (distro libraries used) Changes from v1: * Updated gdb/README and gdb/doc/gdb.texinfo. * Regenerated using unmodified autoconf-2.69 Thanks, Andrew Pinski ChangeLog: * Makefile.def: Add configure-gdb dependencies on all-gmp and all-mpfr. * configure.ac: Split out MPC checking from MPFR. Require GMP and MPFR if the gdb directory exists. * Makefile.in: Regenerate. * configure: Regenerate.	2022-12-21 17:18:53 +00:00
Chung-Lin Tang	fdc7469cf5	nvptx: reimplement libgomp barriers [PR99555] Instead of trying to have the GPU do CPU-with-OS-like things, this new barriers implementation for NVPTX uses simplistic bar.* synchronization instructions. Tasks are processed after threads have joined, and only if team->task_count != 0. It is noted that there might be a little bit of performance forfeited for cases where earlier arriving threads could've been used to process tasks ahead of other threads, but that has the requirement of implementing complex futex-wait/wake like behavior, which is what we're trying to avoid with this patch. It is deemed that task processing is not what GPU target offloading is usually used for. Implementation highlight notes: 1. gomp_team_barrier_wake() is now an empty function (threads never "wake" in the usual manner) 2. gomp_team_barrier_cancel() now uses the "exit" PTX instruction. 3. gomp_barrier_wait_last() now is implemented using "bar.arrive" 4. gomp_team_barrier_wait_end()/gomp_team_barrier_wait_cancel_end(): The main synchronization is done using a 'bar.red' instruction. This reduces across all threads the condition (team->task_count != 0), to enable the task processing down below if any thread created a task. (this bar.red usage means that this patch is dependent on the prior NVPTX bar.red GCC patch) PR target/99555 libgomp/ChangeLog: * config/nvptx/bar.c (generation_to_barrier): Remove. (futex_wait,futex_wake,do_spin,do_wait): Remove. (GOMP_WAIT_H): Remove. (#include "../linux/bar.c"): Remove. (gomp_barrier_wait_end): New function. (gomp_barrier_wait): Likewise. (gomp_barrier_wait_last): Likewise. (gomp_team_barrier_wait_end): Likewise. (gomp_team_barrier_wait): Likewise. (gomp_team_barrier_wait_final): Likewise. (gomp_team_barrier_wait_cancel_end): Likewise. (gomp_team_barrier_wait_cancel): Likewise. (gomp_team_barrier_cancel): Likewise. * config/nvptx/bar.h (gomp_barrier_t): Remove waiters, lock fields. 
(gomp_barrier_init): Remove init of waiters, lock fields. (gomp_team_barrier_wake): Remove prototype, add new static inline function.	2022-12-21 05:58:49 -08:00
Chung-Lin Tang	623daaf8a2	nvptx: support bar.red instruction This patch adds support for the PTX 'bar.red' (i.e. "barrier reduction") instruction, in the form of nvptx-specific __builtin_nvptx_bar_red_[and/or/popc] built-in functions. gcc/ChangeLog: * config/nvptx/nvptx.cc (nvptx_print_operand): Add 'p' case, adjust comments. (enum nvptx_builtins): Add NVPTX_BUILTIN_BAR_RED_AND, NVPTX_BUILTIN_BAR_RED_OR, and NVPTX_BUILTIN_BAR_RED_POPC. (nvptx_expand_bar_red): New function. (nvptx_init_builtins): Add DEFs of __builtin_nvptx_bar_red_[and/or/popc]. (nvptx_expand_builtin): Use nvptx_expand_bar_red to expand NVPTX_BUILTIN_BAR_RED_[AND/OR/POPC] cases. * config/nvptx/nvptx.md (define_c_enum "unspecv"): Add UNSPECV_BARRED_AND, UNSPECV_BARRED_OR, and UNSPECV_BARRED_POPC. (BARRED): New int iterator. (barred_op,barred_mode,barred_ptxtype): New int attrs. (nvptx_barred_<barred_op>): New define_insn.	2022-12-21 05:58:49 -08:00
Iain Sandoe	f661b3d11e	libffi: Update LOCAL_PATCHES. Add the patch that fixes i686 Darwin build. Signed-off-by: Iain Sandoe <iain@sandoe.co.uk> libffi/ChangeLog: * LOCAL_PATCHES: Add patch to fix i686 darwin build.	2022-12-21 13:10:16 +00:00
Iain Sandoe	3cc159bc01	libffi: Fix X86 32b Darwin build and EH frames. This addresses a number of issues in the X86 Darwin 32b port for libffi. 1. The pic symbol stubs are weak definitions; the correct section placement for these depends on the linker version in use. We do not have access to that information, but we can use the target OS version (assumes that the user has installed the latest version of xcode available). When a coalesced section is in use (OS versions earlier than Darwin12 / OSX 10.8), its name must differ from __TEXT,__text since otherwise that would correspond to altering the attributes of the .text section (which produces a diagnostic from the assembler). Here we use __TEXT, __textcoal_nt for this which is what GCC emits for these stubs. For later versions than Darwin 12 (OS X 10.8) we can place the stubs in the .text section (if we do not we get a diagnostic from clang -cc1as saying that the use of coalesced sections for this is deprecated). 2. The EH frame is specified manually, since there is no support for .cfi_ directives in 'cctools' assemblers. The implementation needs to provide offsets for CFA advance, code size and to the CIE as signed values rather than relocations. However the cctools assembler will produce a relocation for expressions like ' .long Lxx-Lyy' which then leads to a link-time error. We correct this by forming the offset values using ' .set' directives and then assigning the results of them. 3. The register numbering used by m32 X86 Darwin EH frames is not the same as the DWARF debug numbering (the Frame and Stack pointer numbers are swapped). 4. The FDE address encoding used by the system tools is '0x10' (PCrel + abs) where the value provided was PCrel + sdata4. 5. GCC does not use compact unwind at present, and it was not implemented until Darwin10 / OSX 10.6. There were some issues with function location in 10.6 so that the solution here suppresses emitting the compact unwind section until Darwin11 / OSX 10.7. 
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk> libffi/ChangeLog: * src/x86/sysv.S (COMDAT): Amend section use for Darwin, accounting for cases where coalesced is needed. (eh_frame): Rework to avoid relocs that cause build failures on earlier Darwin. Adjust register numbers to account for X86 m32 Darwin differences between EH and debug.	2022-12-21 13:02:25 +00:00
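
The tzdb commit message earlier in this log contrasts two iterator designs: an iterator that holds a shared_ptr (and so keeps an erased entry alive) and one that holds a weak_ptr (and so can only detect that the entry is gone). The sketch below illustrates that difference with a toy singly linked list. `Entry`, `ToyList`, and both helper functions are hypothetical names made up for this illustration; this is not the libstdc++ implementation of tzdb_list.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a database entry; not std::chrono::tzdb.
struct Entry {
  std::string version;
  std::shared_ptr<Entry> next;
};

// Hypothetical list in which each node is owned by its predecessor.
struct ToyList {
  std::shared_ptr<Entry> head;

  void push_front(std::string v) {
    auto e = std::make_shared<Entry>();
    e->version = std::move(v);
    e->next = head;
    head = std::move(e);
  }

  // Unlink the entry that follows pos, in the spirit of erase_after.
  void erase_after(const std::shared_ptr<Entry>& pos) {
    assert(pos && pos->next);
    auto after = pos->next->next;  // copy first, then relink
    pos->next = std::move(after);
  }
};

// An owning handle (shared_ptr) keeps the erased entry readable.
std::string read_after_erase_strong() {
  ToyList list;
  list.push_front("old");
  list.push_front("new");
  std::shared_ptr<Entry> it = list.head->next;  // refers to "old"
  list.erase_after(list.head);                  // "old" leaves the list
  return it->version;                           // still valid through it
}

// An observing handle (weak_ptr) can only report that the entry is gone.
bool weak_expired_after_erase() {
  ToyList list;
  list.push_front("old");
  list.push_front("new");
  std::weak_ptr<Entry> it = list.head->next;
  list.erase_after(list.head);  // last owner dropped, entry destroyed
  return it.expired();
}
```

In this toy model the owning handle costs one reference count update per copy, which matches the "little bit of overhead" trade-off the commit message describes; the observing handle avoids extending the lifetime but forces a check before every use.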


197600 Commits
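
The CSE fix in commit 0cb5d7cdba above touches two related ideas: doing intermediate arithmetic in an unsigned type so that it wraps instead of triggering signed overflow, and storing a constant in the canonical sign-extended form for its width (0x80000000 is not a valid 32-bit signed constant, whereas -2147483648 is). The following is a minimal sketch of both ideas. `sign_extend` and `wrapping_add` are hypothetical helpers written for this illustration; they are not GCC's gen_int_mode or compute_const_anchors, and the final conversion back to a signed type assumes two's complement modular behavior (guaranteed since C++20, and what GCC does in earlier modes).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: interpret the low `bits` bits of value as a
// two's complement number and widen it to 64 bits.
std::int64_t sign_extend(std::uint64_t value, unsigned bits) {
  assert(bits >= 1 && bits <= 64);
  if (bits == 64)
    return static_cast<std::int64_t>(value);
  const std::uint64_t mask = (std::uint64_t{1} << bits) - 1;
  const std::uint64_t sign = std::uint64_t{1} << (bits - 1);
  value &= mask;
  // Unsigned subtraction wraps, so no signed overflow can occur here.
  return static_cast<std::int64_t>((value ^ sign) - sign);
}

// Hypothetical helper: add two signed values with wraparound by doing
// the addition in the corresponding unsigned type.
std::int64_t wrapping_add(std::int64_t a, std::int64_t b) {
  const std::uint64_t sum =
      static_cast<std::uint64_t>(a) + static_cast<std::uint64_t>(b);
  return static_cast<std::int64_t>(sum);
}
```

For example, `sign_extend(0x80000000u, 32)` yields -2147483648, the canonical form of the bit pattern that the commit message reports as invalid for SImode.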