mirror/gcc

mirror of git://gcc.gnu.org/git/gcc.git synced 2025-02-26 19:05:59 +08:00

Author	SHA1	Message	Date
Richard Biener	78b56a12dd	amdgcn: Add gfx1036 target Add support for the gfx1036 RDNA2 APU integrated graphics devices. The ROCm documentation warns that these may not be supported, but it seems to work at least partially. gcc/ChangeLog: * config.gcc (amdgcn): Add gfx1036 entries. * config/gcn/gcn-hsa.h (NO_XNACK): Likewise. (gcn_local_sym_hash): Likewise. * config/gcn/gcn-opts.h (enum processor_type): Likewise. (TARGET_GFX1036): New macro. * config/gcn/gcn.cc (gcn_option_override): Handle gfx1036. (gcn_omp_device_kind_arch_isa): Likewise. (output_file_start): Likewise. * config/gcn/gcn.h (TARGET_CPU_CPP_BUILTINS): Add __gfx1036__. (TARGET_CPU_CPP_BUILTINS): Rename __gfx1030 to __gfx1030__. * config/gcn/gcn.opt: Add gfx1036. * config/gcn/mkoffload.cc (EF_AMDGPU_MACH_AMDGCN_GFX1036): New. (main): Handle gfx1036. * config/gcn/t-omp-device: Add gfx1036 isa. * doc/install.texi (amdgcn): Add gfx1036. * doc/invoke.texi (-march): Likewise. libgomp/ChangeLog: * plugin/plugin-gcn.c (EF_AMDGPU_MACH): GFX1036. (gcn_gfx1103_s): New. (isa_hsa_name): Handle gfx1036. (isa_code): Likewise. (max_isa_vgprs): Likewise.	2024-03-25 15:54:37 +01:00
Gaius Mulley	44863af22d	modula2: Rebuild documentation sections for target independent libs This patch rebuilds the documentation for the target independent library sections. gcc/m2/ChangeLog: * Make-lang.in (doc/m2.pdf): Add line break. * target-independent/m2/Builtins.texi: Rebuilt. * target-independent/m2/gm2-libs.texi: Rebuilt. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>	2024-03-25 14:33:54 +00:00
Jonathan Wakely	cf3fc6f414	libstdc++: Fix incorrect macro used in #undef in test This was a copy & paste error. libstdc++-v3/ChangeLog: * testsuite/std/text_encoding/requirements.cc: #undef the correct macro.	2024-03-25 12:13:50 +00:00
Pan Li	5cab64a9cf	RISC-V: Allow RVV intrinsic when function target("arch=+v") This patch would like to allow the RVV intrinsic when the function is attributed as target("arch=+v") and built with rv64gc. For example: vint32m1_t __attribute__((target("arch=+v"))) test_1 (vint32m1_t a, vint32m1_t b, size_t vl) { return __riscv_vadd_vv_i32m1 (a, b, vl); } build with -march=rv64gc -mabi=lp64d -O3, we will have asm like below: test_1: .option push .option arch, rv64i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_v1p0_zicsr2p0_\ zifencei2p0_zve32f1p0_zve32x1p0_zve64d1p0_zve64f1p0_zve64x1p0_zvl128b1p0_zvl32b1p0_zvl64b1p0 vsetvli zero,a0,e32,m1,ta,ma vadd.vv v8,v8,v9 ret The riscv_vector.h must be included when leveraging intrinsic type(s) and API(s). And the scope of this attribute should not exceed the function body. Meanwhile, to make rvv types and API(s) available for this attribute, including riscv_vector.h will not report an error for now if v is not present in march. The tests below pass for this patch: * The riscv fully regression test. gcc/ChangeLog: * config/riscv/riscv-c.cc (riscv_pragma_intrinsic): Remove error when V is disabled and init the RVV types and intrinsic APIs. * config/riscv/riscv-vector-builtins.cc (expand_builtin): Report error if V ext is disabled. * config/riscv/riscv.cc (riscv_return_value_is_vector_type_p): Ditto. (riscv_arguments_is_vector_type_p): Ditto. (riscv_vector_cc_function_p): Ditto. * config/riscv/riscv_vector.h: Remove error if V is disabled. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/pragma-1.c: Remove. * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-1.c: New test. * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-2.c: New test. * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-3.c: New test. * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-4.c: New test. * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-5.c: New test. * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-6.c: New test. * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-7.c: New test. * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-8.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>	2024-03-25 20:08:28 +08:00
GCC Administrator	ecd2c37372	Daily bump.	2024-03-25 00:16:24 +00:00
GCC Administrator	bb04a11418	Daily bump.	2024-03-24 00:16:50 +00:00
Gaius Mulley	a68458187d	PR modula2/114444 trunc float malformed error cause ICE This patch corrects two error format specifiers. gcc/m2/ChangeLog: PR modula2/114444 * gm2-compiler/M2Quads.mod (BuildTruncFunction): Correct error format specifier. (BuildFloatFunction): Correct error format specifier. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>	2024-03-23 16:04:23 +00:00
Gaius Mulley	c8a343f9f8	PR modula2/114443 missing quote cause ICE This patch inserts a missing quotation at the end of a line if required (after an appropriate error message is generated). gcc/m2/ChangeLog: PR modula2/114443 * m2.flex: Call AddTokCharStar with a stringtok if end of line is reached without a closing quote. gcc/testsuite/ChangeLog: PR modula2/114443 * gm2/pim/fail/missingquote.mod: New test. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>	2024-03-23 15:49:23 +00:00
David Malcolm	80a0cb3745	analyzer: fix ICE and false positive with -Wanalyzer-deref-before-check [PR114408] gcc/analyzer/ChangeLog: PR analyzer/114408 * engine.cc (impl_run_checkers): Free up any dominance info that we may have created. * kf.cc (class kf_ubsan_handler): New. (register_sanitizer_builtins): New. (register_known_functions): Call register_sanitizer_builtins. gcc/testsuite/ChangeLog: PR analyzer/114408 * c-c++-common/analyzer/deref-before-check-pr114408.c: New test. * c-c++-common/ubsan/analyzer-ice-pr114408.c: New test. Signed-off-by: David Malcolm <dmalcolm@redhat.com>	2024-03-23 09:52:38 -04:00
John David Anglin	2e4b3374cb	hppa: Fix LO_SUM DLTIND14R address support in PRINT_OPERAND_ADDRESS This bug was hidden since LO_SUM DLTIND14R addresses are normally handled by the A constraint in the move patterns. 2024-03-23 John David Anglin <danglin@gcc.gnu.org> gcc/ChangeLog: * config/pa/pa.cc (pa_output_global_address): Handle UNSPEC_DLTIND14R addresses. * config/pa/pa.h (PRINT_OPERAND_ADDRESS): Output "RT'" for UNSPEC_DLTIND14R address.	2024-03-23 13:47:31 +00:00
Jonathan Wakely	543585046d	libstdc++: Disable std::formatter specializations (LWG 3944) This was just approved in Tokyo as a DR for C++23. It doesn't affect us yet, because we don't implement the __cpp_lib_format_ranges features. We can add the disabled specializations and add a testcase now though. libstdc++-v3/ChangeLog: * include/std/format (formatter): Disable specializations that would allow sequences of narrow characters to be formatted as wchar_t without conversion, as per LWG 3944. * testsuite/std/format/formatter/lwg3944.cc: New test.	2024-03-23 11:07:57 +00:00
Jonathan Wakely	3763fb8970	libstdc++: Add __is_in_place_index_v helper and use it in <variant> We already have __is_in_place_type_v for in_place_type_t so adding an equivalent for in_place_index_t allows us to avoid a class template instantiation for the __not_in_place_tag constraint on the most commonly-used std::variant::variant(T&&) constructor. For in_place_type_t we also have a __is_in_place_type class template defined in terms of the variable template, but that isn't actually used anywhere. I'm not adding an equivalent for the new variable template, because that wouldn't be used either. For GCC 15 we should remove the unused __is_in_place_tag and __is_in_place_type class templates. libstdc++-v3/ChangeLog: * include/bits/utility.h (__is_in_place_index_v): New variable template. * include/std/variant (__not_in_place_tag): Define in terms of variable templates not a class template.	2024-03-23 11:07:57 +00:00
Jonathan Wakely	f4605c53ea	libstdc++: Use std::type_identity_t in <string_view> as per LWG 3950 [PR114400] The difference between __type_identity_t and std::type_identity_t is observable, as demonstrated in the PR. Nobody in LWG seems to think this an example we should really care about, but it seems easy and harmless to change this. libstdc++-v3/ChangeLog: PR libstdc++/114400 * include/std/string_view (operator==): Use std::type_identity_t in C++20 instead of our own __type_identity_t.	2024-03-23 11:07:57 +00:00
Jakub Jelinek	4a46a48ebc	bitint: Fix bitfield loads in handle_cast [PR114433] We ICE on the following testcase, because handle_cast was incorrectly testing !m_first to see whether it should use m_data[m_bitfld_load + 1] or fresh SSA_NAME for a PHI result. Now, m_first is in the routine sometimes temporarily cleared in between doing prepare_data_in_out and the !m_first check and only before returning restored from the save_first copy. Without this patch, we try to use the same SSA_NAME (_12 here) in 2 different PHI results which is obviously invalid IL and ICEs very quickly. 2024-03-23 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/114433 * gimple-lower-bitint.cc (bitint_large_huge::handle_cast): For m_bitfld_load check save_first rather than m_first. * gcc.dg/torture/bitint-68.c: New test.	2024-03-23 11:20:00 +01:00
Jakub Jelinek	f92cf8cbbe	bitint: Handle complex types in build_bitint_stmt_ssa_conflicts [PR114425] The task of the build_bitint_stmt_ssa_conflicts hook for tree-ssa-coalesce.cc next to special casing the multiplication/division/modulo is to ignore statements with large/huge _BitInt lhs which isn't in names bitmap and on the other side pretend all uses of the stmt are used in a later stmt (single user of that SSA_NAME or perhaps single user of lhs of the single user etc.) where the lowering will actually emit the code. Unfortunately the function wasn't handling COMPLEX_TYPE of the large/huge BITINT_TYPE, while the FE doesn't really support such types, they are used under the hood for __builtin_{add,sub,mul}_overflow{,_p}, they are also present or absent from the names bitmap and should be treated the same. Without this patch, the operands of .ADD_OVERFLOW were incorrectly pretended to be used right in that call statement rather than on the cast stmt from IMAGPART_EXPR of .ADD_OVERFLOW return value to some integral type. 2024-03-23 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/114425 * gimple-lower-bitint.cc (build_bitint_stmt_ssa_conflicts): Handle _Complex large/huge _BitInt types like the large/huge _BitInt types. * gcc.dg/torture/bitint-67.c: New test.	2024-03-23 11:19:09 +01:00
Jakub Jelinek	8fc5593df8	predcom: Punt for steps which aren't multiples of access size [PR111683] On the following testcases, there is no overlap between data references within a single iteration, but the data references have size which is twice as large as the step, which means the data references overlap with the next iteration which predcom doesn't take into account. As discussed in the PR, even if the reference size is smaller than step, if step isn't a multiple of the reference size, there could be overlaps with some other iteration later on. The initial version of the patch regressed (test still passed, but predcom didn't optimize anymore) pr71083.c which has a packed char, short structure and was reading/writing the short 2 bytes in there with step 3. The following patch deals with that by retrying for COMPONENT_REFs also the aggregate sizes etc., so that it then compares 3 bytes against step 3. In make check-gcc/check-g++ this patch I believe affects code generation for only the 2 new testcases according to statistics I've gathered. 2024-03-23 Jakub Jelinek <jakub@redhat.com> PR middle-end/111683 * tree-predcom.cc (pcom_worker::suitable_component_p): If has_write and comp_step is RS_NONZERO, return false if any reference in the component doesn't have DR_STEP a multiple of access size. * gcc.dg/pr111683-1.c: New test. * gcc.dg/pr111683-2.c: New test.	2024-03-23 11:17:44 +01:00
Takayuki 'January June' Suwa	7a01cc711f	xtensa: Add supplementary split pattern for "addsubx" int test(int a) { return a * 4 + 30000; } In the example above, since Xtensa has instructions to add register value scaled by 2, 4 or 8 (and corresponding define_insns), we would expect them to be used, but they are not, because the expression is transformed before reaching the RTL generation pass as below: int test(int a) { return (a + 7500) * 4; } Fortunately, the RTL combination pass tries a splitting pattern that matches the first example, so it is easy to solve by defining that pattern. gcc/ChangeLog: * config/xtensa/xtensa.md: Add new split pattern described above.	2024-03-22 18:12:18 -07:00
GCC Administrator	e8985864a3	Daily bump.	2024-03-23 00:17:26 +00:00
Jonathan Wakely	c2e28df90a	libstdc++: Destroy allocators in re-inserted container nodes [PR114401] The allocator objects in container node handles were not being destroyed after the node was re-inserted into a container. They are stored in a union and so need to be explicitly destroyed when the node becomes empty. The containers were zeroing the node handle's pointer, which makes it empty, causing the handle's destructor to think there's nothing to clean up. Add a new member function to the node handle which destroys the allocator and zeros the pointer. Change the containers to call that instead of just changing the pointer manually. We can also remove the _M_empty member of the union which is not necessary. libstdc++-v3/ChangeLog: PR libstdc++/114401 * include/bits/hashtable.h (_Hashtable::_M_reinsert_node): Call release() on node handle instead of just zeroing its pointer. (_Hashtable::_M_reinsert_node_multi): Likewise. (_Hashtable::_M_merge_unique): Likewise. (_Hashtable::_M_merge_multi): Likewise. * include/bits/node_handle.h (_Node_handle_common::release()): New member function. (_Node_handle_common::_Optional_alloc::_M_empty): Remove unnecessary union member. (_Node_handle_common): Declare _Hashtable as a friend. * include/bits/stl_tree.h (_Rb_tree::_M_reinsert_node_unique): Call release() on node handle instead of just zeroing its pointer. (_Rb_tree::_M_reinsert_node_equal): Likewise. (_Rb_tree::_M_reinsert_node_hint_unique): Likewise. (_Rb_tree::_M_reinsert_node_hint_equal): Likewise. * testsuite/23_containers/multiset/modifiers/114401.cc: New test. * testsuite/23_containers/set/modifiers/114401.cc: New test. * testsuite/23_containers/unordered_multiset/modifiers/114401.cc: New test. * testsuite/23_containers/unordered_set/modifiers/114401.cc: New test.	2024-03-22 22:39:06 +00:00
Jonathan Wakely	142cc4c223	libstdc++: Constrain std::vector default constructor [PR113841] This is needed to avoid errors outside the immediate context when evaluating is_default_constructible_v<vector<T, A>> when A is not default constructible. To avoid diagnostic regressions for 23_containers/vector/48101_neg.cc we need to make the std::allocator<cv T> partial specializations default constructible, which they probably should have been anyway. libstdc++-v3/ChangeLog: PR libstdc++/113841 * include/bits/allocator.h (allocator<cv T>): Add default constructor to partial specializations for cv-qualified types. * include/bits/stl_vector.h (_Vector_impl::_Vector_impl()): Constrain so that it's only present if the allocator is default constructible. * include/bits/stl_bvector.h (_Bvector_impl::_Bvector_impl()): Likewise. * testsuite/23_containers/vector/cons/113841.cc: New test.	2024-03-22 22:39:06 +00:00
Jonathan Wakely	8539c5610a	libstdc++: Use feature test macros in <bits/stl_construct.h> The preprocessor checks for __cplusplus in <bits/stl_construct.h> should use the appropriate feature test macros instead of __cplusplus, namely __glibcxx_raw_memory_algorithms and __cpp_constexpr_dynamic_alloc. For the latter, we want to check the compiler macro not the library's __cpp_lib_constexpr_dynamic_alloc, because the latter is not defined for freestanding but std::construct_at needs to be. libstdc++-v3/ChangeLog: * include/bits/stl_construct.h (destroy_at, construct_at): Guard with feature test macros instead of just __cplusplus.	2024-03-22 22:39:05 +00:00
Jonathan Wakely	ff773ac3d9	libstdc++: Reorder feature test macro definitions Put the C++23 generator and tuple_like ones before the C++26 ones. libstdc++-v3/ChangeLog: * include/bits/version.def (generator, tuple_like): Move earlier in the file. * include/bits/version.h: Regenerate.	2024-03-22 22:39:05 +00:00
Jonathan Wakely	31ef58b18d	libstdc++: Replace std::result_of with __invoke_result_t [PR114394] Replace std::result_of with std::invoke_result, as specified in the standard since C++17, to avoid deprecated warnings for std::result_of. We don't have __invoke_result_t in C++11 mode, so add it as an alias template for __invoke_result<>::type (which is what std::result_of uses as its base class, so there's no change in functionality). This fixes warnings given by Clang 18. libstdc++-v3/ChangeLog: PR libstdc++/114394 * include/std/functional (bind): Use __invoke_result_t instead of result_of::type. * include/std/type_traits (__invoke_result_t): New alias template. * testsuite/20_util/bind/ref_neg.cc: Adjust prune pattern.	2024-03-22 22:37:57 +00:00
Harald Anlauf	c083a453db	Fortran: no size check passing NULL() without MOLD argument [PR55978] gcc/fortran/ChangeLog: PR fortran/55978 * interface.cc (gfc_compare_actual_formal): Skip size check for NULL() actual without MOLD argument. gcc/testsuite/ChangeLog: PR fortran/55978 * gfortran.dg/null_actual_5.f90: New test.	2024-03-22 22:00:53 +01:00
Georg-Johann Lay	65b7d1862e	AVR: Adjust message for SIGNAL and INTERRUPT usage gcc/ * config/avr/avr.cc (avr_set_current_function): Adjust diagnostic for deprecated SIGNAL and INTERRUPT usage without respective header.	2024-03-22 19:30:18 +01:00
Kwok Cheung Yeung	637e76b90e	openmp: Change to using a hashtab to lookup offload target addresses for indirect function calls A splay-tree was previously used to lookup equivalent target addresses for a given host address on offload targets. However, as splay-trees can modify their structure on lookup, they are not suitable for concurrent access from separate teams/threads without some form of locking. This patch changes the lookup data structure to a hashtab instead, which does not have these issues. The call to build_indirect_map to initialize the data structure is now called from just the first thread of the first team to avoid redundant calls to this function. 2024-03-22 Kwok Cheung Yeung <kcyeung@baylibre.com> libgomp/ * config/accel/target-indirect.c: Include string.h and hashtab.h. Remove include of splay-tree.h. Update comments. (splay_tree_prefix, splay_tree_c): Delete. (struct indirect_map_t): New. (hash_entry_type, htab_alloc, htab_free, htab_hash, htab_eq): New. (GOMP_INDIRECT_ADD_MAP): Remove volatile qualifier. (USE_SPLAY_TREE_LOOKUP): Rename to... (USE_HASHTAB_LOOKUP): ..this. (indirect_map, indirect_array): Delete. (indirect_htab): New. (build_indirect_map): Remove locking. Build indirect map using hashtab. (GOMP_target_map_indirect_ptr): Use indirect_htab to lookup target address. (GOMP_target_map_indirect_ptr): Remove volatile qualifier. * config/gcn/team.c (gomp_gcn_enter_kernel): Call build_indirect_map from first thread of first team only. * config/nvptx/team.c (gomp_nvptx_main): Likewise. * testsuite/libgomp.c-c++-common/declare-target-indirect-2.c (main): Add missing break statements. * testsuite/libgomp.fortran/declare-target-indirect-2.f90: Remove xfail.	2024-03-22 18:09:40 +00:00
Patrick O'Neill	65107faad7	RISC-V: Require a extension for ztso testcases with atomic insns Use dg_add_options riscv_a to add atomic extension when running compile tests on non-a targets. gcc/testsuite/ChangeLog: * gcc.target/riscv/amo-table-ztso-amo-add-1.c: Add dg_add_options riscv_a * gcc.target/riscv/amo-table-ztso-amo-add-2.c: Ditto. * gcc.target/riscv/amo-table-ztso-amo-add-3.c: Ditto. * gcc.target/riscv/amo-table-ztso-amo-add-4.c: Ditto. * gcc.target/riscv/amo-table-ztso-amo-add-5.c: Ditto. * gcc.target/riscv/amo-table-ztso-compare-exchange-1.c: Ditto. * gcc.target/riscv/amo-table-ztso-compare-exchange-2.c: Ditto. * gcc.target/riscv/amo-table-ztso-compare-exchange-3.c: Ditto. * gcc.target/riscv/amo-table-ztso-compare-exchange-4.c: Ditto. * gcc.target/riscv/amo-table-ztso-compare-exchange-5.c: Ditto. * gcc.target/riscv/amo-table-ztso-compare-exchange-6.c: Ditto. * gcc.target/riscv/amo-table-ztso-compare-exchange-7.c: Ditto. * gcc.target/riscv/amo-table-ztso-subword-amo-add-1.c: Ditto. * gcc.target/riscv/amo-table-ztso-subword-amo-add-2.c: Ditto. * gcc.target/riscv/amo-table-ztso-subword-amo-add-3.c: Ditto. * gcc.target/riscv/amo-table-ztso-subword-amo-add-4.c: Ditto. * gcc.target/riscv/amo-table-ztso-subword-amo-add-5.c: Ditto. Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>	2024-03-22 10:41:39 -07:00
Andrew Stubbs	e194503b6f	amdgcn: Adjust GFX10/GFX11 cache coherency The RDNA devices have different cache architectures to the CDNA devices, and the differences go deeper than just the assembler mnemonics. I believe this patch is correct according to the documentation in the LLVM AMDGPU user guide (the ISA manual is less instructive), but I hadn't observed any real problems before (or after). gcc/ChangeLog: * config/gcn/gcn.md (*memory_barrier): Split into RDNA and !RDNA. (atomic_load<mode>): Adjust RDNA cache settings. (atomic_store<mode>): Likewise. (atomic_exchange<mode>): Likewise.	2024-03-22 15:54:33 +00:00
Andrew Stubbs	6dedafe166	amdgcn: Prefer V32 on RDNA devices We run these devices in wavefrontsize64 for compatibility, but they actually only have 32-lane vectors, natively. If the upper part of a V64 is masked off (as it is in V32) then RDNA devices will skip execution of the upper part for most operations, so this adjustment shouldn't leave too much performance on the table. One exception is memory instructions, so full wavefrontsize32 support would be better. The advantage is that we avoid the missing V64 operations (such as permute and vec_extract). gcc/ChangeLog: * config/gcn/gcn.cc (gcn_vectorize_preferred_simd_mode): Prefer V32 on RDNA devices.	2024-03-22 15:54:33 +00:00
David Malcolm	c6cf578913	analyzer: look through casts in taint sanitization [PR112974,PR112975] PR analyzer/112974 and PR analyzer/112975 record false positives from the analyzer's taint detection where sanitization of the form if (VALUE CMP VALUE-OF-WIDER-TYPE) happens, but wasn't being "noticed" by the taint checker, due to the test being: (WIDER_TYPE)VALUE CMP VALUE-OF-WIDER-TYPE at the gimple level, and thus taint_state_machine recording sanitization of (WIDER_TYPE)VALUE, but not of VALUE. Fix by stripping casts in taint_state_machine::on_condition so that the state machine records sanitization of the underlying value. gcc/analyzer/ChangeLog: PR analyzer/112974 PR analyzer/112975 * sm-taint.cc (taint_state_machine::on_condition): Strip away casts before considering LHS and RHS, to increase the chance of detecting places where sanitization of a value may have happened. gcc/testsuite/ChangeLog: PR analyzer/112974 PR analyzer/112975 * gcc.dg/plugin/plugin.exp (plugin_test_list): Add taint-pr112974.c and taint-pr112975.c to analyzer_kernel_plugin.c. * gcc.dg/plugin/taint-pr112974.c: New test. * gcc.dg/plugin/taint-pr112975.c: New test. Signed-off-by: David Malcolm <dmalcolm@redhat.com>	2024-03-22 10:57:25 -04:00
David Malcolm	d475a4571e	analyzer: add SARIF property bags to taint diagnostics Another followup to r14-6057-g12b67d1e13b3cf to make it easier to debug the analyzer. gcc/analyzer/ChangeLog: * sm-taint.cc: Include "diagnostic-format-sarif.h". (bounds_to_str): New. (taint_diagnostic::maybe_add_sarif_properties): New. (tainted_offset::tainted_offset): Add "offset" param. (tainted_offset::maybe_add_sarif_properties): New. (tainted_offset::m_offset): New. (region_model::check_region_for_taint): Pass offset to tainted_offset ctor. Signed-off-by: David Malcolm <dmalcolm@redhat.com>	2024-03-22 10:57:20 -04:00
Andrew Stubbs	1bf18629c5	amdgcn: Add gfx1103 target Add support for the gfx1103 RDNA3 APU integrated graphics devices. The ROCm documentation warns that these may not be supported, but it seems to work at least partially. gcc/ChangeLog: * config.gcc (amdgcn): Add gfx1103 entries. * config/gcn/gcn-hsa.h (NO_XNACK): Likewise. (gcn_local_sym_hash): Likewise. * config/gcn/gcn-opts.h (enum processor_type): Likewise. (TARGET_GFX1103): New macro. * config/gcn/gcn.cc (gcn_option_override): Handle gfx1103. (gcn_omp_device_kind_arch_isa): Likewise. (output_file_start): Likewise. (gcn_hsa_declare_function_name): Use TARGET_RDNA3, not just gfx1100. * config/gcn/gcn.h (TARGET_CPU_CPP_BUILTINS): Add __gfx1103__. * config/gcn/gcn.opt: Add gfx1103. * config/gcn/mkoffload.cc (EF_AMDGPU_MACH_AMDGCN_GFX1103): New. (main): Handle gfx1103. * config/gcn/t-omp-device: Add gfx1103 isa. * doc/install.texi (amdgcn): Add gfx1103. * doc/invoke.texi (-march): Likewise. libgomp/ChangeLog: * plugin/plugin-gcn.c (EF_AMDGPU_MACH): GFX1103. (gcn_gfx1103_s): New. (isa_hsa_name): Handle gfx1103. (isa_code): Likewise. (max_isa_vgprs): Likewise.	2024-03-22 14:45:15 +00:00
Marek Polacek	d1d8fd2884	c++: direct-init of an array of class type [PR59465] ...from another array in a mem-initializer should not be accepted. We already reject struct string {} a[1]; string x[1](a); but struct pair { string s[1]; pair() : s(a) {} }; is wrongly accepted. It started to be accepted with r0-110915-ga034826198b771: <https://gcc.gnu.org/pipermail/gcc-patches/2011-August/320236.html> which was supposed to be a cleanup, not a deliberate change to start accepting the code. The build_vec_init_expr code was added in r165976: <https://gcc.gnu.org/pipermail/gcc-patches/2010-October/297582.html>. It appears that we do the magic copy array when we have a defaulted constructor and we generate code for its mem-initializer which initializes an array. I also see that we go that path for compound literals. So when initializing an array member, we can limit building up a VEC_INIT_EXPR to those special cases. PR c++/59465 gcc/cp/ChangeLog: * init.cc (can_init_array_with_p): New. (perform_member_init): Check it. gcc/testsuite/ChangeLog: * g++.dg/init/array62.C: New test. * g++.dg/init/array63.C: New test. * g++.dg/init/array64.C: New test.	2024-03-22 10:40:50 -04:00
Andrew Stubbs	e4e02c07d9	vect: more oversized bitmask fixups These patches fix up a failure in testcase vect/tsvc/vect-tsvc-s278.c when configured to use V32 instead of V64 (I plan to do this for RDNA devices). The problem was that a "not" operation on the mask inadvertently enabled inactive lanes 31-63 and corrupted the output. The fix is to adjust the mask when calling internal functions (in this case COND_MINUS), when doing masked loads and stores, and when doing conditional jumps (some cases were already handled). gcc/ChangeLog: * dojump.cc (do_compare_rtx_and_jump): Clear excess bits in vector bitmasks. (do_compare_and_jump): Remove now-redundant similar code. * internal-fn.cc (expand_fn_using_insn): Clear excess bits in vector bitmasks. (add_mask_and_len_args): Likewise.	2024-03-22 14:14:00 +00:00
Thomas Neumann	a364148530	handle unwind tables that are embedded within unwinding code [PR111731] Original bug report: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111731 The unwinding mechanism registers both the code range and the unwind table itself within a b-tree lookup structure. That data structure assumes that it consists of non-overlapping intervals. This becomes a problem if the unwinding table is embedded within the code itself, as now the intervals do overlap. To fix this problem we now keep the unwind tables in a separate b-tree, which prevents the overlap. libgcc/ChangeLog: PR libgcc/111731 * unwind-dw2-fde.c: Split unwind ranges if they contain the unwind table.	2024-03-22 14:56:50 +01:00
Mikael Morin	a44d7e8a52	fortran: Ignore use statements on error [PR107426] This fixes an access to freed memory on the testcase from the PR. The problem comes from an invalid subroutine statement in an interface, which is ignored and causes the following statements forming the procedure body to be rejected. One of them use-associates the intrinsic ISO_C_BINDING module, which imports new symbols in a namespace that is freed at the time the statement is rejected. However, this creates dangling pointers as ISO_C_BINDING is special and its import creates a reference to the imported C_PTR symbol in the return type of the global intrinsic symbol for C_LOC (see the function create_intrinsic_function). This change saves and restores the list of use statements, so that rejected use statements are removed before they have a chance to be applied to the current namespace and create dangling pointers. PR fortran/107426 gcc/fortran/ChangeLog: * gfortran.h (gfc_save_module_list, gfc_restore_old_module_list): New declarations. * module.cc (old_module_list_tail): New global variable. (gfc_save_module_list, gfc_restore_old_module_list): New functions. (gfc_use_modules): Set module_list and old_module_list_tail. * parse.cc (next_statement): Save module_list before doing any work. (reject_statement): Restore module_list to its saved value. gcc/testsuite/ChangeLog: * gfortran.dg/pr89943_3.f90: Update error pattern. * gfortran.dg/pr89943_4.f90: Likewise. * gfortran.dg/use_31.f90: New test.	2024-03-22 13:19:06 +01:00
Mikael Morin	44c0398e65	fortran: Fix specification expression error with dummy procedures [PR111781] This fixes a spurious invalid variable in specification expression error. The error was caused on the testcase from the PR by two different bugs. First, the call to is_parent_of_current_ns was unable to recognize correct host association and returned false. Second, an ad-hoc condition coming next was using a global variable previously improperly restored to false (instead of restoring it to its initial value). The latter happened on the testcase because one dummy argument was a procedure, and checking that argument was causing a check of all its arguments with the (improper) reset of the flag at the end, and that preceded the check of the next argument. For the first bug, the wrong result of is_parent_of_current_ns is fixed by correcting the namespaces that function deals with, both the one passed as argument and the current one tracked in the gfc_current_ns global. Two new functions are introduced to select the right namespace. Regarding the second bug, the problematic condition is removed, together with the formal_arg_flag associated with it. Indeed, that condition was (wrongly) allowing local variables to be used in array bounds of dummy arguments. PR fortran/111781 gcc/fortran/ChangeLog: * symbol.cc (gfc_get_procedure_ns, gfc_get_spec_ns): New functions. * gfortran.h (gfc_get_procedure_ns, gfc_get_spec_ns): Declare them. (gfc_is_formal_arg): Remove. * expr.cc (check_restricted): Remove special case allowing local variable in dummy argument bound expressions. Use gfc_get_spec_ns to get the right namespace. * resolve.cc (gfc_is_formal_arg, formal_arg_flag): Remove. (gfc_resolve_formal_arglist): Set gfc_current_ns. Quit loop and restore gfc_current_ns instead of early returning. (resolve_symbol): Factor common array spec resolution code to... (resolve_symbol_array_spec): ... this new function. Additionally set and restore gfc_current_ns. gcc/testsuite/ChangeLog: * gfortran.dg/spec_expr_8.f90: New test. * gfortran.dg/spec_expr_9.f90: New test.	2024-03-22 13:07:38 +01:00
Mikael Morin	ebace32a26	testsuite: Declare fortran array bound variables This fixes invalid undeclared fortran array bound variables in the testsuite. gcc/testsuite/ChangeLog: * gfortran.dg/graphite/pr107865.f90: Declare array bound variable(s) as dummy argument(s). * gfortran.dg/pr101267.f90: Likewise. * gfortran.dg/pr112404.f90: Likewise. * gfortran.dg/pr78061.f: Likewise. * gfortran.dg/pr79315.f90: Likewise. * gfortran.dg/vect/pr90681.f: Likewise. * gfortran.dg/vect/pr97761.f90: Likewise. * gfortran.dg/vect/pr99746.f90: Likewise.	2024-03-22 13:07:38 +01:00
Pan Li	47de95d801	RISC-V: Introduce gcc attribute riscv_rvv_vector_bits for RVV This patch introduces a new gcc attribute for RVV. The attribute is used to define fixed-length variants of the existing sizeless RVV types. It is valid if and only if -mrvv-vector-bits=zvl is in effect; its only argument should be an integer constant whose value is determined by the LMUL and the vector register bits in zvlb. For example: typedef vint32m2_t fixed_vint32m2_t __attribute__((riscv_rvv_vector_bits(128))); The above type define is valid when -march=rv64gc_zve64d_zvl64b (aka 2(m2) * 64 = 128 for vint32m2_t), and will report an error when -march=rv64gcv_zvl128b, similar to the one below. "error: invalid RVV vector size '128', expected size is '256' based on LMUL of type and '-mrvv-vector-bits=zvl'" Meanwhile, a pre-defined macro __riscv_v_fixed_vlen is introduced to represent the fixed vlen in an RVV vector register. For the vintm_t types the operations below are allowed. * The sizeof. * The global variable(s). * The element of union and struct. * The cast to other equalities. * CMP: >, <, ==, !=, <=, >= * ALU: +, -, *, /, %, &, |, ^, >>, <<, ~, - The CMP will return vintm_t the same as aarch64 sve. For example: typedef vint32m1_t fixed_vint32m1_t __attribute__((riscv_rvv_vector_bits(128))); fixed_vint32m1_t less_than (fixed_vint32m1_t a, fixed_vint32m1_t b) { return a < b; } For the vfloatm_t types the operations below are allowed. * The sizeof. * The global variable(s). * The element of union and struct. * The cast to other equalities. * CMP: >, <, ==, !=, <=, >= * ALU: +, -, *, /, - The CMP will return vfloatm_t the same as aarch64 sve. For example: typedef vfloat32m1_t fixed_vfloat32m1_t __attribute__((riscv_rvv_vector_bits(128))); fixed_vfloat32m1_t less_than (fixed_vfloat32m1_t a, fixed_vfloat32m1_t b) { return a < b; } For the vbool_t types only the operations below are allowed; CMP and ALU are excluded because they are not well defined on vbool_t currently. * The sizeof. * The global variable(s). * The element of union and struct. * The cast to other equalities. The vintxm_t tuple types are not supported in this patch, which is compatible with clang. This patch passed the testsuites below. * The riscv fully regression tests. gcc/ChangeLog: * config/riscv/riscv-c.cc (riscv_cpu_cpp_builtins): Add pre-define macro __riscv_v_fixed_vlen when zvl. * config/riscv/riscv.cc (riscv_handle_rvv_vector_bits_attribute): New static func to take care of the RVV types decorated by the attributes. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-1.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-10.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-11.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-12.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-13.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-14.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-15.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-16.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-17.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-18.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-2.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-3.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-4.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-5.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-6.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-7.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-8.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-9.c: New test. * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits.h: New test. Signed-off-by: Pan Li <pan2.li@intel.com>	2024-03-22 18:38:37 +08:00
Stefan Schulze Frielinghaus	e0a7233e1d	s390: testsuite: Fix backprop-6.c gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/backprop-6.c: On s390 we also have a copysign optab for long double. Thus, scan 3 instead of 2 times for it.	2024-03-22 11:23:24 +01:00
Jakub Jelinek	ca27c3b3a0	testsuite: Fix up depobj-3.c test on i686-linux [PR112724] While I've posted a patch to handle EXCESS_PRECISION_EXPR in C/C++ pretty printing, still we'd need to handle (a + (float)5) and (float)(((long double)a) + (long double)5) and possibly (float)(((double)a) + (double)5) too for s390?, so the following patch just uses -fexcess-precision=fast, so that the expression is always the same. 2024-03-22 Jakub Jelinek <jakub@redhat.com> PR c++/112724 * c-c++-common/gomp/depobj-3.c: Add -fexcess-precision=fast as dg-additional-options.	2024-03-22 10:26:22 +01:00
Andrew Pinski	dbe9062ce0	Another ICE after conflicting types of redeclaration [PR109619] This is another one of these ICE-after-error issues with the gimplifier and a fallout from r12-3278-g823685221de986af. This case happens when we are trying to fold memcpy/memmove. There is already code to try to catch ERROR_MARKs as arguments to the builtins, so we just need to change them to use error_operand_p, which also checks the type of the expression to see if it was an error mark. Bootstrapped and tested on x86_64-linux-gnu with no regressions. gcc/ChangeLog: PR c/109619 * builtins.cc (fold_builtin_1): Use error_operand_p instead of checking against ERROR_MARK. (fold_builtin_2): Likewise. (fold_builtin_3): Likewise. gcc/testsuite/ChangeLog: PR c/109619 * gcc.dg/redecl-26.c: New test. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>	2024-03-22 02:19:43 -07:00
Rainer Orth	644a7033cc	testsuite: vect: Remove dg-final in gcc.dg/vect/bb-slp-32.c [PR96147] gcc.dg/vect/bb-slp-32.c currently XPASSes on 32 and 64-bit Solaris/SPARC: XPASS: gcc.dg/vect/bb-slp-32.c -flto -ffat-lto-objects scan-tree-dump slp2 "vectorization is not profitable" XPASS: gcc.dg/vect/bb-slp-32.c scan-tree-dump slp2 "vectorization is not profitable" Richard suggested to remove the dg-final, so this is what the patch does. Tested on sparc-sun-solaris2.11 and i386-pc-solaris2.11. 2024-03-19 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> gcc/testsuite: PR tree-optimization/96147 * gcc.dg/vect/bb-slp-32.c (dg-final): Remove.	2024-03-22 10:07:05 +01:00
Rainer Orth	3d406af200	testsuite: i386: Skip gcc.target/i386/avx512cd-vpbroadcastmb2q-2.c etc. with Solaris as [PR114150] Two avx512cd tests FAIL to assemble with the Solaris/x86 assembler: FAIL: gcc.target/i386/avx512cd-vpbroadcastmb2q-2.c (test for excess errors) UNRESOLVED: gcc.target/i386/avx512cd-vpbroadcastmb2q-2.c compilation failed to produce executable FAIL: gcc.target/i386/avx512cd-vpbroadcastmw2d-2.c (test for excess errors) UNRESOLVED: gcc.target/i386/avx512cd-vpbroadcastmw2d-2.c compilation failed to produce executable Excess errors: Assembler: avx512cd-vpbroadcastmb2q-2.c "/var/tmp//ccs_9lod.s", line 42 : Invalid instruction argument Near line: " vpbroadcastmb2q %k0, %zmm0" Assembler: avx512cd-vpbroadcastmw2d-2.c "/var/tmp//ccevT6Rd.s", line 35 : Invalid instruction argument Near line: " vpbroadcastmw2d %k0, %zmm0" This seems to be an as bug, but given that this rarely if ever gets any fixes these days, this patch just skips the affected tests. Adjusting check_effective_target_avx512cd instead doesn't seem sensible since it would disable quite a number of working tests. Tested on i386-pc-solaris2.11 (as and gas) and x86_64-pc-linux-gnu. 2024-03-19 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> gcc/testsuite: PR target/114150 * gcc.target/i386/avx512cd-vpbroadcastmb2q-2.c: Skip on Solaris/x86 with as. * gcc.target/i386/avx512cd-vpbroadcastmw2d-2.c: Likewise.	2024-03-22 09:55:03 +01:00
Jakub Jelinek	ddd4a3ca87	ubsan: Don't -fsanitize=null instrument __seg_fs/gs pointers [PR111736] On x86 and avr some address spaces allow 0 pointers (on avr actually even generic as, but libsanitizer isn't ported to it and I'm not convinced we should completely kill -fsanitize=null in that case). The following patch makes sure those aren't diagnosed for -fsanitize=null, though they are still sanitized for -fsanitize=alignment. 2024-03-22 Jakub Jelinek <jakub@redhat.com> PR sanitizer/111736 * ubsan.cc (ubsan_expand_null_ifn, instrument_mem_ref): Avoid SANITIZE_NULL instrumentation for non-generic address spaces for which targetm.addr_space.zero_address_valid (as) is true. * gcc.dg/ubsan/pr111736.c: New test.	2024-03-22 09:24:42 +01:00
Jakub Jelinek	982250b230	bitint: Some bitint store fixes [PR114405] The following patch fixes some bugs in the handling of stores to large/huge _BitInt bitfields. In the first 2 hunks we are processing the most significant limb of the actual type (not necessarily limb in the storage), and so we know it is either partial or full limb, so [1, limb_prec] bits rather than [0, limb_prec - 1] bits as the code actually assumed. So, those 2 spots are fixed by making sure if tprec is a multiple of limb_prec we actually use limb_prec bits rather than 0. Otherwise, it e.g. happily could create and use 0 precision INTEGER_TYPE even when it actually should have processed 64 bits, or for non-zero bo_bit could handle just say 1 bit rather than 64 bits plus 1 bit in the last hunk spot. In the last hunk we are dealing with the extra bits in the last storage limb, and the code was e.g. happily creating 65 bit precision INTEGER_TYPE, even when we really should use 1 bit precision in that case. Also, it used a wrong offset in that case. The large testcase covers all these cases. 2024-03-22 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/114405 * gimple-lower-bitint.cc (bitint_large_huge::lower_mergeable_stmt): Set rprec to limb_prec rather than 0 if tprec is divisible by limb_prec. In the last bf_cur handling, set rprec to (tprec + bo_bit) % limb_prec rather than tprec % limb_prec and use just rprec instead of rprec + bo_bit. For build_bit_field_ref offset, divide (tprec + bo_bit) by limb_prec rather than just tprec. * gcc.dg/torture/bitint-66.c: New test.	2024-03-22 09:23:16 +01:00
Stefan Schulze Frielinghaus	d4ad99b035	s390: testsuite: Fix abs-4.c gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/abs-4.c: On s390 we also have a copysign optab for long double. Thus, scan 3 instead of 2 times for it.	2024-03-22 08:41:39 +01:00
Christoph Müllner	fd5e5dda8d	RISC-V: Don't add fractional LMUL types to V_VLS for XTheadVector The expansion of `memset` (via expand_builtin_memset_args()) uses clear_by_pieces() and store_by_pieces() to avoid calls to the C runtime. To check if a type can be used for that purpose the function by_pieces_mode_supported_p() tests if a `mov` and a `vec_duplicate` INSN can be expanded by the backend. The `vec_duplicate` expansion takes arguments of type `V_VLS`. The `mov` expansions take arguments of type `V`, `VB`, `VT`, `VLS_AVL_IMM`, and `VLS_AVL_REG`. Some of these types (in fact not types but type iterators) include fractional LMUL types. E.g. `V_VLS` includes `V`, which includes `VI`, which includes `RVVMF2QI`. This results in an attempt to use fractional LMUL-types for the `memset` expansion resulting in an ICE for XTheadVector, because that extension cannot handle fractional LMULs. This patch addresses this issue by splitting the definition of the `VI` mode iterator into `VI_NOFRAC` (without fractional LMUL types) and `VI_FRAC` (only fractional LMUL types). Further, it defines `V_VLS` such that `VI_FRAC` types are only included if XTheadVector is not enabled. The effect is demonstrated by a new test case that shows that the by-pieces framework now emits `sb` instructions instead of triggering an ICE. Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu> PR target/114194 gcc/ChangeLog: * config/riscv/vector-iterators.md: Split VI into VI_FRAC and VI_NOFRAC. Only include VI_NOFRAC in V_VLS without TARGET_XTHEADVECTOR. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/xtheadvector/pr114194.c: New test. Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>	2024-03-22 07:52:47 +01:00
Jeff Law	c65046ff2e	[committed] Fix RISC-V missing stack tie As some of you know, Raphael has been working on stack-clash support for the RISC-V port. A little while ago Florian reached out to us with an issue where glibc was failing its smoke test due to referencing an unallocated stack slot. Without diving into the code in detail I (incorrectly) concluded it was a problem with the fallback of using Ada's stack-check paths due to not having stack-clash support. Once enough stack-clash bits were ready I had Raphael review the code generated for Florian's test and we concluded the original case from Florian was just wrong irrespective of stack clash/stack check. While Raphael's stack-clash work will indirectly fix Florian's case, it really should also work without stack-clash. In particular this code was called out by valgrind: > 000000000003cb5e <realpath@@GLIBC_2.27>: > __GI___realpath(): > 3cb5e: 81010113 addi sp,sp,-2032 > 3cb62: 7d313423 sd s3,1992(sp) > 3cb66: 79fd lui s3,0xfffff > 3cb68: 7e813023 sd s0,2016(sp) > 3cb6c: 7c913c23 sd s1,2008(sp) > 3cb70: 7f010413 addi s0,sp,2032 > 3cb74: 35098793 addi a5,s3,848 # fffffffffffff350 <__libc_initial+0xffffffffffe8946a> > 3cb78: 74fd lui s1,0xfffff > 3cb7a: 008789b3 add s3,a5,s0 > 3cb7e: f9048793 addi a5,s1,-112 # ffffffffffffef90 <__libc_initial+0xffffffffffe890aa> > 3cb82: 008784b3 add s1,a5,s0 > 3cb86: 77fd lui a5,0xfffff > 3cb88: 7d413023 sd s4,1984(sp) > 3cb8c: 7b513c23 sd s5,1976(sp) > 3cb90: 7e113423 sd ra,2024(sp) > 3cb94: 7d213823 sd s2,2000(sp) > 3cb98: 7b613823 sd s6,1968(sp) > 3cb9c: 7b713423 sd s7,1960(sp) > 3cba0: 7b813023 sd s8,1952(sp) > 3cba4: 79913c23 sd s9,1944(sp) > 3cba8: 79a13823 sd s10,1936(sp) > 3cbac: 79b13423 sd s11,1928(sp) > 3cbb0: 34878793 addi a5,a5,840 # fffffffffffff348 <__libc_initial+0xffffffffffe89462> > 3cbb4: 40000713 li a4,1024 > 3cbb8: 00132a17 auipc s4,0x132 > 3cbbc: ae0a3a03 ld s4,-1312(s4) # 16e698 <__stack_chk_guard> > 3cbc0: 01098893 addi a7,s3,16 > 3cbc4: 42098693 addi a3,s3,1056 > 3cbc8: b8040a93 addi s5,s0,-1152 > 3cbcc: 97a2 add a5,a5,s0 > 3cbce: 000a3603 ld a2,0(s4) > 3cbd2: f8c43423 sd a2,-120(s0) > 3cbd6: 4601 li a2,0 > 3cbd8: 3d14b023 sd a7,960(s1) > 3cbdc: 3ce4b423 sd a4,968(s1) > 3cbe0: 7cd4b823 sd a3,2000(s1) > 3cbe4: 7ce4bc23 sd a4,2008(s1) > 3cbe8: b7543823 sd s5,-1168(s0) > 3cbec: b6e43c23 sd a4,-1160(s0) > 3cbf0: e38c sd a1,0(a5) > 3cbf2: b0010113 addi sp,sp,-1280 In particular note the store at 0x3cbd8. That's hitting (s1 + 960). If you chase the values around, you'll find it's a bit more than 1k into unallocated stack space. It's also worth noting the final stack adjustment at 0x3cbf2. While I haven't reproduced Florian's code exactly, I was able to get reasonably close and verify my suspicion that everything was fine before sched2 and incorrect after sched2. It was also obvious at that point what had gone wrong -- we were missing a stack tie after the final stack pointer adjustment. This patch adds the missing stack tie. While not technically a regression, I shudder at the thought of chasing one of these issues down again in the wild. Been there, done that. Regression tested on rv64gc. Verified the scheduler no longer mucked up realpath by hand. Pushing to the trunk. gcc/ * config/riscv/riscv.cc (riscv_expand_prologue): Add missing stack tie for scalable and final stack adjustment if needed. Co-authored-by: Raphael Zinsly <rzinsly@ventanamicro.com>	2024-03-21 20:46:45 -06:00
Pan Li	9941f0295a	RISC-V: Bugfix function target attribute pollution This patch depends on the ICE fix below. https://gcc.gnu.org/pipermail/gcc-patches/2024-March/647915.html The function target attribute should be on a per-function basis. For example, we have 4 functions as below: void test_1 () {} void __attribute__((target("arch=+v"))) test_2 () {} void __attribute__((target("arch=+zfh"))) test_3 () {} void test_4 () {} The scope of the target attribute should not extend beyond the function body. That is, test_3 cannot have the 'v' extension, and test_4 cannot have either the 'v' or the 'zfh' extension. Unfortunately, for now test_4 is able to leverage the 'v' and the 'zfh' extension, which is incorrect. This patch would like to fix the sticking attribute by introducing the command-line subset_list. In parse_arch, we always clone from the cmdline_subset_list instead of the current_subset_list. Meanwhile, we correct the print information about arch like below. .option arch, rv64i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_zicsr2p0_zifencei2p0_zbb1p0 The riscv_declare_function_name hook is always after the hook riscv_process_target_attr. Thus, we introduce one hash_map to record the 1:1 mapping from fndel to its subset_list in advance. And later the riscv_declare_function_name is able to get the right information about the arch. The tests below passed for this patch. * The riscv fully regression test. PR target/114352 gcc/ChangeLog: * common/config/riscv/riscv-common.cc (struct riscv_func_target_info): New struct for func decl and target name. (struct riscv_func_target_hasher): New hasher for hash table mapping from the fn_decl to fn_target_name. (riscv_func_decl_hash): New func to compute the hash for fn_decl. (riscv_func_target_hasher::hash): New func to impl hash interface. (riscv_func_target_hasher::equal): New func to impl equal interface. (riscv_cmdline_subset_list): New static var for cmdline subset list. (riscv_func_target_table_lazy_init): New func to lazy init the func target hash table. (riscv_func_target_get): New func to get target name from hash table. (riscv_func_target_put): New func to put target name into hash table. (riscv_func_target_remove_and_destory): New func to remove target info from the hash table and destory it. (riscv_parse_arch_string): Set the static var cmdline_subset_list. * config/riscv/riscv-subset.h (riscv_cmdline_subset_list): New static var for cmdline subset list. (riscv_func_target_get): New func decl. (riscv_func_target_put): Ditto. (riscv_func_target_remove_and_destory): Ditto. * config/riscv/riscv-target-attr.cc (riscv_target_attr_parser::parse_arch): Take cmdline_subset_list instead of current_subset_list when clone. (riscv_process_target_attr): Record the func target info to hash table. (riscv_option_valid_attribute_p): Add new arg tree fndel. * config/riscv/riscv.cc (riscv_declare_function_name): Consume the func target info and print the arch message. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/pr114352-3.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>	2024-03-22 10:39:07 +08:00
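
The per-function scoping described in commit 9941f0295a can be illustrated with a toy model. This is a sketch only, not GCC's implementation: the bitmask representation, the flag names `EXT_V` and `EXT_ZFH`, and the helpers `arch_sticky` and `arch_per_function` are all hypothetical. It contrasts deriving a function's extensions from state left behind by earlier functions with deriving them from the command-line baseline.

```c
/* Hypothetical extension flags; GCC tracks a subset list, not a bitmask. */
enum { EXT_V = 1 << 0, EXT_ZFH = 1 << 1 };

/* Extensions given on the command line (neither V nor Zfh in this toy run). */
static unsigned cmdline_subset = 0;

/* Mutable state that the sticky variant keeps extending. */
static unsigned current_subset = 0;

/* Sticky behaviour: each attribute adds to whatever earlier functions
   left behind, so later functions inherit extensions they never asked for. */
static unsigned arch_sticky(unsigned attr_exts)
{
    current_subset |= attr_exts;
    return current_subset;
}

/* Per-function behaviour: always start from the command-line baseline,
   then add only this function's own attribute extensions. */
static unsigned arch_per_function(unsigned attr_exts)
{
    return cmdline_subset | attr_exts;
}
```

Calling `arch_sticky` for test_1 through test_4 in order (with 0, `EXT_V`, `EXT_ZFH`, 0) leaves test_4 with both flags set, which mirrors the pollution the commit message describes; `arch_per_function(0)` returns only the baseline.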

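The limb arithmetic in commit 982250b230 (bitint store fixes) can be sketched in plain C. This is a simplified illustration under stated assumptions, not the code in gimple-lower-bitint.cc: the function names are hypothetical, and only the two remainder computations named in the ChangeLog entry are modelled.

```c
/* Bits occupied in the most significant limb of a type with precision
   tprec. A partial or full limb holds [1, limb_prec] bits, never 0. */
static unsigned top_limb_bits(unsigned tprec, unsigned limb_prec)
{
    unsigned rprec = tprec % limb_prec;
    return rprec == 0 ? limb_prec : rprec;
}

/* Extra bits in the last storage limb when the bit-field starts
   bo_bit bits into its first limb. */
static unsigned last_storage_limb_bits(unsigned tprec, unsigned bo_bit,
                                       unsigned limb_prec)
{
    return (tprec + bo_bit) % limb_prec;
}
```

With 64-bit limbs, `tprec = 128` and `bo_bit = 1`, the second helper yields 1 bit for the last storage limb, where adding `bo_bit` to a full-limb remainder would have produced the 65-bit precision mentioned in the commit message.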
209583 Commits