mirror/gcc

mirror of git://gcc.gnu.org/git/gcc.git synced 2024-11-28 09:00:01 +08:00

Author	SHA1	Message	Date
Tamar Christina	dbc38dd9e9	middle-end: Pass along SLP node when costing vector loads/stores With the support to SLP only we now pass the VMAT through the SLP node, however the majority of the costing calls inside vectorizable_load and vectorizable_store do not pass the SLP node along. Due to this the backend costing never sees the VMAT for these cases anymore. Additionally the helper around record_stmt_cost when both SLP and stmt_vinfo are passed would only pass the SLP node along. However the SLP node doesn't contain all the info available in the stmt_vinfo and we'd have to go through the SLP_TREE_REPRESENTATIVE anyway. As such I changed the function to just always pass both along. Unlike the VMAT changes, I don't believe there to be a correctness issue here but it would minimize the amount of churn in the backend costing until vectorizer costing as a whole is revisited in GCC 16. These changes re-enable the cost model on AArch64 and also correctly find the VMATs on loads and stores fixing testcases such as sve_iters_low_2.c. gcc/ChangeLog: * tree-vect-data-refs.cc (vect_get_data_access_cost): Pass NULL for SLP node. * tree-vect-stmts.cc (record_stmt_cost): Expose. (vect_get_store_cost, vect_get_load_cost): Extend with SLP node. (vectorizable_store, vectorizable_load): Pass SLP node to all costing. * tree-vectorizer.h (record_stmt_cost): Always pass both SLP node and stmt_vinfo to costing. (vect_get_load_cost, vect_get_store_cost): Extend with SLP node.	2024-11-21 12:49:35 +00:00
Rainer Orth	116b1c5489	Use decl size in Solaris ASM_DECLARE_OBJECT_NAME [PR102296] Solaris has modified versions of ASM_DECLARE_OBJECT_NAME on both i386 and sparc. When commit `ce597aedd7` Author: Ilya Enkovich <ilya.enkovich@intel.com> Date: Thu Aug 7 08:04:55 2014 +0000 elfos.h (ASM_DECLARE_OBJECT_NAME): Use decl size instead of type size. was applied, those were missed. At the same time, the testcase was restricted to Linux though there's nothing Linux-specific in there, so the error remained undetected. This patch fixes the definitions to match elfos.h and enables the test on Solaris, too. Bootstrapped without regressions on i386-pc-solaris2.11 and sparc-sun-solaris2.11. 2024-11-19 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> gcc/testsuite: PR target/102296 * gcc.target/i386/struct-size.c: Enable on *-*-solaris*. gcc: PR target/102296 * config/i386/sol2.h (ASM_DECLARE_OBJECT_NAME): Use decl size instead of type size. * config/sparc/sol2.h (ASM_DECLARE_OBJECT_NAME): Likewise.	2024-11-21 13:41:19 +01:00
Christoph Müllner	1c4d39ada3	forwprop: Try to blend two isomorphic VEC_PERM sequences This extends forwprop by yet another VEC_PERM optimization: It attempts to blend two isomorphic vector sequences by using the redundancy in the lane utilization in these sequences. This redundancy in lane utilization comes from the way specific scalar statements end up vectorized: two VEC_PERMs on top, binary operations on both of them, and a final VEC_PERM to create the result. Here is an example of this sequence: v_in = {e0, e1, e2, e3} v_1 = VEC_PERM <v_in, v_in, {0, 2, 0, 2}> // v_1 = {e0, e2, e0, e2} v_2 = VEC_PERM <v_in, v_in, {1, 3, 1, 3}> // v_2 = {e1, e3, e1, e3} v_x = v_1 + v_2 // v_x = {e0+e1, e2+e3, e0+e1, e2+e3} v_y = v_1 - v_2 // v_y = {e0-e1, e2-e3, e0-e1, e2-e3} v_out = VEC_PERM <v_x, v_y, {0, 1, 6, 7}> // v_out = {e0+e1, e2+e3, e0-e1, e2-e3} To remove the redundancy, lanes 2 and 3 can be freed, which allows changing the last statement into: v_out' = VEC_PERM <v_x, v_y, {0, 1, 4, 5}> // v_out' = {e0+e1, e2+e3, e0-e1, e2-e3} The cost of eliminating the redundancy in the lane utilization is that lowering the VEC_PERM expression could get more expensive because of tighter packing of the lanes. Therefore this optimization is not done alone, but only in case we identify two such sequences that can be blended. Once all candidate sequences have been identified, we try to blend them, so that we can use the freed lanes for the second sequence. On success we convert 2x (2x BINOP + 1x VEC_PERM) to 2x VEC_PERM + 2x BINOP + 2x VEC_PERM traded for 4x VEC_PERM + 2x BINOP. The implemented transformation reuses (rewrites) the statements of the first sequence and the last VEC_PERM of the second sequence. The remaining four statements of the second sequence are left untouched and will be eliminated by DCE later. This targets x264_pixel_satd_8x4, which calculates the sum of absolute transformed differences (SATD) using Hadamard transformation.
We have seen 8% speedup on SPEC's x264 on a 5950X (x86-64) and 7% speedup on an AArch64 machine. Bootstrapped and reg-tested on x86-64 and AArch64 (all languages). gcc/ChangeLog: * tree-ssa-forwprop.cc (struct _vec_perm_simplify_seq): New data structure to store analysis results of a vec perm simplify sequence. (get_vect_selector_index_map): Helper to get an index map from the provided vector permute selector. (recognise_vec_perm_simplify_seq): Helper to recognise a vec perm simplify sequence. (narrow_vec_perm_simplify_seq): Helper to pack the lanes more tight. (can_blend_vec_perm_simplify_seqs_p): Test if two vec perm sequences can be blended. (calc_perm_vec_perm_simplify_seqs): Helper to calculate the new permutation indices. (blend_vec_perm_simplify_seqs): Helper to blend two vec perm simplify sequences. (process_vec_perm_simplify_seq_list): Helper to process a list of vec perm simplify sequences. (append_vec_perm_simplify_seq_list): Helper to add a vec perm simplify sequence to the list. (pass_forwprop::execute): Integrate new functionality. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/satd-hadamard.c: New test. * gcc.dg/tree-ssa/vector-10.c: New test. * gcc.dg/tree-ssa/vector-8.c: New test. * gcc.dg/tree-ssa/vector-9.c: New test. * gcc.target/aarch64/sve/satd-hadamard.c: New test. Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>	2024-11-21 13:38:54 +01:00
H.J. Lu	42a8005c63	apx-ndd-tls-1[ab].c: Add -std=gnu17 Since GCC 15 defaults to -std=gnu23, add -std=gnu17 to apx-ndd-tls-1[ab].c to avoid: gcc.target/i386/apx-ndd-tls-1a.c: In function ‘k’: gcc.target/i386/apx-ndd-tls-1a.c:29:7: error: too many arguments to function ‘l’ gcc.target/i386/apx-ndd-tls-1a.c:25:5: note: declared here * gcc.target/i386/apx-ndd-tls-1a.c: -std=gnu17. * gcc.target/i386/apx-ndd-tls-1b.c: Likewise. Signed-off-by: H.J. Lu <hjl.tools@gmail.com>	2024-11-21 19:12:27 +08:00
Rainer Orth	0f7def8549	libgomp: testsuite: Fix libgomp.c/alloc-pinned-3.c etc. for C23 on non-Linux Since the switch to a C23 default, three libgomp tests FAIL on Solaris: FAIL: libgomp.c/alloc-pinned-3.c (test for excess errors) UNRESOLVED: libgomp.c/alloc-pinned-3.c compilation failed to produce executable FAIL: libgomp.c/alloc-pinned-4.c (test for excess errors) UNRESOLVED: libgomp.c/alloc-pinned-4.c compilation failed to produce executable FAIL: libgomp.c/alloc-pinned-6.c (test for excess errors) UNRESOLVED: libgomp.c/alloc-pinned-6.c compilation failed to produce executable Excess errors: /vol/gcc/src/hg/master/local/libgomp/testsuite/libgomp.c/alloc-pinned-3.c:104:3: error: too many arguments to function 'set_pin_limit' Fixed by adding the missing size argument to the stub functions. Tested on i386-pc-solaris2.11 and sparc-sun-solaris2.11. 2024-11-20 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> libgomp: * testsuite/libgomp.c/alloc-pinned-3.c [!__linux__] (set_pin_limit): Add size arg. * testsuite/libgomp.c/alloc-pinned-4.c [!__linux__] (set_pin_limit): Likewise. * testsuite/libgomp.c/alloc-pinned-6.c [!__linux__] (set_pin_limit): Likewise.	2024-11-21 11:46:36 +01:00
Jakub Jelinek	806563f11e	include: Add new post-DWARF 5 DW_LANG_* enumerators DWARF changed the language code assignment to be on a web page and after DWARF 5 has been published already 27 codes have been assigned. We have some of those already in the header, but most of them were missing, including one added just yesterday (DW_LANG_C23). Note, this is really post-DWARF 5 stuff rather than DWARF 6, because DWARF 6 plans to switch from DW_AT_language to DW_AT_language_{name,version} pair where we'll say DW_LNAME_C with 202311 version instead of this. 2024-11-21 Jakub Jelinek <jakub@redhat.com> * dwarf2.h (enum dwarf_source_language): Add comment where the post DWARF 5 additions start. Refresh list from https://dwarfstd.org/languages.html.	2024-11-21 10:17:03 +01:00
Richard Biener	7e9b0d90d3	tree-optimization/117720 - check alignment for VMAT_STRIDED_SLP While vectorizable_store was already checking the alignment requirement of the stores and falling back to elementwise accesses if not honored, the vectorizable_load path wasn't doing this. After the previous change to disregard alignment checking for VMAT_STRIDED_SLP in get_group_load_store_type this now tripped on power. PR tree-optimization/117720 * tree-vect-stmts.cc (vectorizable_load): For VMAT_STRIDED_SLP verify the chosen load type is OK with regard to alignment.	2024-11-21 10:04:55 +01:00
Jakub Jelinek	ab8d3606bb	c-family, docs: Adjust descriptions/documentation for C23 publication As C23 has been published already https://www.iso.org/standard/82075.html we don't need to say that it is expected to be published etc. Furthermore, standards.texi was still documenting that -std=gnu17 is the default. 2024-11-21 Jakub Jelinek <jakub@redhat.com> gcc/ * doc/invoke.texi (-std=c23): Adjust documentation for publication of the ISO/IEC 9899:2024 standard. * doc/standards.texi: Likewise. Document -std=gnu17 and -std=gnu23 options. Mention that -std=gnu23 rather than -std=gnu17 is now the default for C. gcc/c-family/ * c.opt (std=c23, std=gnu23, std=iso9899:2024): Adjust description for publication of the ISO/IEC 9899:2024 standard.	2024-11-21 09:40:37 +01:00
Jakub Jelinek	05ab9447fe	phiopt: Improve spaceship_replacement for HONOR_NANS [PR117612] The following patch optimizes spaceship followed by comparisons of the spaceship value even for floating point spaceship when NaNs can appear. operator<=> for this emits roughly signed char c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; and I believe the /* The optimization may be unsafe due to NaNs. */ comment just isn't true. Sure, the i == j comparison doesn't raise exceptions on qNaNs, but if one of the operands is qNaN, then i == j is false and i < j or i > j is then executed and raises exceptions even on qNaNs. And we can safely optimize say c == -1 comparison after the above into i < j, that also raises exceptions like before and handles NaNs the same way as the original. The only unsafe transformation would be c == 0 or c != 0, turning it into i == j or i != j wouldn't raise exception, so I'm not doing that optimization (but other parts of the compiler optimize the i < j comparison away anyway). Anyway, to match the HONOR_NANS case, we need to verify that the second comparison has true edge to the phi_bb (yielding there -1 or 1), it can't be the false edge because when NaNs are honored, the false edge is for both the case where the inverted comparison is true or when one of the operands is NaN. Similarly we need to ensure that the two non-equality comparisons are the opposite, while for -ffast-math we can in some cases get one comparison x >= 5.0 and the other x > 5.0 and it is fine, because NaN is UB, when NaNs are honored, they must be different to leave the unordered case with 2 value as the last one remaining. The patch also punts if HONOR_NANS and the phi has just 3 arguments instead of 4. When NaNs are honored, we also in some cases need to perform some comparison and then invert its result (so that exceptions are properly thrown and we get the correct result).
2024-11-21 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/94589 PR tree-optimization/117612 * tree-ssa-phiopt.cc (spaceship_replacement): Handle HONOR_NANS (TREE_TYPE (lhs1)) case when possible. * gcc.dg/pr94589-5.c: New test. * gcc.dg/pr94589-6.c: New test. * g++.dg/opt/pr94589-5.C: New test. * g++.dg/opt/pr94589-6.C: New test.	2024-11-21 09:39:06 +01:00
Jakub Jelinek	ca7430f145	phiopt: Fix a pasto in spaceship_replacement [PR117612] When working on the PR117612 fix, I've noticed a pasto in tree-ssa-phiopt.cc (spaceship_replacement). The code is if (absu_hwi (tree_to_shwi (arg2)) != 1) return false; if (e1->flags & EDGE_TRUE_VALUE) { if (tree_to_shwi (arg0) != 2 || absu_hwi (tree_to_shwi (arg1)) != 1 || wi::to_widest (arg1) == wi::to_widest (arg2)) return false; } else if (tree_to_shwi (arg1) != 2 || absu_hwi (tree_to_shwi (arg0)) != 1 || wi::to_widest (arg0) == wi::to_widest (arg1)) return false; where arg{0,1,2,3} are PHI args and wants to ensure that if e1 is a true edge, then arg0 is 2 and one of arg{1,2} is -1 and one is 1, otherwise arg1 is 2 and one of arg{0,2} is -1 and one is 1. But due to a pasto, the latter case doesn't verify that arg0 is different from arg2, it could be both -1 or both 1 and we wouldn't punt. The wi::to_widest (arg0) == wi::to_widest (arg1) test is always false when we've made sure in the earlier conditions that arg1 is 2 and arg0 is -1 or 1, so never 2. 2024-11-21 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/94589 PR tree-optimization/117612 * tree-ssa-phiopt.cc (spaceship_replacement): Fix up a pasto in check when arg1 is 2.	2024-11-21 09:38:01 +01:00
Jakub Jelinek	7272e09c9b	c: Add u{,l,ll,imax}abs builtins [PR117024] The following patch adds u{,l,ll,imax}abs builtins, which just fold to ABSU_EXPR, similarly to how {,l,ll,imax}abs builtins fold to ABS_EXPR. 2024-11-21 Jakub Jelinek <jakub@redhat.com> PR c/117024 gcc/ * coretypes.h (enum function_class): Add function_c2y_misc enumerator. * builtin-types.def (BT_FN_UINTMAX_INTMAX, BT_FN_ULONG_LONG, BT_FN_ULONGLONG_LONGLONG): New DEF_FUNCTION_TYPE_1s. * builtins.def (DEF_C2Y_BUILTIN): Define. (BUILT_IN_UABS, BUILT_IN_UIMAXABS, BUILT_IN_ULABS, BUILT_IN_ULLABS): New builtins. * builtins.cc (fold_builtin_abs): Handle also folding of uabs to ABSU_EXPR. (fold_builtin_1): Handle BUILT_IN_U{,L,LL,IMAX}ABS. gcc/lto/ChangeLog: lto-lang.cc (flag_isoc2y): New variable. gcc/ada/ChangeLog: * gcc-interface/utils.cc (flag_isoc2y): New variable. gcc/testsuite/ * gcc.c-torture/execute/builtins/lib/abs.c (uintmax_t): New typedef. (uabs, ulabs, ullabs, uimaxabs): New functions. * gcc.c-torture/execute/builtins/uabs-1.c: New test. * gcc.c-torture/execute/builtins/uabs-1.x: New file. * gcc.c-torture/execute/builtins/uabs-1-lib.c: New file. * gcc.c-torture/execute/builtins/uabs-2.c: New test. * gcc.c-torture/execute/builtins/uabs-2.x: New file. * gcc.c-torture/execute/builtins/uabs-2-lib.c: New file. * gcc.c-torture/execute/builtins/uabs-3.c: New test. * gcc.c-torture/execute/builtins/uabs-3.x: New test. * gcc.c-torture/execute/builtins/uabs-3-lib.c: New test.	2024-11-21 09:34:28 +01:00
Kewen Lin	10e702789e	rs6000: Adjust FLOAT128 signbit2 expander for P8 LE [PR114567] As the associated test case shows, signbit generated assembly is sub-optimal for _Float128 argument from memory on P8 LE. On P8 LE, p8swap pass puts an explicit AND -16 on the memory, which causes mode_dependent_address_p to consider it invalid to change its mode, so combine fails to make use of the existing pattern signbit<SIGNBIT:mode>2_dm_mem. Considering it's always more efficient to make use of 8 bytes load and shift on P8 LE, this patch is to adjust the current expander and treat it specially. PR target/114567 gcc/ChangeLog: * config/rs6000/rs6000.md (expander signbit<FLOAT128:mode>2): Adjust. (signbit<mode>2_dm_mem): Rename to ... (signbit<mode>2_dm_mem): ... this. gcc/testsuite/ChangeLog: * gcc.target/powerpc/pr114567.c: New test.	2024-11-21 07:41:34 +00:00
Kewen Lin	baf536754f	rs6000: Use standard name {add,sub}v1ti3 for altivec_v{add,sub}uqm This patch is to adjust define_insn altivec_v{add,sub}uqm with standard names, as the associated test case shows, w/o this patch, it ends up with scalar {add,subf}c/{add,subf}e, the standard names help to exploit v{add,sub}uqm. gcc/ChangeLog: * config/rs6000/altivec.md (altivec_vadduqm): Rename to ... (addv1ti3): ... this. (altivec_vsubuqm): Rename to ... (subv1ti3): ... this. * config/rs6000/rs6000-builtins.def (__builtin_altivec_vadduqm): Replace bif expander altivec_vadduqm with addv1ti3. (__builtin_altivec_vsubuqm): Replace bif expander altivec_vsubuqm with subv1ti3. gcc/testsuite/ChangeLog: * gcc.target/powerpc/p8vector-int128-3.c: New test.	2024-11-21 07:41:33 +00:00
Kewen Lin	ca96c1d1bc	rs6000: Remove entry for V1TImode from VI_unit When making a patch to adjust VECTOR_P8_VECTOR rs6000_vector enum, I noticed that V1TImode's mode attribute in VI_unit VECTOR_UNIT_ALTIVEC_P (V1TImode) is never true, since VECTOR_UNIT_ALTIVEC_P checks if vector_unit[V1TImode] is equal to VECTOR_ALTIVEC, but vector_unit[V1TImode] can only be VECTOR_NONE or VECTOR_P8_VECTOR, there is no chance to be VECTOR_ALTIVEC: rs6000_vector_unit[V1TImode] = (TARGET_P8_VECTOR) ? VECTOR_P8_VECTOR : VECTOR_NONE; By checking all uses of VI_unit, the used mode iterator is one of VI2, VI, VP_small and VP, none of them has V1TImode, so the entry for V1TImode is useless. I guessed it was designed to have one mode attribute to cover all integer vector modes, but later we separated V1TI handlings to its own patterns (those guarded with TARGET_VADDUQM). Anyway, this patch is to remove this useless and confusing entry. gcc/ChangeLog: * config/rs6000/altivec.md (mode attr for V1TI in VI_unit): Remove.	2024-11-21 07:41:33 +00:00
Kewen Lin	2441dc2495	rs6000: Add veqv support to eqv<mode>3_internal1 When making patch to replace TARGET_P8_VECTOR, I noticed for eqv<BOOL_128:mode>3_internal1 unlike the other logical operations, we only exploited the vsx version. I think it is an oversight, this patch is to consider veqv as well. gcc/ChangeLog: * config/rs6000/rs6000.md (*eqv<BOOL_128:mode>3_internal1): Generate insn veqv if TARGET_ALTIVEC and operands are altivec_register_operand.	2024-11-21 07:41:33 +00:00
Kewen Lin	0719ade048	rs6000: Remove ISA_3_0_MASKS_IEEE and check P9_VECTOR instead When working to get rid of mask bit OPTION_MASK_P8_VECTOR, I noticed that the check on ISA_3_0_MASKS_IEEE is actually to check TARGET_P9_VECTOR, since we check all three mask bits together and p9 vector guarantees p8 vector and vsx should be enabled. So this patch is to adjust this first as preparatory patch for the following patch to change all uses of OPTION_MASK_P8_VECTOR and TARGET_P8_VECTOR. gcc/ChangeLog: * config/rs6000/rs6000-cpus.def (ISA_3_0_MASKS_IEEE): Remove. * config/rs6000/rs6000.cc (rs6000_option_override_internal): Replace ISA_3_0_MASKS_IEEE check with TARGET_P9_VECTOR.	2024-11-21 07:41:33 +00:00
Kewen Lin	33386d1421	rs6000: Simplify some conditions or code related to TARGET_DIRECT_MOVE When I was making a patch to rework TARGET_P8_VECTOR, I noticed that there are some redundant checks and dead code related to TARGET_DIRECT_MOVE, so I made this patch as one separated preparatory patch, it consists of: - Check either TARGET_DIRECT_MOVE or TARGET_P8_VECTOR only according to the context, rather than checking both of them since they are actually the same (TARGET_DIRECT_MOVE is defined as TARGET_P8_VECTOR). - Simplify TARGET_VSX && TARGET_DIRECT_MOVE as TARGET_DIRECT_MOVE since direct move ensures VSX enabled. - Replace some TARGET_POWERPC64 && TARGET_DIRECT_MOVE as TARGET_DIRECT_MOVE_64BIT to simplify it. - Remove some dead code guarded with TARGET_DIRECT_MOVE but the condition never holds here. gcc/ChangeLog: * config/rs6000/rs6000.cc (rs6000_option_override_internal): Simplify TARGET_P8_VECTOR && TARGET_DIRECT_MOVE as TARGET_P8_VECTOR. (rs6000_output_move_128bit): Simplify TARGET_VSX && TARGET_DIRECT_MOVE as TARGET_DIRECT_MOVE. * config/rs6000/rs6000.h (TARGET_XSCVDPSPN): Simplify conditions TARGET_DIRECT_MOVE || TARGET_P8_VECTOR as TARGET_P8_VECTOR. (TARGET_XSCVSPDPN): Likewise. (TARGET_DIRECT_MOVE_128): Simplify TARGET_DIRECT_MOVE && TARGET_POWERPC64 as TARGET_DIRECT_MOVE_64BIT. (TARGET_VEXTRACTUB): Likewise. (TARGET_DIRECT_MOVE_64BIT): Simplify TARGET_P8_VECTOR && TARGET_DIRECT_MOVE as TARGET_DIRECT_MOVE.
* config/rs6000/rs6000.md (signbit<mode>2, @signbit<mode>2_dm, signbit<mode>2_dm_mem, floatsi<mode>2_lfiwax, floatsi<SFDF:mode>2_lfiwax_<QHI:mode>_mem_zext, floatunssi<mode>2_lfiwzx, float<QHI:mode><SFDF:mode>2, float<QHI:mode><SFDF:mode>2_internal, floatuns<QHI:mode><SFDF:mode>2, floatuns<QHI:mode><SFDF:mode>2_internal, p8_mtvsrd_v16qidi2, p8_mtvsrd_df, p8_xxpermdi_<mode>, reload_vsx_from_gpr<mode>, p8_mtvsrd_sf, reload_vsx_from_gprsf, p8_mfvsrd_3_<mode>, reload_gpr_from_vsx<mode>, reload_gpr_from_vsxsf, unpack<mode>_dm): Simplify TARGET_DIRECT_MOVE && TARGET_POWERPC64 as TARGET_DIRECT_MOVE_64BIT. (unpack<mode>_nodm): Simplify !TARGET_DIRECT_MOVE || !TARGET_POWERPC64 as !TARGET_DIRECT_MOVE_64BIT. (fix_trunc<mode>si2, fix_trunc<mode>si2_stfiwx, fix_trunc<mode>si2_internal): Simplify TARGET_P8_VECTOR && TARGET_DIRECT_MOVE as TARGET_DIRECT_MOVE. (fix_trunc<mode>si2_stfiwx, fixuns_trunc<mode>si2_stfiwx): Remove some dead code as the guard TARGET_DIRECT_MOVE there never holds. (fixuns_trunc<mode>si2_stfiwx): Change TARGET_P8_VECTOR with TARGET_DIRECT_MOVE which is a better fit. * config/rs6000/vsx.md (define_peephole2 for SFmode in GPR): Simplify TARGET_DIRECT_MOVE && TARGET_POWERPC64 as TARGET_DIRECT_MOVE_64BIT.	2024-11-21 07:41:33 +00:00
Torbjörn SVENSSON	e7e6608387	testsuite: arm: Use -march=unset for pr69175.C test Update test cases to use -mcpu=unset/-march=unset feature introduced in r15-3606-g7d6c6a0d15c. gcc/testsuite/ChangeLog: * g++.dg/opt/pr69175.C: Added option "-mcpu=unset". Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>	2024-11-21 07:51:49 +01:00
Torbjörn SVENSSON	49d3da0518	testsuite: arm: Use -march=unset for cortex-m55* tests Update test cases to use -mcpu=unset/-march=unset feature introduced in r15-3606-g7d6c6a0d15c. gcc/testsuite/ChangeLog: * gcc.target/arm/cortex-m55-nodsp-flag-hard.c: Added option "-march=unset". * gcc.target/arm/cortex-m55-nodsp-flag-softfp.c: Likewise. * gcc.target/arm/cortex-m55-nodsp-nofp-flag-softfp.c: Likewise. * gcc.target/arm/cortex-m55-nofp-flag-hard.c: Likewise. * gcc.target/arm/cortex-m55-nofp-flag-softfp.c: Likewise. * gcc.target/arm/cortex-m55-nofp-nomve-flag-softfp.c: Likewise. * gcc.target/arm/cortex-m55-nomve-flag-hard.c: Likewise. * gcc.target/arm/cortex-m55-nomve-flag-softfp.c: Likewise. * gcc.target/arm/cortex-m55-nomve.fp-flag-hard.c: Likewise. * gcc.target/arm/cortex-m55-nomve.fp-flag-softfp.c: Likewise. Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>	2024-11-21 07:51:49 +01:00
Torbjörn SVENSSON	3b21edeef9	testsuite: arm: Use effective target for pr57735.C test Update test cases to use -mcpu=unset/-march=unset feature introduced in r15-3606-g7d6c6a0d15c. gcc/testsuite/ChangeLog: * g++.dg/ext/pr57735.C: Use effective-target arm_cpu_xscale_arm. Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>	2024-11-21 07:51:49 +01:00
Torbjörn SVENSSON	115ae676fc	testsuite: arm: Use effective-target for nomve_fp_1.c test Update test cases to use -mcpu=unset/-march=unset feature introduced in r15-3606-g7d6c6a0d15c. gcc/testsuite/ChangeLog: * g++.target/arm/mve/general-c++/nomve_fp_1.c: Added option "-mcpu=unset". Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>	2024-11-21 07:51:49 +01:00
Torbjörn SVENSSON	ec5adef9be	testsuite: arm: Use effective-target for vect-early-break-cbranch test Update test cases to use -mcpu=unset/-march=unset feature introduced in r15-3606-g7d6c6a0d15c. gcc/testsuite/ChangeLog: * gcc.target/arm/vect-early-break-cbranch.c: Use effective-target arm_arch_v8a_hard. Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>	2024-11-21 07:51:48 +01:00
Torbjörn SVENSSON	3192c1df36	testsuite: arm: Use effective-target for {gcc,g++}.target/arm/ tests Update test cases to use -mcpu=unset/-march=unset feature introduced in r15-3606-g7d6c6a0d15c. gcc/testsuite/ChangeLog: * g++.target/arm/pr103676.C: Use effective-target arm_cpu_cortex_m7. * gcc.target/arm/no-volatile-in-it.c: Likewise. * gcc.target/arm/fma-sp.c: Use effective-target arm_cpu_cortex_m4_hard. * gcc.target/arm/pr53859.c: Use effective-target arm_cpu_cortex_m4. * gcc.target/arm/mve/intrinsics/pr97327.c: Use effective-target arm_cpu_cortex_m55. * gcc.target/arm/pr65067.c: Use effective-target arm_cpu_cortex_m3. * lib/target-supports.exp: Define effective-target arm_cpu_cortex_m3, arm_cpu_cortex_m4, arm_cpu_cortex_m4_hard, arm_cpu_cortex_m7 and arm_cpu_cortex_m55. Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>	2024-11-21 07:51:48 +01:00
Torbjörn SVENSSON	b12bc0bd59	testsuite: arm: Use effective-target for thumb2-slow-flash-data* tests Update test cases to use -mcpu=unset/-march=unset feature introduced in r15-3606-g7d6c6a0d15c. gcc/testsuite/ChangeLog: * gcc.target/arm/thumb2-slow-flash-data-2.c: Use effective-target arm_arch_v7em_hard. * gcc.target/arm/thumb2-slow-flash-data-3.c: Likewise. * gcc.target/arm/thumb2-slow-flash-data-4.c: Likewise. * gcc.target/arm/thumb2-slow-flash-data-5.c: Likewise. Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>	2024-11-21 07:51:48 +01:00
Torbjörn SVENSSON	f55cc57c6e	testsuite: arm: Use effective-target for small-multiply-m* tests Update test cases to use -mcpu=unset/-march=unset feature introduced in r15-3606-g7d6c6a0d15c. gcc/testsuite/ChangeLog: * gcc.target/arm/small-multiply-m0-1.c: Use effective-target arm_arch_v6m and added option "-march=unset". * gcc.target/arm/small-multiply-m0-2.c: Likewise. * gcc.target/arm/small-multiply-m0-3.c: Likewise. * gcc.target/arm/small-multiply-m0plus-1.c: Likewise. * gcc.target/arm/small-multiply-m0plus-2.c: Likewise. * gcc.target/arm/small-multiply-m0plus-3.c: Likewise. * gcc.target/arm/small-multiply-m1-1.c: Likewise. * gcc.target/arm/small-multiply-m1-2.c: Likewise. * gcc.target/arm/small-multiply-m1-3.c: Likewise. * lib/target-supports.exp: Define effective-target arm_cpu_cortex_m0_small, arm_cpu_cortex_m0plus_small and arm_cpu_cortex_m1_small. Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>	2024-11-21 07:51:42 +01:00
Torbjörn SVENSSON	703839b8bd	testsuite: arm: Use effective-target for pure-code/* tests Update test cases to use -mcpu=unset/-march=unset feature introduced in r15-3606-g7d6c6a0d15c. gcc/testsuite/ChangeLog: * gcc.target/arm/pure-code/no-literal-pool-m0.c: Use effective-target arm_cpu_cortex-m0. * gcc.target/arm/pure-code/no-literal-pool-m23.c: Use effective-target arm_cpu_cortex-m23. * gcc.target/arm/pure-code/pr94538-1.c: Likewise. * gcc.target/arm/pure-code/pr109800.c: Use effective-target arm_arch_v7em_hard. * lib/target-supports.exp: Define effective-target arm_cpu_cortex_m0, arm_cpu_cortex_m23 and arm_arch_v7em_hard. Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>	2024-11-21 07:50:30 +01:00
Torbjörn SVENSSON	0380051bba	testsuite: arm: Use effective-target for crc_hf_1.c test Update test cases to use -mcpu=unset/-march=unset feature introduced in r15-3606-g7d6c6a0d15c. gcc/testsuite/ChangeLog: * gcc.target/arm/acle/crc_hf_1.c: Use effective-target arm_arch_v8a_crc_hard. * lib/target-supports.exp: Define effective-target arm_arch_v8a_crc_hard. Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>	2024-11-21 07:49:11 +01:00
Torbjörn SVENSSON	dc044641a0	testsuite: arm: Use effective-target for pacbti-m-predef* tests Update test cases to use -mcpu=unset/-march=unset feature introduced in r15-3606-g7d6c6a0d15c. gcc/testsuite/ChangeLog: * gcc.target/arm/acle/pacbti-m-predef-1.c: Use effective-target arm_arch_v8_1m_main. * gcc.target/arm/acle/pacbti-m-predef-2.c: Likewise. * gcc.target/arm/acle/pacbti-m-predef-3.c: Likewise. * gcc.target/arm/acle/pacbti-m-predef-4.c: Likewise. * gcc.target/arm/acle/pacbti-m-predef-5.c: Likewise. * gcc.target/arm/acle/pacbti-m-predef-6.c: Likewise. * gcc.target/arm/acle/pacbti-m-predef-8.c: Likewise. * gcc.target/arm/acle/pacbti-m-predef-9.c: Likewise. * gcc.target/arm/acle/pacbti-m-predef-10.c: Likewise. Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>	2024-11-21 07:48:39 +01:00
Torbjörn SVENSSON	3ae9d01eb4	testsuite: arm: Use effective-target for bti* and pac* tests Update test cases to use -mcpu=unset/-march=unset feature introduced in r15-3606-g7d6c6a0d15c. gcc/testsuite/ChangeLog: * gcc.target/arm/pac-1.c: Use effective-target arm_arch_v8_1m_main_pacbti. * gcc.target/arm/pac-2.c: Likewise. * gcc.target/arm/pac-3.c: Likewise. * gcc.target/arm/pac-4.c: Likewise. * gcc.target/arm/pac-5.c: Likewise. * gcc.target/arm/pac-7.c: Likewise. * gcc.target/arm/pac-8.c: Likewise. * gcc.target/arm/pac-9.c: Likewise. * gcc.target/arm/pac-10.c: Likewise. * gcc.target/arm/pac-11.c: Likewise. * gcc.target/arm/pac-12.c: Added option "-mcpu=unset". * gcc.target/arm/pac-13.c: Likewise. * gcc.target/arm/pac-14.c: Likewise. * lib/target-supports.exp (check_effective_target_arm_pacbti_hw): Likewise. * gcc.target/arm/pac-6.c: Use effective-target arm_arch_v8_1m_main. * gcc.target/arm/pac-15.c: Use effective-target arm_arch_v8_1m_main_pacbti and added option "-mcpu=unset". Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com> Co-authored-by: Yvan ROUX <yvan.roux@foss.st.com>	2024-11-21 07:45:50 +01:00
GCC Administrator	cf261dd522	Daily bump.	2024-11-21 00:20:27 +00:00
Lewis Hyatt	81c29232b6	tree-cfg: Fix call to next_discriminator_for_locus() While testing future 64-bit location_t support, I ran into an -fcompare-debug issue that was traced back here. Despite the name, next_discriminator_for_locus() is meant to take an integer line number argument, not a location_t. There is one call site which has been passing a location_t instead. For the most part that is harmless, although in case there are two CALL stmts on the same line with different location_t, it may fail to generate a unique discriminator where it should. If/when location_t changes to be 64-bit, however, it will produce an -fcompare-debug failure. Fix it by passing the line number rather than the location_t. I am not aware of a testcase that demonstrates any observable wrong behavior, but the file debug/pr53466.C is an example where the discriminator assignment is indeed different before and after this change. gcc/ChangeLog: * tree-cfg.cc (assign_discriminators): Fix incorrect value passed to next_discriminator_for_locus().	2024-11-20 18:08:57 -05:00
Gaius Mulley	26f3efccaa	PR modula2/117703: libgm2 soname bumps for GCC 15 Bump libgm2 version ready for the gcc-15 release. libgm2/ChangeLog: PR modula2/117703 * configure: Regenerate. * configure.ac (libtool_VERSION): Bump to 20:0:0. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>	2024-11-20 22:17:30 +00:00
Harald Anlauf	3c130e410a	Fortran: fix checking of protected variables in submodules [PR83135] When a symbol was use-associated in the ancestor of a submodule, a PROTECTED attribute was ignored in the submodule or its descendants. Find the real ancestor of symbols when used in a variable definition context in a submodule. PR fortran/83135 gcc/fortran/ChangeLog: * expr.cc (sym_is_from_ancestor): New helper function. (gfc_check_vardef_context): Refine checking of PROTECTED attribute of symbols that are indirectly use-associated in a submodule. gcc/testsuite/ChangeLog: * gfortran.dg/protected_10.f90: New test.	2024-11-20 23:07:47 +01:00
Joseph Myers	d5cebf7e44	c: Diagnose compound literal for empty array [PR114266] As reported in bug 114266, GCC fails to pedwarn for a compound literal, whose type is an array of unknown size, initialized with an empty initializer. This case is disallowed by C23 (which doesn't have zero-size objects); the case of a named object is diagnosed as expected, but not that for compound literals. (Before C23, the pedwarn for empty initializers sufficed.) Add a check for this specific case with a pedwarn. Bootstrapped with no regressions for x86_64-pc-linux-gnu. PR c/114266 gcc/c/ * c-decl.cc (build_compound_literal): Diagnose array of unknown size with empty initializer for C23. gcc/testsuite/ * gcc.dg/c23-empty-init-4.c: New test.	2024-11-20 21:29:48 +00:00
Antoni Boucher	cf544af03a	libgccjit: Add support for setting the comment ident gcc/jit/ChangeLog: * docs/topics/compatibility.rst (LIBGCCJIT_ABI_34): New ABI tag. * docs/topics/contexts.rst: Document gcc_jit_context_set_output_ident. * jit-playback.cc (set_output_ident): New method. * jit-playback.h (set_output_ident): New method. * jit-recording.cc (recording::context::set_output_ident, recording::output_ident::output_ident, recording::output_ident::~output_ident, recording::output_ident::replay_into, recording::output_ident::make_debug_string, recording::output_ident::write_reproducer): New methods. * jit-recording.h (class output_ident): New class. * libgccjit.cc (gcc_jit_context_set_output_ident): New function. * libgccjit.h (gcc_jit_context_set_output_ident): New function. * libgccjit.map: New function. gcc/testsuite/ChangeLog: * jit.dg/all-non-failing-tests.h: New test. * jit.dg/test-output-ident.c: New test.	2024-11-20 16:10:59 -05:00
Antoni Boucher	d8cf8917ed	libgccjit: Add support for creating temporary variables gcc/jit/ChangeLog: * docs/topics/compatibility.rst (LIBGCCJIT_ABI_33): New ABI tag. * docs/topics/functions.rst: Document gcc_jit_function_new_temp. * jit-playback.cc (new_local): Add support for temporary variables. * jit-recording.cc (recording::function::new_temp): New method. (recording::local::write_reproducer): Support temporary variables. * jit-recording.h (new_temp): New method. * libgccjit.cc (gcc_jit_function_new_temp): New function. * libgccjit.h (gcc_jit_function_new_temp): New function. * libgccjit.map: New function. gcc/testsuite/ChangeLog: * jit.dg/all-non-failing-tests.h: Mention test-temp.c. * jit.dg/test-temp.c: New test.	2024-01-18 17:54:59 -05:00
Vladimir N. Makarov	56fc6a6d9e	[PR116587][LRA]: Fix last chance reload pseudo allocation On i686 PR116587 test compilation resulted in LRA failure to find registers for a reload insn pseudo. The insn requires 6 regs for 4 reload insn pseudos where two of them require 2 regs each. But we have only 5 free regs as sp is a fixed reg, bp is fixed because of -fno-omit-frame-pointer, bx is assigned to pic_offset_table_pseudo because of -fPIC. LRA spills pic_offset_table_pseudo as the last chance approach to allocate registers to the reload pseudo. Although it makes 2 free registers for the unallocated reload pseudo requiring also 2 regs, the pseudo still cannot be allocated as the 2 free regs are disjoint. The patch spills all pseudos conflicting with the unallocated reload pseudo including already allocated reload insn pseudos, then standard LRA code allocates spilled pseudos requiring more than one register first and avoids the situation of disjoint regs for reload pseudos requiring more than one reg. gcc/ChangeLog: PR target/116587 * lra-assigns.cc (find_all_spills_for): Consider all pseudos whose classes intersect given pseudo class. gcc/testsuite/ChangeLog: PR target/116587 * gcc.target/i386/pr116587.c: New test.	2024-11-20 14:30:33 -05:00
Antoni Boucher	87f0136fa4	libgccjit: Add support for machine-dependent builtins gcc/jit/ChangeLog: PR jit/108762 * docs/topics/compatibility.rst (LIBGCCJIT_ABI_32): New ABI tag. * docs/topics/functions.rst: Add documentation for the function gcc_jit_context_get_target_builtin_function. * dummy-frontend.cc: Include headers target.h, jit-recording.h, print-tree.h, unordered_map and string, new variables (target_builtins, target_function_types, and target_builtins_ctxt), new function (tree_type_to_jit_type). * jit-builtins.cc: Specify that the function types are not from target builtins. * jit-playback.cc: New argument is_target_builtin to new_function. * jit-playback.h: New argument is_target_builtin to new_function. * jit-recording.cc: New argument is_target_builtin to new_function_type, function_type constructor and function constructor, new function (get_target_builtin_function). * jit-recording.h: Include headers string and unordered_map, new variable target_function_types, new argument is_target_builtin to new_function_type, function_type and function, new functions (get_target_builtin_function, copy). * libgccjit.cc: New function (gcc_jit_context_get_target_builtin_function). * libgccjit.h: New function (gcc_jit_context_get_target_builtin_function). * libgccjit.map: New functions (gcc_jit_context_get_target_builtin_function). gcc/testsuite: PR jit/108762 * jit.dg/all-non-failing-tests.h: New test test-target-builtins.c. * jit.dg/test-target-builtins.c: New test.	2024-11-20 14:03:57 -05:00
Andrew Pinski	beab0a3ecb	aarch64: Fix aarch64 after moving to C23 This fixes a few aarch64 specific testcases after the move to default to GNU C23. For the SME testcases, GNU C23 changes `()` to mean `(void)` instead of a non-prototype declaration; the non-prototype declaration merging was confusing some of the time, so the updated way is the expected way even for that. For pic-*.c, `-Wno-old-style-definition` was added so as not to warn about old style definitions. For pr113573.c, I added `-std=gnu17` since I was not sure if `(...)` with C23 would invoke the same issue. Tested for aarch64-linux-gnu. PR testsuite/117680 gcc/testsuite/ChangeLog: * gcc.target/aarch64/pic-constantpool1.c: Add -Wno-old-style-definition. * gcc.target/aarch64/pic-symrefplus.c: Likewise. * gcc.target/aarch64/pr113573.c: Add `-std=gnu17` * gcc.target/aarch64/sme/streaming_mode_1.c: Correct testcase. * gcc.target/aarch64/sme/za_state_1.c: Likewise. * gcc.target/aarch64/sme/za_state_2.c: Likewise. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>	2024-11-20 10:59:48 -08:00
Andrew Pinski	e74f3eb189	rtl-reader: Disable reuse_rtx support for generator building reuse_rtx is not documented, nor is the format for using it. So it should not be supported for the .md files. This also fixes the problem that an invalid index supplied for reuse_rtx caused an ICE; a real error message is now emitted instead. Note since this code still uses atoi, an invalid index can still be used in some cases but that is recorded as part of PR 44574. Note I did a grep of the sources to make sure that this was only used for reading RTL in GCC itself rather than while reading in .md files. Bootstrapped and tested on x86_64-linux-gnu. gcc/ChangeLog: * read-md.h (class rtx_reader): Don't include m_reuse_rtx_by_id when GENERATOR_FILE is defined. * read-rtl.cc (rtx_reader::read_rtx_code): Disable reuse_rtx support when GENERATOR_FILE is defined. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>	2024-11-20 10:07:38 -08:00
Edwin Lu	342eb518bd	RISC-V: testsuite: restrict big endian test to non vector RISC-V vector currently does not support big endian so the postcommit was getting the sorry, not implemented error on vector targets. Restrict the testcase to non-vector targets gcc/testsuite/ChangeLog: * gcc.target/riscv/pr117595.c: Restrict to non vector targets. Signed-off-by: Edwin Lu <ewlu@rivosinc.com>	2024-11-20 09:49:53 -08:00
Richard Biener	f5bd88b5e8	tree-optimization/117709 - bogus offset for gather load When diverting to VMAT_GATHER_SCATTER we fail to zero poffset which was previously set if a load was classified as VMAT_CONTIGUOUS_REVERSE. The following refactors get_group_load_store_type a bit to avoid this but this all needs some serious TLC. PR tree-optimization/117709 * tree-vect-stmts.cc (get_group_load_store_type): Only set *poffset when we end up with VMAT_CONTIGUOUS_DOWN or VMAT_CONTIGUOUS_REVERSE.	2024-11-20 18:38:14 +01:00
Richard Biener	2383ed144b	tree-optimization/117698 - SLP vectorization and alignment When SLP vectorizing we fail to mark the general alignment check as irrelevant when using VMAT_STRIDED_SLP (the implementation checks for itself) and when VMAT_INVARIANT the override isn't effective. This results in extra FAILs on sparc which the following fixes. PR tree-optimization/117698 * tree-vect-stmts.cc (get_group_load_store_type): Properly disregard alignment for VMAT_STRIDED_SLP and VMAT_INVARIANT. (vectorizable_load): Adjust guard for dumping whether we vectorize an unaligned access. (vectorizable_store): Likewise.	2024-11-20 18:38:14 +01:00
Antoni Boucher	16cf1c010d	libgccjit: Allow comparing aligned int types gcc/jit/ChangeLog: * jit-common.h: Add forward declaration of memento_of_get_aligned. * jit-recording.h (type::is_same_type_as): Compare integer types. (dyn_cast_aligned_type): New method. (type::is_aligned, memento_of_get_aligned::is_same_type_as, memento_of_get_aligned::is_aligned): new methods. gcc/testsuite/ChangeLog: * jit.dg/test-types.c: Add checks comparing aligned types.	2024-11-20 11:01:35 -05:00
Antoni Boucher	ede14092bc	libgccjit: Add option to allow special characters in function names gcc/jit/ChangeLog: * docs/topics/contexts.rst: Add documentation for new option. * jit-recording.cc (recording::context::get_str_option): New method. * jit-recording.h (get_str_option): New method. * libgccjit.cc (gcc_jit_context_new_function): Allow special characters in function names. * libgccjit.h (enum gcc_jit_str_option): New option. gcc/testsuite/ChangeLog: * jit.dg/test-special-chars.c: New test.	2024-11-20 10:46:45 -05:00
Antoni Boucher	452abe143e	libgccjit: Add vector permutation and vector access operations gcc/jit/ChangeLog: PR jit/112602 * docs/topics/compatibility.rst (LIBGCCJIT_ABI_31): New ABI tag. * docs/topics/expressions.rst: Document gcc_jit_context_new_rvalue_vector_perm and gcc_jit_context_new_vector_access. * jit-playback.cc (playback::context::new_rvalue_vector_perm, common_mark_addressable_vec, gnu_vector_type_p, lvalue_p, convert_vector_to_array_for_subscript, new_vector_access): new functions. * jit-playback.h (new_rvalue_vector_perm, new_vector_access): New functions. * jit-recording.cc (recording::context::new_rvalue_vector_perm, recording::context::new_vector_access, memento_of_new_rvalue_vector_perm, recording::memento_of_new_rvalue_vector_perm::replay_into, recording::memento_of_new_rvalue_vector_perm::visit_children, recording::memento_of_new_rvalue_vector_perm::make_debug_string, recording::memento_of_new_rvalue_vector_perm::write_reproducer, recording::vector_access::replay_into, recording::vector_access::visit_children, recording::vector_access::make_debug_string, recording::vector_access::write_reproducer): New methods. * jit-recording.h (class memento_of_new_rvalue_vector_perm, class vector_access): New classes. * libgccjit.cc (gcc_jit_context_new_vector_access, gcc_jit_context_new_rvalue_vector_perm): New functions. * libgccjit.h (gcc_jit_context_new_rvalue_vector_perm, gcc_jit_context_new_vector_access): New functions. * libgccjit.map: New functions. gcc/testsuite/ChangeLog: PR jit/112602 * jit.dg/all-non-failing-tests.h: New test test-vector-perm.c. * jit.dg/test-vector-perm.c: New test.	2024-11-20 10:39:24 -05:00
Paul-Antoine Arras	377eff7c38	OpenMP: common C/C++ testcases for dispatch + adjust_args gcc/testsuite/ChangeLog: * c-c++-common/gomp/declare-variant-2.c: Adjust dg-error directives. * c-c++-common/gomp/adjust-args-1.c: New test. * c-c++-common/gomp/adjust-args-2.c: New test. * c-c++-common/gomp/declare-variant-dup-match-clause.c: New test. * c-c++-common/gomp/dispatch-1.c: New test. * c-c++-common/gomp/dispatch-2.c: New test. * c-c++-common/gomp/dispatch-3.c: New test. * c-c++-common/gomp/dispatch-4.c: New test. * c-c++-common/gomp/dispatch-5.c: New test. * c-c++-common/gomp/dispatch-6.c: New test. * c-c++-common/gomp/dispatch-7.c: New test. * c-c++-common/gomp/dispatch-8.c: New test. * c-c++-common/gomp/dispatch-9.c: New test. * c-c++-common/gomp/dispatch-10.c: New test. libgomp/ChangeLog: * testsuite/libgomp.c-c++-common/dispatch-1.c: New test. * testsuite/libgomp.c-c++-common/dispatch-2.c: New test.	2024-11-20 15:31:22 +01:00
Paul-Antoine Arras	ed49709acd	OpenMP: C++ front-end support for dispatch + adjust_args This patch adds C++ support for the `dispatch` construct and the `adjust_args` clause. It relies on the c-family bits comprised in the corresponding C front end patch for pragmas and attributes. Additional C/C++ common testcases are provided in a subsequent patch in the series. gcc/cp/ChangeLog: * decl.cc (omp_declare_variant_finalize_one): Set adjust_args need_device_ptr attribute. * parser.cc (cp_parser_direct_declarator): Update call to cp_parser_late_return_type_opt. (cp_parser_late_return_type_opt): Add 'tree parms' parameter. Update call to cp_parser_late_parsing_omp_declare_simd. (cp_parser_omp_clause_name): Handle nocontext and novariants clauses. (cp_parser_omp_clause_novariants): New function. (cp_parser_omp_clause_nocontext): Likewise. (cp_parser_omp_all_clauses): Handle PRAGMA_OMP_CLAUSE_NOVARIANTS and PRAGMA_OMP_CLAUSE_NOCONTEXT. (cp_parser_omp_dispatch_body): New function, inspired from cp_parser_assignment_expression and cp_parser_postfix_expression. (OMP_DISPATCH_CLAUSE_MASK): Define. (cp_parser_omp_dispatch): New function. (cp_finish_omp_declare_variant): Add parameter. Handle adjust_args clause. (cp_parser_late_parsing_omp_declare_simd): Add parameter. Update calls to cp_finish_omp_declare_variant and cp_finish_omp_declare_variant. (cp_parser_omp_construct): Handle PRAGMA_OMP_DISPATCH. (cp_parser_pragma): Likewise. * semantics.cc (finish_omp_clauses): Handle OMP_CLAUSE_NOCONTEXT and OMP_CLAUSE_NOVARIANTS. * pt.cc (tsubst_omp_clauses): Handle OMP_CLAUSE_NOCONTEXT and OMP_CLAUSE_NOVARIANTS. (tsubst_stmt): Handle OMP_DISPATCH. (tsubst_expr): Handle IFN_GOMP_DISPATCH. gcc/testsuite/ChangeLog: * g++.dg/gomp/adjust-args-1.C: New test. * g++.dg/gomp/adjust-args-2.C: New test. * g++.dg/gomp/adjust-args-3.C: New test. * g++.dg/gomp/dispatch-1.C: New test. * g++.dg/gomp/dispatch-2.C: New test. * g++.dg/gomp/dispatch-3.C: New test. * g++.dg/gomp/dispatch-4.C: New test. 
* g++.dg/gomp/dispatch-5.C: New test. * g++.dg/gomp/dispatch-6.C: New test. * g++.dg/gomp/dispatch-7.C: New test.	2024-11-20 15:31:22 +01:00
Paul-Antoine Arras	d7d8d9dae9	OpenMP: C front-end support for dispatch + adjust_args This patch adds support to the C front-end to parse the `dispatch` construct and the `adjust_args` clause. It also includes some common C/C++ bits for pragmas and attributes. Additional common C/C++ testcases are in a later patch in the series. gcc/c-family/ChangeLog: * c-attribs.cc (c_common_gnu_attributes): Add attribute for adjust_args need_device_ptr. * c-omp.cc (c_omp_directives): Uncomment dispatch. * c-pragma.cc (omp_pragmas): Add dispatch. * c-pragma.h (enum pragma_kind): Add PRAGMA_OMP_DISPATCH. (enum pragma_omp_clause): Add PRAGMA_OMP_CLAUSE_NOCONTEXT and PRAGMA_OMP_CLAUSE_NOVARIANTS. gcc/c/ChangeLog: * c-parser.cc (c_parser_omp_dispatch): New function. (c_parser_omp_clause_name): Handle nocontext and novariants clauses. (c_parser_omp_clause_novariants): New function. (c_parser_omp_clause_nocontext): Likewise. (c_parser_omp_all_clauses): Handle nocontext and novariants clauses. (c_parser_omp_dispatch_body): New function adapted from c_parser_expr_no_commas. (OMP_DISPATCH_CLAUSE_MASK): Define. (c_parser_omp_dispatch): New function. (c_finish_omp_declare_variant): Parse adjust_args. (c_parser_omp_construct): Handle PRAGMA_OMP_DISPATCH. * c-typeck.cc (c_finish_omp_clauses): Handle OMP_CLAUSE_NOVARIANTS and OMP_CLAUSE_NOCONTEXT. gcc/testsuite/ChangeLog: * gcc.dg/gomp/adjust-args-1.c: New test. * gcc.dg/gomp/dispatch-1.c: New test. * gcc.dg/gomp/dispatch-2.c: New test. * gcc.dg/gomp/dispatch-3.c: New test. * gcc.dg/gomp/dispatch-4.c: New test. * gcc.dg/gomp/dispatch-5.c: New test.	2024-11-20 15:31:22 +01:00
Paul-Antoine Arras	084ea8ad58	OpenMP: middle-end support for dispatch + adjust_args This patch adds middle-end support for the `dispatch` construct and the `adjust_args` clause. The heavy lifting is done in `gimplify_omp_dispatch` and `gimplify_call_expr` respectively. For `adjust_args`, this mostly consists in emitting a call to `omp_get_mapped_ptr` for the adequate device. For dispatch, the following steps are performed: * Handle the device clause, if any: set the default-device ICV at the top of the dispatch region and restore its previous value at the end. * Handle novariants and nocontext clauses, if any. Evaluate compile-time constants and select a variant, if possible. Otherwise, emit code to handle all possible cases at run time. gcc/ChangeLog: * builtins.cc (builtin_fnspec): Handle BUILT_IN_OMP_GET_MAPPED_PTR. * gimple-low.cc (lower_stmt): Handle GIMPLE_OMP_DISPATCH. * gimple-pretty-print.cc (dump_gimple_omp_dispatch): New function. (pp_gimple_stmt_1): Handle GIMPLE_OMP_DISPATCH. * gimple-walk.cc (walk_gimple_stmt): Likewise. * gimple.cc (gimple_build_omp_dispatch): New function. (gimple_copy): Handle GIMPLE_OMP_DISPATCH. * gimple.def (GIMPLE_OMP_DISPATCH): Define. * gimple.h (gimple_build_omp_dispatch): Declare. (gimple_has_substatements): Handle GIMPLE_OMP_DISPATCH. (gimple_omp_dispatch_clauses): New function. (gimple_omp_dispatch_clauses_ptr): Likewise. (gimple_omp_dispatch_set_clauses): Likewise. (gimple_return_set_retval): Handle GIMPLE_OMP_DISPATCH. * gimplify.cc (enum omp_region_type): Add ORT_DISPATCH. (struct gimplify_omp_ctx): Add in_call_args. (gimplify_call_expr): Handle need_device_ptr arguments. (is_gimple_stmt): Handle OMP_DISPATCH. (gimplify_scan_omp_clauses): Handle OMP_CLAUSE_DEVICE in a dispatch construct. Handle OMP_CLAUSE_NOVARIANTS and OMP_CLAUSE_NOCONTEXT. (omp_has_novariants): New function. (omp_has_nocontext): Likewise. (omp_construct_selector_matches): Handle OMP_DISPATCH with nocontext clause. 
(find_ifn_gomp_dispatch): New function. (gimplify_omp_dispatch): Likewise. (gimplify_expr): Handle OMP_DISPATCH. * gimplify.h (omp_has_novariants): Declare. * internal-fn.cc (expand_GOMP_DISPATCH): New function. * internal-fn.def (GOMP_DISPATCH): Define. * omp-builtins.def (BUILT_IN_OMP_GET_MAPPED_PTR): Define. (BUILT_IN_OMP_GET_DEFAULT_DEVICE): Define. (BUILT_IN_OMP_SET_DEFAULT_DEVICE): Define. * omp-general.cc (omp_construct_traits_to_codes): Add OMP_DISPATCH. (struct omp_ts_info): Add dispatch. (omp_resolve_declare_variant): Handle novariants. Adjust DECL_ASSEMBLER_NAME. * omp-low.cc (scan_omp_1_stmt): Handle GIMPLE_OMP_DISPATCH. (lower_omp_dispatch): New function. (lower_omp_1): Call it. * tree-inline.cc (remap_gimple_stmt): Handle GIMPLE_OMP_DISPATCH. (estimate_num_insns): Handle GIMPLE_OMP_DISPATCH.	2024-11-20 15:31:22 +01:00


215633 Commits