Commit Graph

204718 Commits

Author SHA1 Message Date
Richard Biener
44e7e4498c tree-optimization/110243 - IVOPTs introducing undefined overflow
The following addresses IVOPTs rewriting expressions in its
strip_offset without caring for definedness of overflow.  Rather
than the earlier attempt of just using the proper
split_constant_offset from data-ref analysis the following adjusts
IVOPTs helper trying to minimize changes from this fix, possibly
easing backports.

	PR tree-optimization/110243
	PR tree-optimization/111336
	* tree-ssa-loop-ivopts.cc (strip_offset_1): Rewrite
	operations with undefined behavior on overflow to
	unsigned arithmetic.

	* gcc.dg/torture/pr110243.c: New testcase.
	* gcc.dg/torture/pr111336.c: Likewise.
2023-10-20 15:15:25 +02:00
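The idea behind the strip_offset_1 change in 44e7e4498c can be illustrated at the C source level. This is a minimal sketch, not code from the patch, and the helper name wrapping_add is hypothetical. Signed overflow is undefined in C, whereas the same arithmetic carried out on unsigned operands wraps and is always defined.

```c
#include <limits.h>

/* Hypothetical helper, for illustration only.
   Adding two ints directly is undefined behavior when the result
   overflows.  Converting the operands to unsigned first makes the
   addition wrap modulo UINT_MAX + 1, which is always defined.  */
unsigned int
wrapping_add (int a, int b)
{
  return (unsigned int) a + (unsigned int) b;
}
```

Converting the unsigned result back to int is implementation-defined when the value does not fit, so the sketch returns the unsigned value and leaves that step to the caller.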
Richard Biener
d70575f542 tree-optimization/111891 - fix assert in vectorizable_simd_clone_call
The following fixes the assert in vectorizable_simd_clone_call to
assert we have a vector type during transform.  Whether we have
one during analysis depends on whether another SLP user decided
on the type of a constant/external already.  When we end up with
a mismatch in desire the updating will fail and make vectorization
fail.

	PR tree-optimization/111891
	* tree-vect-stmts.cc (vectorizable_simd_clone_call): Fix
	assert.

	* gfortran.dg/pr111891.f90: New testcase.
2023-10-20 14:14:13 +02:00
Andrew Stubbs
c7ec7bd1c6 amdgcn: add -march=gfx1030 EXPERIMENTAL
Accept the architecture configure option and resolve build failures.  This is
enough to build binaries, but I've not got a device to test it on, so there
are probably runtime issues to fix.  The cache control instructions might be
unsafe (or too conservative), and the kernel metadata might be off.  Vector
reductions will need to be reworked for RDNA2.  In principle, it would be
better to use wavefrontsize32 for this architecture, but that would mean
switching everything to allow SImode masks, so wavefrontsize64 it is.

The multilib is not included in the default configuration so either configure
--with-arch=gfx1030 or include it in --with-multilib-list=gfx1030,....

The majority of this patch has no effect on other devices, but changing from
using scalar writes for the exit value to vector writes means we don't need
the scalar cache write-back instruction anywhere (which doesn't exist in RDNA2).

gcc/ChangeLog:

	* config.gcc: Allow --with-arch=gfx1030.
	* config/gcn/gcn-hsa.h (NO_XNACK): gfx1030 does not support xnack.
	(ASM_SPEC): gfx1030 needs -mattr=+wavefrontsize64 set.
	* config/gcn/gcn-opts.h (enum processor_type): Add PROCESSOR_GFX1030.
	(TARGET_GFX1030): New.
	(TARGET_RDNA2): New.
	* config/gcn/gcn-valu.md (@dpp_move<mode>): Disable for RDNA2.
	(addc<mode>3<exec_vcc>): Add RDNA2 syntax variant.
	(subc<mode>3<exec_vcc>): Likewise.
	(<convop><mode><vndi>2_exec): Add RDNA2 alternatives.
	(vec_cmp<mode>di): Likewise.
	(vec_cmp<u><mode>di): Likewise.
	(vec_cmp<mode>di_exec): Likewise.
	(vec_cmp<u><mode>di_exec): Likewise.
	(vec_cmp<mode>di_dup): Likewise.
	(vec_cmp<mode>di_dup_exec): Likewise.
	(reduc_<reduc_op>_scal_<mode>): Disable for RDNA2.
	(*<reduc_op>_dpp_shr_<mode>): Likewise.
	(*plus_carry_dpp_shr_<mode>): Likewise.
	(*plus_carry_in_dpp_shr_<mode>): Likewise.
	* config/gcn/gcn.cc (gcn_option_override): Recognise gfx1030.
	(gcn_global_address_p): RDNA2 only allows smaller offsets.
	(gcn_addr_space_legitimate_address_p): Likewise.
	(gcn_omp_device_kind_arch_isa): Recognise gfx1030.
	(gcn_expand_epilogue): Use VGPRs instead of SGPRs.
	(output_file_start): Configure gfx1030.
	* config/gcn/gcn.h (TARGET_CPU_CPP_BUILTINS): Add __RDNA2__.
	(ASSEMBLER_DIALECT): New.
	* config/gcn/gcn.md (rdna): New define_attr.
	(enabled): Use "rdna" attribute.
	(gcn_return): Remove s_dcache_wb.
	(addcsi3_scalar): Add RDNA2 syntax variant.
	(addcsi3_scalar_zero): Likewise.
	(addptrdi3): Likewise.
	(mulsi3): v_mul_lo_i32 should be v_mul_lo_u32 on all ISA.
	(*memory_barrier): Add RDNA2 syntax variant.
	(atomic_load<mode>): Add RDNA2 cache control variants, and disable
	scalar atomics for RDNA2.
	(atomic_store<mode>): Likewise.
	(atomic_exchange<mode>): Likewise.
	* config/gcn/gcn.opt (gpu_type): Add gfx1030.
	* config/gcn/mkoffload.cc (EF_AMDGPU_MACH_AMDGCN_GFX1030): New.
	(main): Recognise -march=gfx1030.
	* config/gcn/t-omp-device: Add gfx1030 isa.

libgcc/ChangeLog:

	* config/gcn/amdgcn_veclib.h (CDNA3_PLUS): Set false for __RDNA2__.

libgomp/ChangeLog:

	* plugin/plugin-gcn.c (EF_AMDGPU_MACH_AMDGCN_GFX1030): New.
	(isa_hsa_name): Recognise gfx1030.
	(isa_code): Likewise.
	* team.c (defined): Remove s_endpgm.
2023-10-20 12:40:25 +01:00
Richard Biener
d118738e71 tree-optimization/111000 - restrict invariant motion of shifts
The following restricts moving variable shifts to when they are
always executed in the loop as we currently do not have an efficient
way to rewrite them to something that is unconditionally
well-defined and value range analysis will otherwise compute
invalid ranges for the shift operand.

	PR tree-optimization/111000
	* stor-layout.h (element_precision): Move ..
	* tree.h (element_precision): .. here.
	* tree-ssa-loop-im.cc (movement_possibility_1): Restrict
	motion of shifts and rotates.

	* gcc.dg/torture/pr111000.c: New testcase.
2023-10-20 13:19:55 +02:00
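The kind of loop that d118738e71 is concerned with might look like the following. This is a hypothetical example, not the testcase from the patch: the shift is loop-invariant but only conditionally executed, so evaluating it in front of the loop could execute a shift that the original program never performs.

```c
/* Hypothetical example, for illustration only.
   x << s does not change between iterations, but it is evaluated
   only for non-zero elements.  The caller is assumed to guarantee a
   valid shift count only when such an element exists, so hoisting
   the shift out of the loop unconditionally could evaluate it with
   an out-of-range s, which is undefined.  */
unsigned int
sum_guarded_shift (const unsigned int *a, int n, unsigned int x, int s)
{
  unsigned int sum = 0;
  for (int i = 0; i < n; i++)
    if (a[i] != 0)
      sum += a[i] + (x << s);
  return sum;
}
```

With an all-zero array the shift is never executed, so even an out-of-range s is harmless in the source program as written.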
Alexandre Oliva
551935d118 Control flow redundancy hardening
This patch introduces an optional hardening pass to catch unexpected
execution flows.  Functions are transformed so that basic blocks set a
bit in an automatic array, and (non-exceptional) function exit edges
check that the bits in the array represent an expected execution path
in the CFG.

Functions with multiple exit edges, or with too many blocks, call an
out-of-line checker builtin implemented in libgcc.  For simpler
functions, the verification is performed in-line.

-fharden-control-flow-redundancy enables the pass for eligible
functions, --param hardcfr-max-blocks sets a block count limit for
functions to be eligible, and --param hardcfr-max-inline-blocks
tunes the "too many blocks" limit for in-line verification.
-fhardcfr-skip-leaf makes leaf functions non-eligible.

Additional -fhardcfr-check-* options are added to enable checking at
exception escape points, before potential sibcalls, hereby dubbed
returning calls, and before noreturn calls and exception raises.  A
notable case is the distinction between noreturn calls expected to
throw and those expected to terminate or loop forever: the default
setting for -fhardcfr-check-noreturn-calls, no-xthrow, performs
checking before the latter, but the former only gets checking in the
exception handler.  GCC can only tell them apart when noreturn functions
expected to raise are explicitly marked with the newly-introduced
expected_throw attribute, and the corresponding ECF_XTHROW flag.


for  gcc/ChangeLog

	* tree-core.h (ECF_XTHROW): New macro.
	* tree.cc (set_call_expr): Add expected_throw attribute when
	ECF_XTHROW is set.
	(build_common_builtin_node): Add ECF_XTHROW to
	__cxa_end_cleanup and _Unwind_Resume or _Unwind_SjLj_Resume.
	* calls.cc (flags_from_decl_or_type): Check for expected_throw
	attribute to set ECF_XTHROW.
	* gimple.cc (gimple_build_call_from_tree): Propagate
	ECF_XTHROW from decl flags to gimple call...
	(gimple_call_flags): ... and back.
	* gimple.h (GF_CALL_XTHROW): New gf_mask flag.
	(gimple_call_set_expected_throw): New.
	(gimple_call_expected_throw_p): New.
	* Makefile.in (OBJS): Add gimple-harden-control-flow.o.
	* builtins.def (BUILT_IN___HARDCFR_CHECK): New.
	* common.opt (fharden-control-flow-redundancy): New.
	(-fhardcfr-check-returning-calls): New.
	(-fhardcfr-check-exceptions): New.
	(-fhardcfr-check-noreturn-calls=*): New.
	(Enum hardcfr_check_noreturn_calls): New.
	(fhardcfr-skip-leaf): New.
	* doc/invoke.texi: Document them.
	(hardcfr-max-blocks, hardcfr-max-inline-blocks): New params.
	* flag-types.h (enum hardcfr_noret): New.
	* gimple-harden-control-flow.cc: New.
	* params.opt (-param=hardcfr-max-blocks=): New.
	(-param=hardcfr-max-inline-blocks=): New.
	* passes.def (pass_harden_control_flow_redundancy): Add.
	* tree-pass.h (make_pass_harden_control_flow_redundancy):
	Declare.
	* doc/extend.texi: Document expected_throw attribute.

for  gcc/ada/ChangeLog

	* gcc-interface/trans.cc (gigi): Mark __gnat_reraise_zcx with
	ECF_XTHROW.
	(build_raise_check): Likewise for all rcheck subprograms.

for  gcc/c-family/ChangeLog

	* c-attribs.cc (handle_expected_throw_attribute): New.
	(c_common_attribute_table): Add expected_throw.

for  gcc/cp/ChangeLog

	* decl.cc (push_throw_library_fn): Mark with ECF_XTHROW.
	* except.cc (build_throw): Likewise __cxa_throw,
	_ITM_cxa_throw, __cxa_rethrow.

for  gcc/testsuite/ChangeLog

	* c-c++-common/torture/harden-cfr.c: New.
	* c-c++-common/harden-cfr-noret-never-O0.c: New.
	* c-c++-common/torture/harden-cfr-noret-never.c: New.
	* c-c++-common/torture/harden-cfr-noret-noexcept.c: New.
	* c-c++-common/torture/harden-cfr-noret-nothrow.c: New.
	* c-c++-common/torture/harden-cfr-noret.c: New.
	* c-c++-common/torture/harden-cfr-notail.c: New.
	* c-c++-common/torture/harden-cfr-returning.c: New.
	* c-c++-common/torture/harden-cfr-tail.c: New.
	* c-c++-common/torture/harden-cfr-abrt-always.c: New.
	* c-c++-common/torture/harden-cfr-abrt-never.c: New.
	* c-c++-common/torture/harden-cfr-abrt-no-xthrow.c: New.
	* c-c++-common/torture/harden-cfr-abrt-nothrow.c: New.
	* c-c++-common/torture/harden-cfr-abrt.c: New.
	* c-c++-common/torture/harden-cfr-always.c: New.
	* c-c++-common/torture/harden-cfr-never.c: New.
	* c-c++-common/torture/harden-cfr-no-xthrow.c: New.
	* c-c++-common/torture/harden-cfr-nothrow.c: New.
	* c-c++-common/torture/harden-cfr-bret-always.c: New.
	* c-c++-common/torture/harden-cfr-bret-never.c: New.
	* c-c++-common/torture/harden-cfr-bret-noopt.c: New.
	* c-c++-common/torture/harden-cfr-bret-noret.c: New.
	* c-c++-common/torture/harden-cfr-bret-no-xthrow.c: New.
	* c-c++-common/torture/harden-cfr-bret-nothrow.c: New.
	* c-c++-common/torture/harden-cfr-bret-retcl.c: New.
	* c-c++-common/torture/harden-cfr-bret.c: New.
	* g++.dg/harden-cfr-throw-always-O0.C: New.
	* g++.dg/harden-cfr-throw-returning-O0.C: New.
	* g++.dg/torture/harden-cfr-noret-always-no-nothrow.C: New.
	* g++.dg/torture/harden-cfr-noret-never-no-nothrow.C: New.
	* g++.dg/torture/harden-cfr-noret-no-nothrow.C: New.
	* g++.dg/torture/harden-cfr-throw-always.C: New.
	* g++.dg/torture/harden-cfr-throw-never.C: New.
	* g++.dg/torture/harden-cfr-throw-no-xthrow.C: New.
	* g++.dg/torture/harden-cfr-throw-no-xthrow-expected.C: New.
	* g++.dg/torture/harden-cfr-throw-nothrow.C: New.
	* g++.dg/torture/harden-cfr-throw-nocleanup.C: New.
	* g++.dg/torture/harden-cfr-throw-returning.C: New.
	* g++.dg/torture/harden-cfr-throw.C: New.
	* gcc.dg/torture/harden-cfr-noret-no-nothrow.c: New.
	* gcc.dg/torture/harden-cfr-tail-ub.c: New.
	* gnat.dg/hardcfr.adb: New.

for  libgcc/ChangeLog

	* Makefile.in (LIB2ADD): Add hardcfr.c.
	* hardcfr.c: New.
2023-10-20 07:50:33 -03:00
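The principle described in 551935d118 (each block records that it ran, and the exit checks that the record matches a path through the CFG) can be imitated by hand. The sketch below is only an illustration under that assumption: the names path_ok and clamp_nonneg and the block constants are hypothetical, and the actual pass works inside the compiler rather than on source code.

```c
#include <stdlib.h>

/* One bit per basic block of the hand-instrumented function.  */
enum
{
  BB_ENTRY = 1u << 0,
  BB_THEN = 1u << 1,
  BB_ELSE = 1u << 2,
  BB_EXIT = 1u << 3
};

/* Accept only the two paths that exist in the CFG:
   entry -> then -> exit, or entry -> else -> exit.  */
int
path_ok (unsigned int visited)
{
  return visited == (BB_ENTRY | BB_THEN | BB_EXIT)
	 || visited == (BB_ENTRY | BB_ELSE | BB_EXIT);
}

/* Returns 0 for negative x and x otherwise, recording each block.  */
int
clamp_nonneg (int x)
{
  unsigned int visited = 0;
  int r;

  visited |= BB_ENTRY;
  if (x < 0)
    {
      visited |= BB_THEN;
      r = 0;
    }
  else
    {
      visited |= BB_ELSE;
      r = x;
    }
  visited |= BB_EXIT;

  /* Check at function exit: abort on an unexpected execution flow.  */
  if (!path_ok (visited))
    abort ();
  return r;
}
```

A record with neither or both branch bits set corresponds to no path in this CFG, which is the kind of unexpected flow the check is meant to catch.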
Alex Coplan
e90c7bd520 rtl-ssa: Don't leave NOTE_INSN_DELETED around
This patch tweaks change_insns to also call ::remove_insn to ensure the
underlying RTL insn gets removed from the insn chain in the case of a
deletion.

This avoids leaving NOTE_INSN_DELETED around after deleting insns.

For movement, the RTL insn chain is updated earlier in change_insns with
the call to move_insn.  For deletion, it seems reasonable to do it here.

gcc/ChangeLog:

	* rtl-ssa/changes.cc (function_info::change_insns): Ensure we call
	::remove_insn on deleted insns.
2023-10-20 11:46:27 +01:00
Richard Biener
d6add7aa90 Document {L,R}ROTATE_EXPR
The following amends the {L,R}SHIFT_EXPR documentation with
documentation about the {L,R}ROTATE_EXPR case.

	* doc/generic.texi ({L,R}ROTATE_EXPR): Document.
2023-10-20 12:01:43 +02:00
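For readers unfamiliar with the rotate operations documented in d6add7aa90: C has no rotate operator, so rotates are usually written with two shifts. The sketch below shows that common idiom for 32-bit values. Whether a given compiler version turns it into an {L,R}ROTATE_EXPR is an assumption and is not stated in the commit.

```c
#include <stdint.h>

/* Rotate x left by n bits.  Masking the counts keeps both shifts
   in the range 0..31, so n == 0 does not shift by 32.  */
uint32_t
rotl32 (uint32_t x, unsigned int n)
{
  n &= 31;
  return (x << n) | (x >> (-n & 31));
}

/* Rotate x right by n bits, using the same masking.  */
uint32_t
rotr32 (uint32_t x, unsigned int n)
{
  n &= 31;
  return (x >> n) | (x << (-n & 31));
}
```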
Oleg Endo
1d0ca7ecd4 SH: Fix PR 101177
Fix accidentally inverted comparison.

gcc/ChangeLog:

	PR target/101177
	* config/sh/sh.md (unnamed split pattern): Fix comparison of
	find_regno_note result.
2023-10-20 18:48:34 +09:00
Richard Biener
e489464acf Rewrite more refs for epilogue vectorization
The following makes sure to rewrite all gather/scatter detected by
dataref analysis plus stmts classified as VMAT_GATHER_SCATTER.  Maybe
we need to rewrite all refs; the following covers the cases I've
run into so far.

	* tree-vect-loop.cc (update_epilogue_loop_vinfo): Rewrite
	both STMT_VINFO_GATHER_SCATTER_P and VMAT_GATHER_SCATTER
	stmt refs.
2023-10-20 11:23:36 +02:00
Richard Biener
5dde64775b Fixup vect_get_and_check_slp_defs for gathers and .MASK_LOAD
I went a little bit too simple with implementing SLP gather support
for emulated and builtin based gathers.  The following fixes the
conflict that appears when running into .MASK_LOAD where we rely
on vect_get_operand_map and the bolted-on STMT_VINFO_GATHER_SCATTER_P
checking wrecks that.  The following properly integrates this with
vect_get_operand_map, adding another special index referring to
the vect_check_gather_scatter analyzed offset.

This unbreaks aarch64 (and hopefully riscv); I'll follow up with
more fixes and testsuite coverage for x86 where I think I got
masked gather SLP support wrong.

	* tree-vect-slp.cc (off_map, off_op0_map, off_arg2_map,
	off_arg3_arg2_map): New.
	(vect_get_operand_map): Get flag whether the stmt was
	recognized as gather or scatter and use the above
	accordingly.
	(vect_get_and_check_slp_defs): Adjust.
	(vect_build_slp_tree_2): Likewise.
2023-10-20 11:09:58 +02:00
Tobias Burnus
5f71e002f8 omp_lib.f90.in: Deprecate omp_lock_hint_* for OpenMP 5.0
The omp_lock_hint_* parameters were deprecated in favor of
omp_sync_hint_*.  While omp.h contained deprecation markers for those,
the omp_lib module only contained them for omp_{g,s}_nested.

Note: The -Wdeprecated-declarations warning will only become active once
openmp_version / _OPENMP is bumped from 201511 (4.5) to 201811 (5.0).

libgomp/ChangeLog:

	* omp_lib.f90.in: Tag omp_lock_hint_* as being deprecated when
	_OPENMP >= 201811.
2023-10-20 10:56:39 +02:00
Juzhe-Zhong
4fd09aed38 RISC-V: Rename some variables of vector_block_info[NFC]
1. Remove "m_" prefix as they are not private members.
2. Rename infos -> local_infos, info -> global_info to clarify their meaning.

Pushed as it is obvious.

gcc/ChangeLog:

	* config/riscv/riscv-vsetvl.cc (pre_vsetvl::fuse_local_vsetvl_info): Rename variables.
	(pre_vsetvl::pre_global_vsetvl_info): Ditto.
	(pre_vsetvl::emit_vsetvl): Ditto.
2023-10-20 16:27:44 +08:00
Tamar Christina
88c27070c2 ifcvt: Support bitfield lowering of multiple-exit loops
With the patch enabling the vectorization of early-breaks, we'd like to allow
bitfield lowering in such loops, which requires the relaxation of allowing
multiple exits when doing so.  In order to avoid a similar issue to PR107275,
the code that rejects loops with certain types of gimple_stmts was hoisted from
'if_convertible_loop_p_1' to 'get_loop_body_in_if_conv_order', to avoid trying
to lower bitfields in loops we are not going to vectorize anyway.

This also ensures 'ifcvt_local_dce' doesn't accidentally remove statements it
shouldn't as it will never come across them.  I made sure to add a comment to
make clear that there is a direct connection between the two and if we were to
enable vectorization of any other gimple statement we should make sure both
handle it.

gcc/ChangeLog:

	* tree-if-conv.cc (if_convertible_loop_p_1): Move check from here ...
	(get_loop_body_in_if_conv_order): ... to here.
	(if_convertible_loop_p): Remove single_exit check.
	(tree_if_conversion): Move single_exit check to if-conversion part and
	support multiple exits.

gcc/testsuite/ChangeLog:

	* gcc.dg/vect/vect-bitfield-read-1-not.c: New test.
	* gcc.dg/vect/vect-bitfield-read-2-not.c: New test.
	* gcc.dg/vect/vect-bitfield-read-8.c: New test.
	* gcc.dg/vect/vect-bitfield-read-9.c: New test.

Co-Authored-By:  Andre Vieira <andre.simoesdiasvieira@arm.com>
2023-10-20 08:09:45 +01:00
Tamar Christina
dd3e6f52e4 middle-end: Enable bit-field vectorization to work correctly when we're vectoring inside conds
The bitfield vectorization support does not currently recognize bitfields inside
gconds. This means they can't be used as conditions for early break
vectorization which is a functionality we require.

This adds support for them by explicitly matching and handling gcond as a
source.

Testcases are added in the testsuite update patch as the only way to get there
is with the early break vectorization.  See tests:

  - vect-early-break_20.c
  - vect-early-break_21.c

gcc/ChangeLog:

	* tree-vect-patterns.cc (vect_init_pattern_stmt): Copy STMT_VINFO_TYPE
	from original statement.
	(vect_recog_bitfield_ref_pattern): Support bitfields in gcond.

Co-Authored-By:  Andre Vieira <andre.simoesdiasvieira@arm.com>
2023-10-20 08:08:54 +01:00
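A loop of the shape dd3e6f52e4 talks about, where a bit-field read is the condition of an early exit, might look like this. The struct layout and function name are hypothetical and are not taken from vect-early-break_20.c or vect-early-break_21.c. Whether the loop is actually vectorized depends on the target and on the early-break vectorization support mentioned in the commit.

```c
/* Hypothetical data layout, for illustration only.  */
struct item
{
  unsigned int flag : 1;
  unsigned int value : 7;
};

/* Returns the index of the first element whose flag bit is set, or
   -1 if there is none.  The bit-field read feeds the branch that
   leaves the loop early.  */
int
find_first_flagged (const struct item *items, int n)
{
  for (int i = 0; i < n; i++)
    if (items[i].flag)
      return i;
  return -1;
}
```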
Hu, Lin1
8ba8f0dea0 Fix testcases that are raised by support -mevex512
Hi, all

This patch fixes some scan-asm failures in pr89229-{5,6,7}b.c, which occur
because we emit scalar vmov{s,d} there when trying to use x/ymm 16+ without
avx512vl but with avx512f+evex512.

If no one objects to this change in behavior, we intend to resolve these
failures by modifying the testcases.

BRs,
Lin

gcc/testsuite/ChangeLog:

	* gcc.target/i386/pr89229-5b.c: Modify test.
	* gcc.target/i386/pr89229-6b.c: Ditto.
	* gcc.target/i386/pr89229-7b.c: Ditto.
2023-10-20 14:33:58 +08:00
Juzhe-Zhong
f0e28d8c13 RISC-V: Fix failed hoist in LICM of vmv.v.x instruction
Confirmed that the dynamic LMUL algorithm works well, choosing LMUL = 4 for the PR:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111848

But it generates horrible register spills.

The root cause is that we didn't hoist the vmv.v.x outside the loop, which
increases the SLP loop register pressure.

So, change the CONST_VECTOR move into a vec_duplicate splitter so that we can gain better optimizations:

1. better LICM.
2. More opportunities of transforming 'vv' into 'vx' in the future.

Before this patch:

f3:
        ble     a4,zero,.L8
        csrr    t0,vlenb
        slli    t1,t0,4
        csrr    a6,vlenb
        sub     sp,sp,t1
        csrr    a5,vlenb
        slli    a6,a6,3
        slli    a5,a5,2
        add     a6,a6,sp
        vsetvli a7,zero,e16,m8,ta,ma
        slli    a4,a4,3
        vid.v   v8
        addi    t6,a5,-1
        vand.vi v8,v8,-2
        neg     t5,a5
        vs8r.v  v8,0(sp)
        vadd.vi v8,v8,1
        vs8r.v  v8,0(a6)
        j       .L4
.L12:
        vsetvli a7,zero,e16,m8,ta,ma
.L4:
        csrr    t0,vlenb
        slli    t0,t0,3
        vl8re16.v       v16,0(sp)
        add     t0,t0,sp
        vmv.v.x v8,t6
        mv      t1,a4
        vand.vv v24,v16,v8
        mv      a6,a4
        vl8re16.v       v16,0(t0)
        vand.vv v8,v16,v8
        bleu    a4,a5,.L3
        mv      a6,a5
.L3:
        vsetvli zero,a6,e8,m4,ta,ma
        vle8.v  v20,0(a2)
        vle8.v  v16,0(a3)
        vsetvli a7,zero,e8,m4,ta,ma
        vrgatherei16.vv v4,v20,v24
        vadd.vv v4,v16,v4
        vsetvli zero,a6,e8,m4,ta,ma
        vse8.v  v4,0(a0)
        vle8.v  v20,0(a2)
        vsetvli a7,zero,e8,m4,ta,ma
        vrgatherei16.vv v4,v20,v8
        vadd.vv v4,v4,v16
        vsetvli zero,a6,e8,m4,ta,ma
        vse8.v  v4,0(a1)
        add     a4,a4,t5
        add     a0,a0,a5
        add     a3,a3,a5
        add     a1,a1,a5
        add     a2,a2,a5
        bgtu    t1,a5,.L12
        csrr    t0,vlenb
        slli    t1,t0,4
        add     sp,sp,t1
        jr      ra
.L8:
        ret

After this patch:

f3:
	ble	a4,zero,.L6
	csrr	a6,vlenb
	csrr	a5,vlenb
	slli	a6,a6,2
	slli	a5,a5,2
	addi	a6,a6,-1
	slli	a4,a4,3
	neg	t5,a5
	vsetvli	t1,zero,e16,m8,ta,ma
	vmv.v.x	v24,a6
	vid.v	v8
	vand.vi	v8,v8,-2
	vadd.vi	v16,v8,1
	vand.vv	v8,v8,v24
	vand.vv	v16,v16,v24
.L4:
	mv	t1,a4
	mv	a6,a4
	bleu	a4,a5,.L3
	mv	a6,a5
.L3:
	vsetvli	zero,a6,e8,m4,ta,ma
	vle8.v	v28,0(a2)
	vle8.v	v24,0(a3)
	vsetvli	a7,zero,e8,m4,ta,ma
	vrgatherei16.vv	v4,v28,v8
	vadd.vv	v4,v24,v4
	vsetvli	zero,a6,e8,m4,ta,ma
	vse8.v	v4,0(a0)
	vle8.v	v28,0(a2)
	vsetvli	a7,zero,e8,m4,ta,ma
	vrgatherei16.vv	v4,v28,v16
	vadd.vv	v4,v4,v24
	vsetvli	zero,a6,e8,m4,ta,ma
	vse8.v	v4,0(a1)
	add	a4,a4,t5
	add	a0,a0,a5
	add	a3,a3,a5
	add	a1,a1,a5
	add	a2,a2,a5
	bgtu	t1,a5,.L4
.L6:
	ret

Note that this patch triggers multiple FAILs:
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_arith_run-3.c execution test
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_arith_run-3.c execution test
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_arith_run-4.c execution test
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_arith_run-4.c execution test
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_arith_run-8.c execution test
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_arith_run-8.c execution test
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c execution test
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c execution test
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-2.c execution test
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-2.c execution test
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-1.c execution test
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-1.c execution test
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-2.c execution test
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-2.c execution test

They all fail because of bugs in the VSETVL pass:

   10dd4:       0c707057                vsetvli zero,zero,e8,mf2,ta,ma
   10dd8:       5e06b8d7                vmv.v.i v17,13
   10ddc:       9ed030d7                vmv1r.v v1,v13
   10de0:       b21040d7                vncvt.x.x.w     v1,v1           ----> raise illegal instruction since we don't have SEW = 8 -> SEW = 4 narrowing.
   10de4:       5e0785d7                vmv.v.v v11,v15

Confirmed that the recent VSETVL refactor patch (https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633231.html) fixes all of them.

So this patch should be committed after the VSETVL refactor patch.

	PR target/111848

gcc/ChangeLog:

	* config/riscv/riscv-selftests.cc (run_const_vector_selftests): Adapt selftest.
	* config/riscv/riscv-v.cc (expand_const_vector): Change it into vec_duplicate splitter.

gcc/testsuite/ChangeLog:

	* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-7.c: Adapt test.
	* gcc.dg/vect/costmodel/riscv/rvv/pr111848.c: New test.
2023-10-20 11:51:21 +08:00
Lehua Ding
29331e72d0 RISC-V: Refactor and cleanup vsetvl pass
This patch refactors and cleans up the vsetvl pass in order to make the code
easier to modify and understand. This patch does several things:

1. Introduce a virtual CFG for vsetvl infos; Phases 1, 2 and 3 only maintain
   and modify this virtual CFG. Phase 4 performs insertion, modification and
   deletion of vsetvl insns based on the virtual CFG. The basic block in the
   virtual CFG is called vsetvl_block_info and the vsetvl information inside
   is called vsetvl_info.
2. Combine Phase 1 and 2 into a single Phase 1 and unify the demand system;
   this phase only fuses local vsetvl info in the forward direction.
3. Refactor Phase 3, changing the logic for determining whether to uplift
   vsetvl info to a pred basic block to a more unified method: whether there
   is a vsetvl info among the reaching vsetvl definitions that is compatible
   with it.
4. Place all modification operations on the RTL in Phase 4 and Phase 5.
   Phase 4 is responsible for inserting, modifying and deleting vsetvl
   instructions based on fully optimized vsetvl infos. Phase 5 removes the avl
   operand from the RVV instruction and removes the unused dest operand
   register from the vsetvl insns.

These modifications resulted in some testcases needing to be updated. The reasons
for updating are summarized below:

1. more optimized
   vlmax_back_prop-{25,26}.c
   vlmax_conflict-{3,12}.c/vsetvl-{13,23}.c/
   avl_single-{23,84,95}.c/pr109773-1.c
2. less unnecessary fusion
   avl_single-46.c/imm_bb_prop-1.c/pr109743-2.c/vsetvl-18.c
3. local fuse direction (backward -> forward)
   scalar_move-1.c
4. add some bugfix testcases.
   pr111037-{3,4}.c
   avl_single-{89,104,105,106,107,108,109}.c

	PR target/111037
	PR target/111234
	PR target/111725

gcc/ChangeLog:

	* config/riscv/riscv-vsetvl.cc (bitmap_union_of_preds_with_entry): New.
	(debug): Removed.
	(compute_reaching_defintion): New.
	(enum vsetvl_type): Moved.
	(vlmax_avl_p): Moved.
	(enum emit_type): Moved.
	(vlmul_to_str): Moved.
	(vlmax_avl_insn_p): Removed.
	(policy_to_str): Moved.
	(loop_basic_block_p): Removed.
	(valid_sew_p): Removed.
	(vsetvl_insn_p): Moved.
	(vsetvl_vtype_change_only_p): Removed.
	(after_or_same_p): Removed.
	(before_p): Removed.
	(anticipatable_occurrence_p): Removed.
	(available_occurrence_p): Removed.
	(insn_should_be_added_p): Removed.
	(get_all_sets): Moved.
	(get_same_bb_set): Moved.
	(gen_vsetvl_pat): Removed.
	(calculate_vlmul): Moved.
	(get_max_int_sew): New.
	(emit_vsetvl_insn): Removed.
	(get_max_float_sew): New.
	(eliminate_insn): Removed.
	(insert_vsetvl): Removed.
	(count_regno_occurrences): Moved.
	(get_vl_vtype_info): Removed.
	(enum def_type): Moved.
	(validate_change_or_fail): Moved.
	(change_insn): Removed.
	(get_all_real_uses): Moved.
	(get_forward_read_vl_insn): Removed.
	(get_backward_fault_first_load_insn): Removed.
	(change_vsetvl_insn): Removed.
	(avl_source_has_vsetvl_p): Removed.
	(source_equal_p): Moved.
	(calculate_sew): Removed.
	(same_equiv_note_p): Moved.
	(get_expr_id): New.
	(incompatible_avl_p): Removed.
	(get_regno): New.
	(different_sew_p): Removed.
	(get_bb_index): New.
	(different_lmul_p): Removed.
	(has_no_uses): Moved.
	(different_ratio_p): Removed.
	(different_tail_policy_p): Removed.
	(different_mask_policy_p): Removed.
	(possible_zero_avl_p): Removed.
	(enum demand_flags): New.
	(second_ratio_invalid_for_first_sew_p): Removed.
	(second_ratio_invalid_for_first_lmul_p): Removed.
	(enum class): New.
	(float_insn_valid_sew_p): Removed.
	(second_sew_less_than_first_sew_p): Removed.
	(first_sew_less_than_second_sew_p): Removed.
	(class vsetvl_info): New.
	(compare_lmul): Removed.
	(second_lmul_less_than_first_lmul_p): Removed.
	(second_ratio_less_than_first_ratio_p): Removed.
	(DEF_INCOMPATIBLE_COND): Removed.
	(greatest_sew): Removed.
	(first_sew): Removed.
	(second_sew): Removed.
	(first_vlmul): Removed.
	(second_vlmul): Removed.
	(first_ratio): Removed.
	(second_ratio): Removed.
	(vlmul_for_first_sew_second_ratio): Removed.
	(vlmul_for_greatest_sew_second_ratio): Removed.
	(ratio_for_second_sew_first_vlmul): Removed.
	(class vsetvl_block_info): New.
	(DEF_SEW_LMUL_FUSE_RULE): New.
	(always_unavailable): Removed.
	(avl_unavailable_p): Removed.
	(class demand_system): New.
	(sew_unavailable_p): Removed.
	(lmul_unavailable_p): Removed.
	(ge_sew_unavailable_p): Removed.
	(ge_sew_lmul_unavailable_p): Removed.
	(ge_sew_ratio_unavailable_p): Removed.
	(DEF_UNAVAILABLE_COND): Removed.
	(same_sew_lmul_demand_p): Removed.
	(propagate_avl_across_demands_p): Removed.
	(reg_available_p): Removed.
	(support_relaxed_compatible_p): Removed.
	(demands_can_be_fused_p): Removed.
	(earliest_pred_can_be_fused_p): Removed.
	(vsetvl_dominated_by_p): Removed.
	(avl_info::avl_info): Removed.
	(avl_info::single_source_equal_p): Removed.
	(avl_info::multiple_source_equal_p): Removed.
	(DEF_SEW_LMUL_RULE): New.
	(avl_info::operator=): Removed.
	(avl_info::operator==): Removed.
	(DEF_POLICY_RULE): New.
	(avl_info::operator!=): Removed.
	(avl_info::has_non_zero_avl): Removed.
	(vl_vtype_info::vl_vtype_info): Removed.
	(vl_vtype_info::operator==): Removed.
	(DEF_AVL_RULE): New.
	(vl_vtype_info::operator!=): Removed.
	(vl_vtype_info::same_avl_p): Removed.
	(vl_vtype_info::same_vtype_p): Removed.
	(vl_vtype_info::same_vlmax_p): Removed.
	(vector_insn_info::operator>=): Removed.
	(vector_insn_info::operator==): Removed.
	(class pre_vsetvl): New.
	(vector_insn_info::parse_insn): Removed.
	(vector_insn_info::compatible_p): Removed.
	(vector_insn_info::skip_avl_compatible_p): Removed.
	(vector_insn_info::compatible_avl_p): Removed.
	(vector_insn_info::compatible_vtype_p): Removed.
	(vector_insn_info::available_p): Removed.
	(vector_insn_info::fuse_avl): Removed.
	(vector_insn_info::fuse_sew_lmul): Removed.
	(vector_insn_info::fuse_tail_policy): Removed.
	(vector_insn_info::fuse_mask_policy): Removed.
	(vector_insn_info::local_merge): Removed.
	(vector_insn_info::global_merge): Removed.
	(vector_insn_info::get_avl_or_vl_reg): Removed.
	(vector_insn_info::update_fault_first_load_avl): Removed.
	(vector_insn_info::dump): Removed.
	(vector_infos_manager::vector_infos_manager): Removed.
	(vector_infos_manager::create_expr): Removed.
	(vector_infos_manager::get_expr_id): Removed.
	(vector_infos_manager::all_same_ratio_p): Removed.
	(vector_infos_manager::all_avail_in_compatible_p): Removed.
	(vector_infos_manager::all_same_avl_p): Removed.
	(vector_infos_manager::expr_set_num): Removed.
	(vector_infos_manager::release): Removed.
	(vector_infos_manager::create_bitmap_vectors): Removed.
	(vector_infos_manager::free_bitmap_vectors): Removed.
	(vector_infos_manager::dump): Removed.
	(class pass_vsetvl): Adjust.
	(pass_vsetvl::get_vector_info): Removed.
	(pass_vsetvl::get_block_info): Removed.
	(pass_vsetvl::update_vector_info): Removed.
	(pass_vsetvl::update_block_info): Removed.
	(pre_vsetvl::compute_avl_def_data): New.
	(pass_vsetvl::simple_vsetvl): Removed.
	(pass_vsetvl::compute_local_backward_infos): Removed.
	(pass_vsetvl::need_vsetvl): Removed.
	(pass_vsetvl::transfer_before): Removed.
	(pass_vsetvl::transfer_after): Removed.
	(pre_vsetvl::compute_vsetvl_def_data): New.
	(pass_vsetvl::emit_local_forward_vsetvls): Removed.
	(pass_vsetvl::prune_expressions): Removed.
	(pass_vsetvl::compute_local_properties): Removed.
	(pre_vsetvl::compute_lcm_local_properties): New.
	(pass_vsetvl::earliest_fusion): Removed.
	(pre_vsetvl::fuse_local_vsetvl_info): New.
	(pass_vsetvl::vsetvl_fusion): Removed.
	(pass_vsetvl::can_refine_vsetvl_p): Removed.
	(pre_vsetvl::earliest_fuse_vsetvl_info): New.
	(pass_vsetvl::refine_vsetvls): Removed.
	(pass_vsetvl::cleanup_vsetvls): Removed.
	(pass_vsetvl::commit_vsetvls): Removed.
	(pass_vsetvl::pre_vsetvl): Removed.
	(pass_vsetvl::get_vsetvl_at_end): Removed.
	(local_avl_compatible_p): Removed.
	(pass_vsetvl::local_eliminate_vsetvl_insn): Removed.
	(pre_vsetvl::pre_global_vsetvl_info): New.
	(get_first_vsetvl_before_rvv_insns): Removed.
	(pass_vsetvl::global_eliminate_vsetvl_insn): Removed.
	(pre_vsetvl::emit_vsetvl): New.
	(pass_vsetvl::ssa_post_optimization): Removed.
	(pre_vsetvl::cleaup): New.
	(pre_vsetvl::remove_avl_operand): New.
	(pass_vsetvl::df_post_optimization): Removed.
	(pre_vsetvl::remove_unused_dest_operand): New.
	(pass_vsetvl::init): Removed.
	(pass_vsetvl::done): Removed.
	(pass_vsetvl::compute_probabilities): Removed.
	(pass_vsetvl::lazy_vsetvl): Adjust.
	(pass_vsetvl::execute): Adjust.
	* config/riscv/riscv-vsetvl.def (DEF_INCOMPATIBLE_COND): Removed.
	(DEF_SEW_LMUL_RULE): New.
	(DEF_SEW_LMUL_FUSE_RULE): Removed.
	(DEF_POLICY_RULE): New.
	(DEF_UNAVAILABLE_COND): Removed.
	(DEF_AVL_RULE): New demand type.
	(sew_lmul): New demand type.
	(ratio_only): New demand type.
	(sew_only): New demand type.
	(ge_sew): New demand type.
	(ratio_and_ge_sew): New demand type.
	(tail_mask_policy): New demand type.
	(tail_policy_only): New demand type.
	(mask_policy_only): New demand type.
	(ignore_policy): New demand type.
	(avl): New demand type.
	(non_zero_avl): New demand type.
	(ignore_avl): New demand type.
	* config/riscv/t-riscv: Removed riscv-vsetvl.h
	* config/riscv/riscv-vsetvl.h: Removed.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/scalar_move-1.c: Adjust.
	* gcc.target/riscv/rvv/vsetvl/avl_single-23.c: Adjust.
	* gcc.target/riscv/rvv/vsetvl/avl_single-46.c: Adjust.
	* gcc.target/riscv/rvv/vsetvl/avl_single-84.c: Adjust.
	* gcc.target/riscv/rvv/vsetvl/avl_single-89.c: Adjust.
	* gcc.target/riscv/rvv/vsetvl/avl_single-95.c: Adjust.
	* gcc.target/riscv/rvv/vsetvl/imm_bb_prop-1.c: Adjust.
	* gcc.target/riscv/rvv/vsetvl/pr109743-2.c: Adjust.
	* gcc.target/riscv/rvv/vsetvl/pr109773-1.c: Adjust.
	* gcc.target/riscv/rvv/base/pr111037-1.c: Moved to...
	* gcc.target/riscv/rvv/vsetvl/pr111037-1.c: ...here.
	* gcc.target/riscv/rvv/base/pr111037-2.c: Moved to...
	* gcc.target/riscv/rvv/vsetvl/pr111037-2.c: ...here.
	* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-25.c: Adjust.
	* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-26.c: Adjust.
	* gcc.target/riscv/rvv/vsetvl/vlmax_conflict-12.c: Adjust.
	* gcc.target/riscv/rvv/vsetvl/vlmax_conflict-3.c: Adjust.
	* gcc.target/riscv/rvv/vsetvl/vsetvl-13.c: Adjust.
	* gcc.target/riscv/rvv/vsetvl/vsetvl-18.c: Adjust.
	* gcc.target/riscv/rvv/vsetvl/vsetvl-23.c: Adjust.
	* gcc.target/riscv/rvv/vsetvl/avl_single-104.c: New test.
	* gcc.target/riscv/rvv/vsetvl/avl_single-105.c: New test.
	* gcc.target/riscv/rvv/vsetvl/avl_single-106.c: New test.
	* gcc.target/riscv/rvv/vsetvl/avl_single-107.c: New test.
	* gcc.target/riscv/rvv/vsetvl/avl_single-108.c: New test.
	* gcc.target/riscv/rvv/vsetvl/avl_single-109.c: New test.
	* gcc.target/riscv/rvv/vsetvl/pr111037-3.c: New test.
	* gcc.target/riscv/rvv/vsetvl/pr111037-4.c: New test.
2023-10-20 11:50:38 +08:00
Alexandre Oliva
df252e0f25 return edge in make_eh_edges
The need to initialize edge probabilities has made make_eh_edges
undesirably hard to use.  I suppose we don't want make_eh_edges to
initialize the probability of the newly-added edge itself, so that the
caller takes care of it, but identifying the added edge in need of
adjustments is inefficient and cumbersome.  Change make_eh_edges so
that it returns the added edge.


for  gcc/ChangeLog

	* tree-eh.cc (make_eh_edges): Return the new edge.
	* tree-eh.h (make_eh_edges): Likewise.
2023-10-20 00:35:17 -03:00
Nathaniel Shead
1d260ab0e3 c++: indirect change of active union member in constexpr [PR101631,PR102286]
This patch adds checks for attempting to change the active member of a
union by methods other than a member access expression.

To be able to properly distinguish `*(&u.a) = ` from `u.a = `, this
patch redoes the solution for c++/59950 to avoid extraneous *&; it
seems that the only case that needed the workaround was when copying
empty classes.

This patch also ensures that constructors for a union field mark that
field as the active member before entering the call itself; this ensures
that modifications of the field within the constructor's body don't
cause false positives (as these will not appear to be member access
expressions). This means that we no longer need to start the lifetime of
empty union members after the constructor body completes.

As a drive-by fix, this patch also ensures that value-initialised unions
are considered to have activated their initial member for the purpose of
checking stores and accesses, which catches some additional mistakes
pre-C++20.
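
To illustrate the distinction drawn above, here is a minimal sketch.  It is
not taken from the new testcases; the names U, read_first_member and
switch_via_member_access are made up for illustration, and the C++20 part
is guarded so the file still compiles in earlier standard modes.

```cpp
#include <cassert>

union U { int a; float b; };

// Empty-brace initialisation initialises the first member, `a`, which
// therefore becomes the active member and may be read in a constant
// expression.
constexpr int read_first_member() {
    U u{};
    return u.a;
}
static_assert(read_first_member() == 0, "first member is active");

#if __cplusplus >= 202002L
// C++20 only: an assignment through a member access expression may change
// the active member during constant evaluation.
constexpr int switch_via_member_access() {
    U u{};
    u.b = 2.5f;   // `b` becomes the active member
    u.a = 7;      // `a` becomes the active member again
    return u.a;
}
static_assert(switch_via_member_access() == 7, "direct member access");

// By contrast, an indirect store to an inactive member such as
//     *(&u.b) = 2.5f;
// is not a member access expression; the checks described above are meant
// to reject it in a constant expression.
#endif
```

The rejected form is left in a comment so that the sketch compiles.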

	PR c++/101631
	PR c++/102286

gcc/cp/ChangeLog:

	* call.cc (build_over_call): Fold more indirect refs for trivial
	assignment op.
	* class.cc (type_has_non_deleted_trivial_default_ctor): Create.
	* constexpr.cc (cxx_eval_call_expression): Start lifetime of
	union member before entering constructor.
	(cxx_eval_component_reference): Check against first member of
	value-initialised union.
	(cxx_eval_store_expression): Activate member for
	value-initialised union. Check for accessing inactive union
	member indirectly.
	* cp-tree.h (type_has_non_deleted_trivial_default_ctor):
	Forward declare.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp1y/constexpr-89336-3.C: Fix union initialisation.
	* g++.dg/cpp1y/constexpr-union6.C: New test.
	* g++.dg/cpp1y/constexpr-union7.C: New test.
	* g++.dg/cpp2a/constexpr-union2.C: New test.
	* g++.dg/cpp2a/constexpr-union3.C: New test.
	* g++.dg/cpp2a/constexpr-union4.C: New test.
	* g++.dg/cpp2a/constexpr-union5.C: New test.
	* g++.dg/cpp2a/constexpr-union6.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
2023-10-19 23:25:46 -04:00
Nathaniel Shead
b69ee50081 c++: Improve diagnostics for constexpr cast from void*
This patch improves the errors given when casting from void* in C++26 to
include the expected type if the types of the pointed-to objects were
not similar. It also ensures (for all standard modes) that void* casts
are checked even for DECL_ARTIFICIAL declarations, such as
lifetime-extended temporaries, and is only ignored for cases where we
know it's OK (e.g. source_location::current) or have no other choice
(heap-allocated data).

gcc/cp/ChangeLog:

	* constexpr.cc (is_std_source_location_current): New.
	(cxx_eval_constant_expression): Only ignore cast from void* for
	specific cases and improve other diagnostics.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp0x/constexpr-cast4.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Marek Polacek  <polacek@redhat.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
2023-10-19 23:25:31 -04:00
GCC Administrator
c85f74813f Daily bump. 2023-10-20 00:16:39 +00:00
Marek Polacek
4d81962ba0 c++: small tweak for cp_fold_r
This patch is an optimization tweak for cp_fold_r.  If we cp_fold_r the
COND_EXPR's op0 first, we may be able to evaluate it to a constant if -O.
cp_fold has:

        if (callee && DECL_DECLARED_CONSTEXPR_P (callee)
            && !flag_no_inline)
...
            r = maybe_constant_value (x, /*decl=*/NULL_TREE,

flag_no_inline is 1 for -O0:

  if (opts->x_optimize == 0)
    {
      /* Inlining does not work if not optimizing,
         so force it not to be done.  */
      opts->x_warn_inline = 0;
      opts->x_flag_no_inline = 1;
    }

but otherwise it's 0 and cp_fold will maybe_constant_value calls to
constexpr functions.  And if it doesn't, then folding the COND_EXPR
will keep both arms, and we can avoid calling maybe_constant_value.
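
As a made-up illustration of the kind of expression this affects (the
names wide_pointers and pick are hypothetical and do not come from the
patch or the testsuite), consider a conditional expression whose
condition is a call to a constexpr function:

```cpp
#include <cassert>

// A constexpr predicate.  When optimising, the front end may be able to
// evaluate a call to it to a constant while folding.
constexpr bool wide_pointers() { return sizeof(void *) >= 8; }

// The condition (op0 of the COND_EXPR) is a call to a constexpr function.
// If it folds to a constant, only one of the two arms needs to be kept.
int pick(int if_wide, int if_narrow) {
    return wide_pointers() ? if_wide : if_narrow;
}
```

Whether the condition is actually folded depends on the optimisation
level, as explained above; the result of pick is the same either way.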

gcc/cp/ChangeLog:

	* cp-gimplify.cc (cp_fold_r): Don't call maybe_constant_value.
2023-10-19 16:18:14 -04:00
Marek Polacek
86d0b08664 doc: Update contrib.texi
I noticed that Patrick is missing here.

gcc/ChangeLog:

	* doc/contrib.texi: Add entry for Patrick Palka.
2023-10-19 16:16:01 -04:00
Andre Vieira
d8e4e7def3 vect: Use inbranch simdclones in masked loops
This patch enables the compiler to use inbranch simdclones when generating
masked loops in autovectorization.
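
A minimal sketch of the kind of source this concerns, assuming a target
with partial vector support; scale and apply_scale are made-up names.
Without -fopenmp or -fopenmp-simd the pragma is ignored and scale is an
ordinary function.

```cpp
#include <cassert>
#include <cstddef>

// `inbranch` requests a simd clone that takes a mask argument.
#pragma omp declare simd inbranch
float scale(float x) { return 2.0f * x + 1.0f; }

// The trip count is not known to be a multiple of the vector length.  If
// the vectorizer chooses a fully masked loop, the loop mask can be passed
// to the masked (inbranch) clone.
void apply_scale(float *out, const float *in, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        out[i] = scale(in[i]);
}
```

Whether a masked clone is actually used depends on the target and on the
options in effect; the sketch only shows the shape of the input.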

gcc/ChangeLog:

	* omp-simd-clone.cc (simd_clone_adjust_argument_types): Make function
	compatible with mask parameters in clone.
	* tree-vect-stmts.cc (vect_build_all_ones_mask): Allow vector boolean
	typed masks.
	(vectorizable_simd_clone_call): Enable the use of masked clones in
	fully masked loops.
2023-10-19 18:30:25 +01:00
Andre Vieira
8b704ed0b8 vect: don't allow fully masked loops with non-masked simd clones [PR 110485]
When analyzing a loop and choosing a simdclone to use it is possible to choose
a simdclone that cannot be used 'inbranch' for a loop that can use partial
vectors.  This may lead to the vectorizer deciding to use partial vectors which
are not supported for notinbranch simd clones.  This patch fixes that by
disabling the use of partial vectors once a notinbranch simd clone has been
selected.

gcc/ChangeLog:

	PR tree-optimization/110485
	* tree-vect-stmts.cc (vectorizable_simd_clone_call): Disable partial
	vectors usage if a notinbranch simdclone has been selected.

gcc/testsuite/ChangeLog:

	* gcc.dg/gomp/pr110485.c: New test.
2023-10-19 18:30:25 +01:00
Andre Vieira
c9ce846763 vect: Fix vect_get_smallest_scalar_type for simd clones
The vect_get_smallest_scalar_type helper function was using any argument to a
simd clone call when trying to determine the smallest scalar type that would be
vectorized.  This included the function pointer type in a MASK_CALL for
instance, and would result in the wrong type being selected.  Instead this
patch special cases simd_clone_call's and uses only scalar types of the
original function that get transformed into vector types.

gcc/ChangeLog:

	* tree-vect-data-refs.cc (vect_get_smallest_scalar_type): Special case
	simd clone calls and only use types that are mapped to vectors.
	(simd_clone_call_p): New helper function.

gcc/testsuite/ChangeLog:

	* gcc.dg/vect/vect-simd-clone-16f.c: Remove unnecessary differentiation
	between targets with different pointer sizes.
	* gcc.dg/vect/vect-simd-clone-17f.c: Likewise.
	* gcc.dg/vect/vect-simd-clone-18f.c: Likewise.
2023-10-19 18:30:15 +01:00
Andre Vieira
53d40858c8 parloops: Allow poly nit and bound
Teach parloops how to handle a poly nit and bound ahead of the changes to
enable non-constant simdlen.

gcc/ChangeLog:

	* tree-parloops.cc (try_transform_to_exit_first_loop_alt): Accept
	poly NIT and ALT_BOUND.
2023-10-19 18:27:18 +01:00
Andre Vieira
87d97e2607 parloops: Copy target and optimizations when creating a function clone
SVE simd clones need to be compiled with an SVE target enabled, or the
argument types will not be created properly. To achieve this we need to copy
DECL_FUNCTION_SPECIFIC_TARGET from the original function declaration to the
clones.  I decided it was probably also a good idea to copy
DECL_FUNCTION_SPECIFIC_OPTIMIZATION in case the original function is meant to
be compiled with specific optimization options.

gcc/ChangeLog:

	* tree-parloops.cc (create_loop_fn): Copy specific target and
	optimization options to clone.
2023-10-19 18:26:45 +01:00
Andre Vieira
79a50a1740 omp: Replace simd_clone_subparts with TYPE_VECTOR_SUBPARTS
Refactor simd clone handling code ahead of support for poly simdlen.

gcc/ChangeLog:

	* omp-simd-clone.cc (simd_clone_subparts): Remove.
	(simd_clone_init_simd_arrays): Replace simd_clone_subparts with
	TYPE_VECTOR_SUBPARTS.
	(ipa_simd_modify_function_body): Likewise.
	* tree-vect-stmts.cc (vectorizable_simd_clone_call): Likewise.
	(simd_clone_subparts): Remove.
2023-10-19 18:26:12 +01:00
François Dumont
c714b4d30d libstdc++: [_Hashtable] Do not reuse untrusted cached hash code
On merge, reuse a merged node's possibly cached hash code only if we are on the
same type of hash and this hash is stateless.

Usage of function pointers or std::function as hash functor will prevent reusing
cached hash code.
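
A small sketch of a merge between containers whose hash functors differ.
LengthHash, FirstCharHash and merge_and_count are made-up names, not the
new test04 to test06 cases.  Because the two hash types disagree on the
same key, the destination has to hash each key with its own functor.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Two hash functors that disagree on the hash of the same key.
struct LengthHash {
    std::size_t operator()(const std::string &s) const { return s.size(); }
};
struct FirstCharHash {
    std::size_t operator()(const std::string &s) const {
        return s.empty() ? 0 : static_cast<unsigned char>(s[0]);
    }
};

// Merge a map hashed with FirstCharHash into one hashed with LengthHash.
std::size_t merge_and_count() {
    std::unordered_map<std::string, int, LengthHash> dst{{"one", 1}};
    std::unordered_map<std::string, int, FirstCharHash> src{{"two", 2},
                                                            {"one", 10}};
    dst.merge(src);  // "two" moves over; "one" stays in src (key exists)
    assert(dst.count("two") == 1 && src.count("one") == 1);
    return dst.size();
}
```

After the merge, lookups in dst must still find "two", which only works
if the node was placed using LengthHash.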

libstdc++-v3/ChangeLog

	* include/bits/hashtable_policy.h
	(_Hash_code_base::_M_hash_code(const _Hash&, const _Hash_node_value<>&)): Remove.
	(_Hash_code_base::_M_hash_code<_H2>(const _H2&, const _Hash_node_value<>&)): Remove.
	* include/bits/hashtable.h
	(_M_src_hash_code<_H2>(const _H2&, const key_type&, const __node_value_type&)): New.
	(_M_merge_unique<>, _M_merge_multi<>): Use latter.
	* testsuite/23_containers/unordered_map/modifiers/merge.cc
	(test04, test05, test06): New test cases.
2023-10-19 19:06:08 +02:00
Andrew Pinski
2454ba9e2d c: Fix ICE when an argument was an error mark [PR100532]
In the case of convert_argument, we would return the same expression
back rather than error_mark_node after the error message about
trying to convert to an incomplete type. This causes issues in
the gimplifier trying to see if another conversion is needed.

The code here dates back to before the revision history too so
it might be the case it never noticed we should return an error_mark_node.

Bootstrapped and tested on x86_64-linux-gnu with no regressions.

	PR c/100532

gcc/c/ChangeLog:

	* c-typeck.cc (convert_argument): After erroring out
	about an incomplete type return error_mark_node.

gcc/testsuite/ChangeLog:

	* gcc.dg/pr100532-1.c: New test.
2023-10-19 16:52:02 +00:00
Andrew Pinski
9f33e4c50e c: Don't warn about converting NULL to different sso endian [PR104822]
Just as we don't warn about converting a NULL pointer constant to a
different named address space, we should not warn about converting it to
a different sso endianness either.
This adds the simple check.

Bootstrapped and tested on x86_64-linux-gnu with no regressions.

	PR c/104822

gcc/c/ChangeLog:

	* c-typeck.cc (convert_for_assignment): Check for null pointer
	before warning about an incompatible scalar storage order.

gcc/testsuite/ChangeLog:

	* gcc.dg/sso-18.c: New test.
	* gcc.dg/sso-19.c: New test.
2023-10-19 09:49:13 -07:00
Jason Merrill
00e7c49fa0 ABOUT-GCC-NLS: add usage guidance
gcc/ChangeLog:

	* ABOUT-GCC-NLS: Add usage guidance.
2023-10-19 12:34:35 -04:00
Jason Merrill
1ec36bcda3 diagnostic: rename new permerror overloads
While checking another change, I noticed that the new permerror overloads
break gettext with "permerror used incompatibly as both
 --keyword=permerror:2 --flag=permerror:2:gcc-internal-format and
 --keyword=permerror:3 --flag=permerror:3:gcc-internal-format".  So let's
change the name.

gcc/ChangeLog:

	* diagnostic-core.h (permerror): Rename new overloads...
	(permerror_opt): To this.
	* diagnostic.cc: Likewise.

gcc/cp/ChangeLog:

	* typeck2.cc (check_narrowing): Adjust.
2023-10-19 11:44:13 -04:00
Jason Merrill
f53de2baae c++: use G_ instead of _
Since these strings are passed to error_at, they should be marked for
translation with G_, like other diagnostic messages, rather than _, which
forces immediate (redundant) translation.  The use of N_ is less
problematic, but also imprecise.
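
The redundancy can be sketched with stand-ins.  Everything here is
hypothetical: fake_gettext, report, TRANSLATE_NOW and MARK_ONLY are not
GCC's real definitions.  TRANSLATE_NOW plays the role of _, MARK_ONLY the
role of G_, and report the role of a diagnostic routine such as error_at,
which translates its message itself.

```cpp
#include <cassert>
#include <string>

static int translations = 0;

// Hypothetical stand-in for gettext: counts how often it is called.
static std::string fake_gettext(const std::string &msgid) {
    ++translations;
    return msgid;
}

// Hypothetical diagnostic routine: it translates its message itself.
static std::string report(const std::string &msgid) {
    return fake_gettext(msgid);
}

#define TRANSLATE_NOW(msgid) fake_gettext(msgid)  // role of _
#define MARK_ONLY(msgid) msgid                    // role of G_

// Translating at the call site and again inside report: two lookups.
int lookups_with_translate_now() {
    translations = 0;
    report(TRANSLATE_NOW("jump to label"));
    return translations;
}

// Only marking the string at the call site: one lookup, inside report.
int lookups_with_mark_only() {
    translations = 0;
    report(MARK_ONLY("jump to label"));
    return translations;
}
```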

gcc/cp/ChangeLog:

	* parser.cc (cp_parser_primary_expression): Use G_.
	(cp_parser_using_enum): Likewise.
	* decl.cc (identify_goto): Likewise.
2023-10-19 11:30:03 -04:00
Yannick Moy
04d6c74564 ada: Support new SPARK aspect Side_Effects
SPARK RM 6.1.11 introduces a new aspect Side_Effects to denote
those functions which may have output parameters, write global
variables, raise exceptions and not terminate. This adds support
for this aspect and the corresponding pragma in the frontend.

Handling of this aspect in the frontend is very similar to
the handling of aspect Extensions_Visible: both are Boolean
aspects whose expression should be static, they can be specified
on the same entities, with the same rule of inheritance from
overridden to overriding primitives for tagged types.

There is no impact on code generation.

gcc/ada/

	* aspects.ads: Add aspect Side_Effects.
	* contracts.adb (Add_Pre_Post_Condition)
	(Inherit_Subprogram_Contract): Add support for new contract.
	* contracts.ads: Update comments.
	* einfo-utils.adb (Get_Pragma): Add support.
	* einfo-utils.ads (Prag): Update comment.
	* errout.ads: Add explain codes.
	* par-prag.adb (Prag): Add support.
	* sem_ch13.adb (Analyze_Aspect_Specifications)
	(Check_Aspect_At_Freeze_Point): Add support.
	* sem_ch6.adb (Analyze_Subprogram_Body_Helper)
	(Analyze_Subprogram_Declaration): Call new analysis procedure to
	check SPARK legality rules.
	(Analyze_SPARK_Subprogram_Specification): New procedure to check
	SPARK legality rules. Use an explain code for the error.
	(Analyze_Subprogram_Specification): Move checks to new subprogram.
	This code was effectively dead, as the kind for parameters was set
	to E_Void at this point to detect early references.
	* sem_ch6.ads (Analyze_Subprogram_Specification): Add new
	procedure.
	* sem_prag.adb (Analyze_Depends_In_Decl_Part)
	(Analyze_Global_In_Decl_Part): Adapt legality check to apply only
	to functions without side-effects.
	(Analyze_If_Present): Extract functionality in new procedure
	Analyze_If_Present_Internal.
	(Analyze_If_Present_Internal): New procedure to analyze given
	pragma kind.
	(Analyze_Pragmas_If_Present): New procedure to analyze given
	pragma kind associated with a declaration.
	(Analyze_Pragma): Adapt support for Always_Terminates and
	Exceptional_Cases. Add support for Side_Effects. Make sure to call
	Analyze_If_Present to ensure pragma Side_Effects is analyzed prior
	to analyzing pragmas Global and Depends. Use explain codes for the
	errors.
	* sem_prag.ads (Analyze_Pragmas_If_Present): Add new procedure.
	* sem_util.adb (Is_Function_With_Side_Effects): New query function
	to determine if a function is a function with side-effects.
	* sem_util.ads (Is_Function_With_Side_Effects): Same.
	* snames.ads-tmpl: Declare new names for pragma and aspect.
	* doc/gnat_rm/implementation_defined_aspects.rst: Document new aspect.
	* doc/gnat_rm/implementation_defined_pragmas.rst: Document new pragma.
	* gnat_rm.texi: Regenerate.
2023-10-19 16:35:22 +02:00
Sheri Bernstein
c1fbfe5acb ada: Refactor code to remove GNATcheck violation
Rewrite for loop containing an exit (which violates GNATcheck
rule Exits_From_Conditional_Loops), to use a while loop
which contains the exit criteria in its condition.
Also, move special case of first time through loop, to come
before loop.

gcc/ada/

	* libgnat/s-imagef.adb (Set_Image_Fixed): Refactor loop.
2023-10-19 16:35:22 +02:00
Sheri Bernstein
0f3c634840 ada: Add pragma Annotate for GNATcheck exemptions
Exempt the GNATcheck rule "Unassigned_OUT_Parameters"
with the rationale "the OUT parameter is assigned by component".

gcc/ada/

	* libgnat/s-imguti.adb (Set_Decimal_Digits): Add pragma to exempt
	Unassigned_OUT_Parameters.
	(Set_Floating_Invalid_Value): Likewise.
2023-10-19 16:35:22 +02:00
Patrick Bernardi
1555d18143 ada: Document gnatbind -Q switch
Add documentation for the -Q gnatbind switch in GNAT User's Guide and
improve gnatbind's help output for the switch to emphasize that it adds the
requested number of stacks to the secondary stack pool generated by the
binder.

gcc/ada/

	* bindusg.adb (Display): Make it clear -Q adds to the number of
	secondary stacks generated by the binder.
	* doc/gnat_ugn/building_executable_programs_with_gnat.rst:
	Document the -Q gnatbind switch and fix references to old
	runtimes.
	* gnat-style.texi: Regenerate.
	* gnat_rm.texi: Regenerate.
	* gnat_ugn.texi: Regenerate.
2023-10-19 16:35:21 +02:00
Ronan Desplanques
0c29a990a6 ada: Seize opportunity to reuse List_Length
This patch is intended as a readability improvement. It doesn't
change the behavior of the compiler.

gcc/ada/

	* sem_ch3.adb (Constrain_Array): Replace manual list length
	computation by call to List_Length.
2023-10-19 16:35:21 +02:00
Piotr Trojanek
7b1b787baa ada: Simplify "not Present" with "No"
gcc/ada/

	* exp_aggr.adb (Expand_Container_Aggregate): Simplify with "No".
2023-10-19 16:35:21 +02:00
Lewis Hyatt
19cc4b9d74 c++: Make -Wunknown-pragmas controllable by #pragma GCC diagnostic [PR89038]
As noted on the PR, commit r13-1544, the fix for PR53431, did not handle
the specific case of -Wunknown-pragmas, because that warning is issued
during preprocessing, but not by libcpp directly (it comes from the
cb_def_pragma callback).  Address that by handling this pragma in
addition to libcpp pragmas during the early pragma handler.
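
A minimal sketch of the usage this is about; it is not the new testcase,
and hypothetical_vendor_pragma is a made-up pragma name that GCC does not
recognise.

```cpp
#include <cassert>

#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wunknown-pragmas"
// Unknown to GCC; with the diagnostic ignored above, no -Wunknown-pragmas
// warning is expected for this line.
#pragma hypothetical_vendor_pragma tune(3)
#pragma GCC diagnostic pop

int answer() { return 42; }
```

Compiling with -Wunknown-pragmas (or -Wall) is what makes the difference
visible.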

gcc/c-family/ChangeLog:

	PR c++/89038
	* c-pragma.cc (handle_pragma_diagnostic_impl):  Handle
	-Wunknown-pragmas during early processing.

gcc/testsuite/ChangeLog:

	PR c++/89038
	* c-c++-common/cpp/Wunknown-pragmas-1.c: New test.
2023-10-19 09:09:39 -04:00
Lewis Hyatt
202a214d68 libcpp: testsuite: Add test for fixed _Pragma bug [PR82335]
This PR was fixed by r12-4797 and r12-5454. Add test coverage from the PR
that is not represented elsewhere.

gcc/testsuite/ChangeLog:

	PR preprocessor/82335
	* c-c++-common/cpp/diagnostic-pragma-3.c: New test.
2023-10-19 09:08:55 -04:00
Tamar Christina
217a0fcb85 middle-end: don't create LC-SSA PHI variables for PHI nodes who dominate loop
As the testcase shows, when a PHI node dominates the loop there is no new
definition inside the loop.  As such there would be no PHI nodes to update.

When we maintain LCSSA form we create an intermediate node in between the two
loops to thread along the value.  However later on when we update the second
loop we don't have any PHI nodes to update and so adjust_phi_and_debug_stmts
does nothing.   This leaves us with an incorrect phi node.  Normally this does
nothing and just gets ignored.  But in the case of the vUSE chain we end up
corrupting the chain.

As such whenever a PHI node's argument dominates the loop, we should remove
the newly created PHI node after edge redirection.

The one exception to this is when the loop has been versioned.  In such cases
the versioned loop may not use the value but the second loop can.

When this happens and we add the loop guard, the guard block can't find the
original value to use unless the join block has the PHI.

The next refactoring in the series moves the formation of the guard block
inside peeling itself.  Here we have all the information and wouldn't
need to re-create it later.

gcc/ChangeLog:

	PR tree-optimization/111860
	* tree-vect-loop-manip.cc (slpeel_tree_duplicate_loop_to_edge_cfg):
	Remove PHI nodes that dominate loop.

gcc/testsuite/ChangeLog:

	PR tree-optimization/111860
	* gcc.dg/vect/pr111860.c: New test.
2023-10-19 13:44:01 +01:00
Richard Biener
beab5b95c5 tree-optimization/111131 - SLP for non-IFN gathers
The following implements SLP vectorization support for gathers
without relying on IFNs being pattern detected (and supported by
the target).  That includes support for emulated gathers but also
the legacy x86 builtin path.
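
A sketch of the kind of loop this concerns; it is a made-up example, not
one of the vect-gather testcases, and gather_pairs is a hypothetical
name.  Each iteration performs two indexed loads that feed a group of two
adjacent stores.

```cpp
#include <cassert>

// Two gathers from `table` per iteration, stored to adjacent elements of
// `out`, so the pair forms a candidate SLP group.
void gather_pairs(int *out, const int *table, const int *idx, int n) {
    for (int i = 0; i < n; ++i) {
        out[2 * i]     = table[idx[2 * i]];
        out[2 * i + 1] = table[idx[2 * i + 1]];
    }
}
```

Whether such a loop is actually SLP vectorized depends on the target and
on the options in effect.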

	PR tree-optimization/111131
	* tree-vect-loop.cc (update_epilogue_loop_vinfo): Make
	sure to update all gather/scatter stmt DRs, not only those
	that eventually got VMAT_GATHER_SCATTER set.
	* tree-vect-slp.cc (_slp_oprnd_info::first_gs_info): Add.
	(vect_get_and_check_slp_defs): Handle gathers/scatters,
	adding the offset as SLP operand and comparing base and scale.
	(vect_build_slp_tree_1): Handle gathers.
	(vect_build_slp_tree_2): Likewise.

	* gcc.dg/vect/vect-gather-1.c: Now expected to vectorize
	everywhere.
	* gcc.dg/vect/vect-gather-2.c: Expected to not SLP anywhere.
	Massage the scale case to more reliably produce a different
	one.  Scan for the specific messages.
	* gcc.dg/vect/vect-gather-3.c: Masked gather is also supported
	for AVX2, but not emulated.
	* gcc.dg/vect/vect-gather-4.c: Expected to not SLP anywhere.
	Massage to more properly ensure this.
	* gcc.dg/vect/tsvc/vect-tsvc-s353.c: Expect to vectorize
	everywhere.
2023-10-19 14:25:36 +02:00
Richard Biener
b068886dcd Refactor x86 vectorized gather path
The following moves the builtin decl gather vectorization path along
the internal function and emulated gather vectorization paths,
simplifying the existing function down to generating the call and
required conversions to the actual argument types.  This exposes
the unique support for twice as many offset or data vector
lanes.  It also makes the code path handle SLP
in principle (but SLP build needs adjustments for this, patch coming).

	* tree-vect-stmts.cc (vect_build_gather_load_calls): Rename
	to ...
	(vect_build_one_gather_load_call): ... this.  Refactor,
	inline widening/narrowing support ...
	(vectorizable_load): ... here, do gather vectorization
	with builtin decls along other gather vectorization.
2023-10-19 14:25:36 +02:00
Alex Coplan
947fb34a16 aarch64: Generalise TFmode load/store pair patterns
This patch generalises the TFmode load/store pair patterns to TImode and
TDmode.  This brings them in line with the DXmode patterns, and uses the
same technique with separate mode iterators (TX and TX2) to allow for
distinct modes in each arm of the load/store pair.

For example, in combination with the post-RA load/store pair fusion pass
in the following patch, this improves the codegen for the following
varargs testcase involving TImode stores:

void g(void *);
int foo(int x, ...)
{
    __builtin_va_list ap;
    __builtin_va_start (ap, x);
    g(&ap);
    __builtin_va_end (ap);
}

from:

foo:
.LFB0:
	stp	x29, x30, [sp, -240]!
.LCFI0:
	mov	w9, -56
	mov	w8, -128
	mov	x29, sp
	add	x10, sp, 176
	stp	x1, x2, [sp, 184]
	add	x1, sp, 240
	add	x0, sp, 16
	stp	x1, x1, [sp, 16]
	str	x10, [sp, 32]
	stp	w9, w8, [sp, 40]
	str	q0, [sp, 48]
	str	q1, [sp, 64]
	str	q2, [sp, 80]
	str	q3, [sp, 96]
	str	q4, [sp, 112]
	str	q5, [sp, 128]
	str	q6, [sp, 144]
	str	q7, [sp, 160]
	stp	x3, x4, [sp, 200]
	stp	x5, x6, [sp, 216]
	str	x7, [sp, 232]
	bl	g
	ldp	x29, x30, [sp], 240
.LCFI1:
	ret

to:

foo:
.LFB0:
	stp	x29, x30, [sp, -240]!
.LCFI0:
	mov	w9, -56
	mov	w8, -128
	mov	x29, sp
	add	x10, sp, 176
	stp	x1, x2, [sp, 184]
	add	x1, sp, 240
	add	x0, sp, 16
	stp	x1, x1, [sp, 16]
	str	x10, [sp, 32]
	stp	w9, w8, [sp, 40]
	stp	q0, q1, [sp, 48]
	stp	q2, q3, [sp, 80]
	stp	q4, q5, [sp, 112]
	stp	q6, q7, [sp, 144]
	stp	x3, x4, [sp, 200]
	stp	x5, x6, [sp, 216]
	str	x7, [sp, 232]
	bl	g
	ldp	x29, x30, [sp], 240
.LCFI1:
	ret

Note that this patch isn't needed if we only use the mode
canonicalization approach in the new ldp fusion pass (since we
canonicalize T{I,F,D}mode to V16QImode), but we seem to get slightly
better performance with mode canonicalization disabled (see
--param=aarch64-ldp-canonicalize-modes in the following patch).

gcc/ChangeLog:

	* config/aarch64/aarch64.md (load_pair_dw_tftf): Rename to ...
	(load_pair_dw_<TX:mode><TX2:mode>): ... this.
	(store_pair_dw_tftf): Rename to ...
	(store_pair_dw_<TX:mode><TX2:mode>): ... this.
	* config/aarch64/iterators.md (TX2): New.
2023-10-19 11:12:23 +01:00
Alex Coplan
61ea0a89c6 aarch64, testsuite: Fix up pr71727.c
The test is trying to check that we don't use q-register stores with
-mstrict-align, so actually check specifically for that.

This is a prerequisite to avoid regressing:

scan-assembler-not "add\tx0, x0, :"

with the upcoming ldp fusion pass, as we change where the ldps are
formed such that a register is used rather than a symbolic (lo_sum)
address for the first load.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/pr71727.c: Adjust scan-assembler-not to
	make sure we don't have q-register stores with -mstrict-align.
2023-10-19 11:12:23 +01:00
Alex Coplan
cf776eebe8 aarch64, testsuite: Tweak sve/pcs/args_9.c to allow stps
With the new ldp/stp pass enabled, there is a change in the codegen for
this test as follows:

        add     x8, sp, 16
        ptrue   p3.h, mul3
        str     p3, [x8]
-       str     x8, [sp, 8]
-       str     x9, [sp]
+       stp     x9, x8, [sp]
        ptrue   p3.d, vl8
        ptrue   p2.s, vl7
        ptrue   p1.h, vl6

i.e. we now form an stp that we were missing previously. This patch
adjusts the scan-assembler such that it should pass whether or not
we form the stp.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/sve/pcs/args_9.c: Adjust scan-assemblers to
	allow for stp.
2023-10-19 11:12:23 +01:00
Alex Coplan
583ca5f599 aarch64, testsuite: Prevent stp in lr_free_1.c
The test is looking for individual stores which are able to be merged
into stp instructions.  The test currently passes -fno-schedule-fusion
-fno-peephole2, presumably to prevent these stores from being turned
into stps, but this is no longer sufficient with the new ldp/stp fusion
pass.

As such, we add --param=aarch64-stp-policy=never to prevent stps being
formed.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/lr_free_1.c: Add
	--param=aarch64-stp-policy=never to dg-options.
2023-10-19 11:12:23 +01:00