This patch fixes an unused variable, reported as below:
../../gcc/common/config/riscv/riscv-common.cc: In static member function
'static riscv_subset_list* riscv_subset_list::parse(const char*, location_t)':
../../gcc/common/config/riscv/riscv-common.cc:1501:19: error: unused variable 'itr'
[-Werror=unused-variable]
1501 | riscv_subset_t *itr;
The code that consumed the variable was removed previously, but the
declaration itself was missed. Thus, we have an unused variable here.
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc (riscv_subset_list::parse):
Remove unused var decl.
Signed-off-by: Pan Li <pan2.li@intel.com>
We were assuming TYPE_NO_NAMED_ARGS_STDARG_P functions don't have any named
arguments and there is nothing to advance, but that is not the case
for (...) functions returning by hidden reference which have one such
artificial argument. This is causing gcc.dg/c23-stdarg-{6,8,9}.c to
fail.
Fix the issue by checking if arg.type is NULL, as r14-9503 explains.
gcc/ChangeLog:
PR target/114175
* config/mips/mips.cc (mips_setup_incoming_varargs): Only skip
mips_function_arg_advance for TYPE_NO_NAMED_ARGS_STDARG_P
functions if arg.type is NULL.
There was a typo in the testcase, with GCC_CPUINFO pointing to the
wrong file.
2024-03-29 Christophe Lyon <christophe.lyon@linaro.org>
gcc/testsuite/
* gcc.target/aarch64/cpunative/native_cpu_24.c: Fix GCC_CPUINFO.
It was mistakenly added to these files.
libstdc++-v3/ChangeLog:
* testsuite/24_iterators/range_generators/01.cc: Drop GCC
Runtime Library Exception.
* testsuite/24_iterators/range_generators/02.cc: Drop GCC
Runtime Library Exception.
* testsuite/24_iterators/range_generators/copy.cc: Drop GCC
Runtime Library Exception.
* testsuite/24_iterators/range_generators/except.cc: Drop GCC
Runtime Library Exception.
* testsuite/24_iterators/range_generators/subrange.cc: Drop GCC
Runtime Library Exception.
* testsuite/24_iterators/range_generators/synopsis.cc: Drop GCC
Runtime Library Exception.
* testsuite/24_iterators/range_generators/iter_deref_return.cc:
Drop GCC Runtime Library Exception from the "You should have
received a copy" paragraph.
... as made apparent by commit 4e1fcf44bd
"testsuite: vect: Require vect_hw_misalign in gcc.dg/vect/vect-cost-model-1.c etc. [PR98238]"
causing:
PASS: gcc.dg/vect/vect-cost-model-1.c (test for excess errors)
-PASS: gcc.dg/vect/vect-cost-model-1.c scan-tree-dump vect "LOOP VECTORIZED"
PASS: gcc.dg/vect/vect-cost-model-3.c (test for excess errors)
-PASS: gcc.dg/vect/vect-cost-model-3.c scan-tree-dump vect "LOOP VECTORIZED"
PASS: gcc.dg/vect/vect-cost-model-5.c (test for excess errors)
-PASS: gcc.dg/vect/vect-cost-model-5.c scan-tree-dump vect "LOOP VECTORIZED"
..., and similarly commit ffd47fb63d
"testsuite: Fix pr113431.c FAIL on sparc* [PR113431]" causing:
PASS: gcc.dg/vect/pr113431.c (test for excess errors)
PASS: gcc.dg/vect/pr113431.c execution test
-PASS: gcc.dg/vect/pr113431.c scan-tree-dump-times slp1 "optimized: basic block part vectorized" 2
..., all of which this commit restores; it also enables a good number of further
FAIL -> PASS, UNSUPPORTED -> PASS, etc. progressions. There are also a small
number of regressions, mostly in the SLP area apparently:
PASS: gcc.dg/vect/bb-slp-layout-12.c (test for excess errors)
+XPASS: gcc.dg/vect/bb-slp-layout-12.c scan-tree-dump-not slp1 "duplicating permutation node"
+XFAIL: gcc.dg/vect/bb-slp-layout-12.c scan-tree-dump-times slp1 "add new stmt: [^\\n\\r]* = VEC_PERM_EXPR" 3
PASS: gcc.dg/vect/bb-slp-layout-6.c (test for excess errors)
+FAIL: gcc.dg/vect/bb-slp-layout-6.c scan-tree-dump slp2 "absorbing input layouts"
PASS: gcc.dg/vect/pr97428.c (test for excess errors)
PASS: gcc.dg/vect/pr97428.c scan-tree-dump vect "Detected interleaving load of size 8"
PASS: gcc.dg/vect/pr97428.c scan-tree-dump vect "Detected interleaving store of size 16"
PASS: gcc.dg/vect/pr97428.c scan-tree-dump-not vect "gap of 6 elements"
-XFAIL: gcc.dg/vect/pr97428.c scan-tree-dump-times vect "vectorizing stmts using SLP" 2
+FAIL: gcc.dg/vect/pr97428.c scan-tree-dump-times vect "vectorizing stmts using SLP" 2
PASS: gcc.dg/vect/vect-33.c (test for excess errors)
+FAIL: gcc.dg/vect/vect-33.c scan-tree-dump vect "Vectorizing an unaligned access"
PASS: gcc.dg/vect/vect-33.c scan-tree-dump-not optimized "Invalid sum"
PASS: gcc.dg/vect/vect-33.c scan-tree-dump-times vect "vectorized 1 loops" 1
..., so some further conditionalizing etc. seems necessary. These seem to
mostly appear next to pre-existing similar FAILs in related test cases.
(Overall, way more PASS than FAIL.)
gcc/testsuite/
* lib/target-supports.exp
(check_effective_target_vect_hw_misalign): Enable for GCN.
(check_effective_target_vect_element_align): Adjust.
Via XPASSing test cases after commit a657c7e351
"testsuite: un-xfail TSVC loops that check for exit control flow vectorization":
PASS: gcc.dg/vect/tsvc/vect-tsvc-s332.c (test for excess errors)
PASS: gcc.dg/vect/tsvc/vect-tsvc-s332.c execution test
[-XFAIL:-]{+XPASS:+} gcc.dg/vect/tsvc/vect-tsvc-s332.c scan-tree-dump vect "vectorized 1 loops"
PASS: gcc.dg/vect/tsvc/vect-tsvc-s481.c (test for excess errors)
PASS: gcc.dg/vect/tsvc/vect-tsvc-s481.c execution test
[-XFAIL:-]{+XPASS:+} gcc.dg/vect/tsvc/vect-tsvc-s481.c scan-tree-dump vect "vectorized 1 loops"
PASS: gcc.dg/vect/tsvc/vect-tsvc-s482.c (test for excess errors)
PASS: gcc.dg/vect/tsvc/vect-tsvc-s482.c execution test
[-XFAIL:-]{+XPASS:+} gcc.dg/vect/tsvc/vect-tsvc-s482.c scan-tree-dump vect "vectorized 1 loops"
..., it became apparent that GCN, too, does support vectorization of loops with
early breaks. The relevant test cases are all-PASS with just the following
exceptions, to be looked into individually, later on:
PASS: gcc.dg/vect/vect-early-break_25.c (test for excess errors)
PASS: gcc.dg/vect/vect-early-break_25.c scan-tree-dump-times vect "vectorized 1 loops" 1
FAIL: gcc.dg/vect/vect-early-break_25.c scan-tree-dump-times vect "Alignment of access forced using peeling" 1
PASS: gcc.dg/vect/vect-early-break_56.c (test for excess errors)
PASS: gcc.dg/vect/vect-early-break_56.c execution test
XPASS: gcc.dg/vect/vect-early-break_56.c scan-tree-dump-times vect "vectorized 2 loops" 2
gcc/testsuite/
* lib/target-supports.exp
(check_effective_target_vect_early_break)
(check_effective_target_vect_early_break_hw): Enable for GCN.
2024-03-29 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/36337
PR fortran/110987
PR fortran/113885
* trans-expr.cc (gfc_trans_assignment_1): Place finalization
block before rhs post block for elemental rhs.
* trans.cc (gfc_finalize_tree_expr): Check directly if a type
has no components, rather than the zero components attribute.
Treat elemental zero component expressions in the same way as
scalars.
gcc/testsuite/
PR fortran/113885
* gfortran.dg/finalize_54.f90: New test.
* gfortran.dg/finalize_55.f90: New test.
gcc/testsuite/
PR fortran/110987
* gfortran.dg/finalize_56.f90: New test.
This changes an internal error into a fatal error for the case where ZSTD
is not enabled but the section was compressed with ZSTD.
Committed as approved after bootstrap/test on x86_64-linux-gnu.
gcc/ChangeLog:
* lto-compress.cc (lto_end_uncompression): Use
fatal_error instead of internal_error when ZSTD
is not enabled.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Recently I've fixed two wrong FP vector negate implementations which
caused wrong sign bits in zeros on some targets (r14-8786 and r14-8801). To
prevent a similar issue from happening again, add a test case.
Tested on x86_64 (with SSE2, AVX, AVX2, and AVX512F), AArch64, MIPS
(with MSA), LoongArch (with LSX and LASX).
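As a minimal sketch of the property such a test guards (this is not the
committed test; the helper names are made up for illustration), negation
must flip the sign bit of every element, including +0.0 and -0.0:

```c
#include <math.h>
#include <stddef.h>

/* Negate every element.  A simple loop like this is the kind of code the
   vectorizer may turn into a vector negate.  */
static void
negate_all (double *dst, const double *src, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = -src[i];
}

/* Return 1 if the sign bit of every dst[i] is the inverse of the sign bit
   of src[i], 0 otherwise.  Comparing with == would not catch a wrong sign
   on zero, because +0.0 == -0.0.  */
static int
signs_flipped (const double *dst, const double *src, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (!signbit (dst[i]) == !signbit (src[i]))
      return 0;
  return 1;
}
```

The check uses signbit rather than a value comparison, since a negate
implemented as a subtraction from zero yields +0.0 for a +0.0 input.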
gcc/testsuite:
* gcc.dg/vect/vect-neg-zero.c: New test.
Segher's recent combine change, quite unexpectedly, triggered a regression on
the H8 port. It failed to build newlib.
The zero_extendqihi2 pattern provided two alternatives. One where the source
and destination matched. That turns into a suitable instruction trivially.
The second alternative was actually meant to capture cases where the value is
coming from memory.
What was missing here was the reg->reg case where the source and destination do
not match. That fell into the second case which was requested to be split by
the pattern's output template.
The splitter had a suitable condition to make sure it only triggered in the
right cases. Unfortunately, the pattern required a split in a case where the
splitter was going to fail, which led to the fault.
So regardless of what's going on in the combiner, this code was just wrong.
Fixed thusly by providing a suitable output template for the reg->reg case.
Regression tested on h8300-elf. Pushing to the trunk.
gcc/
* config/h8300/extensions.md (zero_extendqihi*): Add output
template for reg->reg case where the regs don't match.
2024-03-28 John David Anglin <danglin@gcc.gnu.org>
gcc/testsuite/ChangeLog:
PR analyzer/111289
* c-c++-common/analyzer/stdarg-pr111289-int.c: Don't include
<limits.h>.
The requirement that a type argument be complete is excessive in the case of
direct reference binding to the same type, which does not rely on any
properties of the type. This is LWG 2939.
PR c++/100667
gcc/cp/ChangeLog:
* semantics.cc (same_type_ref_bind_p): New.
(finish_trait_expr): Use it.
gcc/testsuite/ChangeLog:
* g++.dg/ext/is_constructible8.C: New test.
When matching actual arguments in match_actual_arg, these are initially
treated as a possible dummy procedure, assuming that the correct type is
determined later. This resolution could fail when the procedure is a
derived type constructor with a pointer component and appears in a DATA
statement, where the pointer shall be associated with an initial data
target. Check for those cases where the type obviously has not been
resolved yet, and which were missed because there was no component
reference.
gcc/fortran/ChangeLog:
PR fortran/114474
* primary.cc (gfc_variable_attr): Catch variables used in structure
constructors within DATA statements that are still tagged with a
temporary type BT_PROCEDURE from match_actual_arg and which have the
target attribute, and fix their typespec.
gcc/testsuite/ChangeLog:
PR fortran/114474
* gfortran.dg/data_pointer_3.f90: New test.
Per classic Vector calling convention ABI, vtype is call clobbered,
so ensure gcc regenerates a VSETVLI in following cases:
- after a function call.
- after an inline asm fragment which clobbers vtype.
ATM gcc seems to be doing the right thing, but a test can never hurt.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/vtype-call-clobbered.c: New Test.
Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
The error recovery causes misleading error messages to appear if an
EXPORT and IMPORT statement are in the wrong order. This patch
detects the incorrect order and issues an error message and prevents
error recovery. The fix should be improved and made more general if
another similar case is required.
gcc/m2/ChangeLog:
PR modula2/114520
* gm2-compiler/P0SyntaxCheck.bnf (DetectImport): New
procedure.
(EnableImportCheck): New boolean.
(Expect): Call DetectImport.
(Export): Set EnableImportCheck TRUE before ';' and FALSE
afterwards.
Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
This patch allows -fno-cpp to be supplied to gm2. Without this patch
it causes an ICE. The patch allows -fno-cpp to turn off cpp flags.
These are tested in m2.flex to decide whether a change of state is
allowed (enabling handling of #line directives).
gcc/ChangeLog:
PR modula2/114517
* doc/gm2.texi: Mention gm2 treats a # in the first column
as a preprocessor directive unless -fno-cpp is supplied.
gcc/m2/ChangeLog:
PR modula2/114517
* gm2-compiler/M2Options.def (SetCpp): Add comment.
(GetCpp): Move after SetCpp.
(GetLineDirectives): New procedure function.
* gm2-compiler/M2Options.mod (GetLineDirectives): New
procedure function.
* gm2-gcc/m2options.h (M2Options_GetLineDirectives): New
prototype.
* gm2-lang.cc (gm2_langhook_init_options): OPT_fcpp only
assert if !value.
* m2.flex: Test GetLineDirectives before changing to LINE0
state.
gcc/testsuite/ChangeLog:
PR modula2/114517
* gm2/cpp/fail/hashfirstcolumn2.mod: New test.
* gm2/imports/fail/imports-fail.exp: New test.
* gm2/imports/fail/localmodule2.mod: New test.
* gm2/imports/run/pass/localmodule.mod: New test.
Signed-off-by: Gaius Mulley <(no_default)>
I've noticed a typo in a comment.
2024-03-28 Jakub Jelinek <jakub@redhat.com>
* predict.cc (estimate_bb_frequencies): Fix comment typo,
scalling -> scaling.
The testcase in the patch ICEs with
--- gcc/tree-scalar-evolution.cc
+++ gcc/tree-scalar-evolution.cc
@@ -3881,7 +3881,7 @@ final_value_replacement_loop (class loop *loop)
/* Propagate constants immediately, but leave an unused initialization
around to avoid invalidating the SCEV cache. */
- if (CONSTANT_CLASS_P (def) && !SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rslt))
+ if (0 && CONSTANT_CLASS_P (def) && !SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rslt))
replace_uses_by (rslt, def);
/* Create the replacement statements. */
(the addition of the above made the ICE latent), because profile_count
addition doesn't check for overflows and if unlucky, we can even overflow
into the uninitialized value.
Getting really huge profile counts is very easy even when not using
recursive inlining in loops, e.g.
__attribute__((noipa)) void
bar (void)
{
__builtin_exit (0);
}
__attribute__((noipa)) void
foo (void)
{
for (int i = 0; i < 1000; ++i)
for (int j = 0; j < 1000; ++j)
for (int k = 0; k < 1000; ++k)
for (int l = 0; l < 1000; ++l)
for (int m = 0; m < 1000; ++m)
for (int n = 0; n < 1000; ++n)
for (int o = 0; o < 1000; ++o)
for (int p = 0; p < 1000; ++p)
for (int q = 0; q < 1000; ++q)
for (int r = 0; r < 1000; ++r)
for (int s = 0; s < 1000; ++s)
for (int t = 0; t < 1000; ++t)
for (int u = 0; u < 1000; ++u)
for (int v = 0; v < 1000; ++v)
for (int w = 0; w < 1000; ++w)
for (int x = 0; x < 1000; ++x)
for (int y = 0; y < 1000; ++y)
for (int z = 0; z < 1000; ++z)
for (int a = 0; a < 1000; ++a)
for (int b = 0; b < 1000; ++b)
bar ();
}
int
main ()
{
foo ();
}
reaches the maximum count already on the 11th loop.
Some other methods of profile_count like apply_scale already
do use MIN (val, max_count) before assignment to m_val, this patch
just extends that to operator{+,+=} methods.
Furthermore, one overload of apply_probability wasn't using
safe_scale_64bit and so could very easily overflow as well
- prob is required to be [0, 10000] and if m_val is near the max_count,
it can overflow even with multiplications by 8.
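The two ideas above can be sketched in plain C as follows. This is only an
illustration under assumptions, not the code in profile-count.h: the 61-bit
layout, the macro names and the helpers count_add and count_scale are
hypothetical, and count_scale truncates where the real code may round.

```c
#include <stdint.h>

/* Assumed layout for illustration: counts occupy 61 bits and the all-ones
   pattern is reserved for "uninitialized", so the largest real count is
   one below it.  */
#define COUNT_BITS 61
#define UNINITIALIZED_COUNT ((UINT64_C (1) << COUNT_BITS) - 1)
#define MAX_COUNT (UNINITIALIZED_COUNT - 1)

/* Saturating addition.  Both inputs are at most MAX_COUNT (below 2^61),
   so the 64-bit sum cannot wrap; clamping keeps it a valid count and
   keeps it from landing on UNINITIALIZED_COUNT.  */
static uint64_t
count_add (uint64_t a, uint64_t b)
{
  uint64_t sum = a + b;
  return sum < MAX_COUNT ? sum : MAX_COUNT;
}

/* Compute VAL * NUM / DEN (truncating) for NUM <= DEN without ever forming
   VAL * NUM, which could exceed 64 bits.  Writing VAL = Q * DEN + R gives
   VAL * NUM / DEN = Q * NUM + R * NUM / DEN, and both terms stay small.  */
static uint64_t
count_scale (uint64_t val, uint32_t num, uint32_t den)
{
  return (val / den) * num + (val % den) * num / den;
}
```

With den = 10000 this mirrors scaling a count by a probability in
[0, 10000]: the naive val * num overflows 64 bits for val near 2^61 as soon
as num reaches 8, while the split form stays in range.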
2024-03-28 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/112303
* profile-count.h (profile_count::operator+): Perform
addition in uint64_t variable and set m_val to MIN of that
val and max_count.
(profile_count::operator+=): Likewise.
(profile_count::operator-=): Formatting fix.
(profile_count::apply_probability): Use safe_scale_64bit
even in the int overload.
* gcc.c-torture/compile/pr112303.c: New test.
Testsuites driven by vect.exp rely on check_vect_support_and_set_flags
to set appropriate DEFAULT_VECTFLAGS for a given target (e.g., add
-mfpu=neon for arm-linux-gnueabi). Unfortunately, these flags are
overwritten by the dg-options directive, which can cause tests to fail.
Behavior of dg-options is documented in vect.exp files, but not
all developers look at the .exp file when adding a new testcase.
This caused a few dg-options directives to be used instead of
the more appropriate dg-additional-options.
This patch changes target-independent dg-options into
dg-additional-options. This patch does not touch target-specific
dg-options and target-specific tests to avoid disturbing the gentle
balance of target-specific vectorization.
This patch also removes a couple of unneeded "dg-do run" directives
to avoid failures on compile-only targets. Default action is, again,
set by check_vect_support_and_set_flags.
Lastly, I avoided renaming tests that use -O<n> options to O<n>-*
filename format because this support is not consistent between
gcc.dg/vect/, g++.dg/vect/, and gfortran.dg/vect/ testsuites.
It seems dg-additional-options is cleaner.
This patch does the following:
1. do not change target-specific tests, e.g., gcc.dg/vect/costmodel/riscv/*;
2. do not change { dg-options FOO { target { target-*-pattern } } };
3. do not remove { dg-do run { target { target-*-pattern } } };
4. change { dg-options FOO } to { dg-additional-options FOO };
5. remove { dg-do run } in several tests, where it is clearly not needed.
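As a sketch of item 4, a target-independent test under gcc.dg/vect/ would
carry its extra flags as shown below. The flag and the function are made up
for illustration and are not taken from any of the changed tests:

```c
/* { dg-additional-options "-O3" } */

/* The directive above appends -O3 to the flags that vect.exp selected for
   the target.  A plain dg-options directive here would replace those
   flags instead.  No dg-do directive is given, so the default action
   chosen by vect.exp applies.  */

int
sum_bytes (const unsigned char *p, int n)
{
  int sum = 0;
  for (int i = 0; i < n; i++)
    sum += p[i];
  return sum;
}
```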
gcc/testsuite/ChangeLog:
PR testsuite/114307
* gcc.dg/vect/vect-cond-reduc-in-order-2-signed-zero.c: Remove dg-do run.
* gcc.dg/vect/complex/complex-operations-run.c: Likewise.
* gcc.dg/vect/pr113576.c: Remove dg-do run. Use dg-additional-options for
test-specific flags.
* gcc.dg/vect/gimplefe-40.c: Use dg-additional-options for
test-specific flags.
* gcc.dg/vect/gimplefe-41.c: Likewise.
* gcc.dg/vect/pr101145inf.c: Likewise.
* gcc.dg/vect/pr101145inf_1.c: Likewise.
* gcc.dg/vect/pr108316.c: Likewise.
* gcc.dg/vect/pr109011-1.c: Likewise.
* gcc.dg/vect/pr109011-2.c: Likewise.
* gcc.dg/vect/pr109011-3.c: Likewise.
* gcc.dg/vect/pr109011-4.c: Likewise.
* gcc.dg/vect/pr109011-5.c: Likewise.
* gcc.dg/vect/pr111846.c: Likewise.
* gcc.dg/vect/pr111860-2.c: Likewise.
* gcc.dg/vect/pr111860-3.c: Likewise.
* gcc.dg/vect/pr113002.c: Likewise.
* gcc.dg/vect/pr84711.c: Likewise.
* gcc.dg/vect/pr85597.c: Likewise.
* gcc.dg/vect/pr88497-1.c: Likewise.
* gcc.dg/vect/pr88497-2.c: Likewise.
* gcc.dg/vect/pr88497-3.c: Likewise.
* gcc.dg/vect/pr88497-4.c: Likewise.
* gcc.dg/vect/pr88497-5.c: Likewise.
* gcc.dg/vect/pr88497-7.c: Likewise.
* gcc.dg/vect/pr92347.c: Likewise.
* gcc.dg/vect/pr93069.c: Likewise.
* gcc.dg/vect/pr97241.c: Likewise.
* gcc.dg/vect/pr99102.c: Likewise.
* gcc.dg/vect/vect-early-break_65.c: Likewise.
* gcc.dg/vect/vect-fold-1.c: Likewise.
* gcc.dg/vect/vect-ifcvt-19.c: Likewise.
* gcc.dg/vect/vect-ifcvt-20.c: Likewise.
* gcc.dg/vect/vect-reduc-epilogue-gaps.c: Likewise.
* gcc.dg/vect/vect-singleton_1.c: Likewise.
* g++.dg/vect/pr84556.cc: Likewise.
* gfortran.dg/vect/fast-math-mgrid-resid.f: Likewise.
* gfortran.dg/vect/pr77848.f: Likewise.
* gfortran.dg/vect/pr90913.f90: Likewise.
This patch fixes a cache collision on functions whose bodies differ only by
constants at PHI operands, as in:
if (test)
a = cst1;
else
a = cst2;
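For illustration only (the function names and constants are made up), two
complete functions of that shape could look as follows. In SSA form each
one merges its two assignments through a PHI node, and the constants are
the PHI operands, so the bodies are otherwise identical:

```c
/* Returns 1 or 2 depending on TEST.  */
static int
pick_small (int test)
{
  int a;
  if (test)
    a = 1;
  else
    a = 2;
  return a;
}

/* Same control flow as pick_small; only the constants differ.  A hash that
   ignores PHI operands would put both functions in the same bucket.  */
static int
pick_large (int test)
{
  int a;
  if (test)
    a = 100;
  else
    a = 200;
  return a;
}
```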
gcc/ChangeLog:
PR middle-end/113907
* ipa-icf.cc (sem_function::init): Hash PHI operands.
(sem_function::compare_phi_node): Add argument about preserving order.
This testcase was made latent by r14-4089 and got fixed both
on the trunk and 13 branch with PR113372 fix.
Adding testcase to the testsuite and will close the PR as a dup.
2024-03-28 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/109925
* gcc.c-torture/execute/pr109925.c: New test.
The patch fixing PR111781 made the check of specification expressions more
restrictive, disallowing local variables in specification expressions of
dummy arguments. PR114475 showed an example where that change regressed,
disallowing in submodules expressions that had been allowed in the parent
module. In submodules indeed, the hierarchy of namespaces inherited from
the parent module is not reproduced so the host-association of symbols
can't be recognized by checking the nesting of namespaces.
This change fixes the problem by allowing in specification expressions
all the symbols in a submodule that are inherited from the parent module.
PR fortran/111781
PR fortran/114475
gcc/fortran/ChangeLog:
* expr.cc (check_restricted): In submodules, allow variables host-
associated from the parent module.
gcc/testsuite/ChangeLog:
* gfortran.dg/spec_expr_10.f90: New test.
Co-authored-by: Harald Anlauf <anlauf@gmx.de>
This patch rebuilds the documentation for the target independent
library sections.
gcc/m2/ChangeLog:
* target-independent/m2/Builtins.texi: Rebuilt.
* target-independent/m2/gm2-libs.texi: Rebuilt.
Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
The testcase in this PR shows very slow IDF compute:
tree SSA rewrite : 76.99 ( 31%)
24.78% 243663 cc1plus cc1plus [.] compute_idf
which can be mitigated to some extent by refactoring the bitmap
operations to simpler variants. With the patch below this becomes
tree SSA rewrite : 15.23 ( 8%)
when not optimizing and in addition to that
tree SSA incremental : 181.52 ( 30%)
to
tree SSA incremental : 24.09 ( 6%)
when optimizing.
PR middle-end/114480
* cfganal.cc (compute_idf): Use simpler bitmap iteration,
touch work_set only when phi_insertion_points changed.
We aren't doing anything with vxsat right now, but I'd like to add it as
an accepted register to the clobber list. If we get this into GCC-14
then we'll avoid some preprocessor-based twiddling if we ever start
using vxsat in the future.
gcc/ChangeLog:
* config/riscv/riscv.h (REGISTER_NAMES): Add vxsat.
gcc/analyzer/ChangeLog:
PR analyzer/114473
* call-summary.cc
(call_summary_replay::convert_svalue_from_summary): Assert that
the types match.
(call_summary_replay::convert_region_from_summary): Likewise.
(call_summary_replay::convert_region_from_summary_1): Add missing
cast for the deref of RK_SYMBOLIC case.
gcc/testsuite/ChangeLog:
PR analyzer/114473
* gcc.dg/analyzer/call-summaries-pr114473.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
> -/* The offset entry for each variable in a DATSEC should be 0 at compile time. */
> -/* { dg-final { scan-assembler-times "0\[\t \]+\[^\n\]*bts_offset" 7 } } */
> +/* The offset entry for each variable in a DATSEC should contain a label. */
> +/* { dg-final { scan-assembler-times ".4byte\[\t \]\[a-e\]\[\t \]+\[^\n\]*bts_offset" 5 } } */
.4byte is used only on some targets; which assembler directive is used
for 4-byte unaligned data is heavily target dependent.
2024-03-27 Jakub Jelinek <jakub@redhat.com>
* gcc.dg/debug/btf/btf-cvr-quals-1.c: Use dg-additional-options
instead of multiple dg-options.
* gcc.dg/debug/btf/btf-datasec-1.c: Likewise. Accept all supported
unaligned 4 byte assembler directives rather than assuming it must
be .4byte.
As written in the PR, torture/bitint-64.c test fails with -O2 -flto
and the reason is that on _BitInt arches where the padding bits
are undefined, the padding bits in the _Atomic vars are also undefined,
but when __atomic_load or __atomic_exchange on a _BitInt _Atomic variable
with some padding bits is lowered into __atomic_load_{1,2,4,8,16} or
__atomic_exchange_*, the mode precision unsigned result is VIEW_CONVERT_EXPR
converted to _BitInt and because of the VCE nothing actually sign/zero
extends it as needed for later uses - the var is no longer addressable and
expansion assumes such automatic vars are properly extended.
The following patch fixes that by using NOP_EXPR on it (the
VIEW_CONVERT_EXPR after it will then be optimized away during
gimplification; I didn't want to repeat it in the code, as otherwise
result = build1 (VIEW_CONVERT_EXPR, ...); would appear twice).
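To illustrate what "properly extended" means here, a plain-C sketch using
ordinary integers rather than _BitInt; the helpers are hypothetical and not
part of the patch. Bits above the value's precision are padding and may
hold garbage. A bit-for-bit reinterpretation keeps that garbage, whereas a
value conversion recomputes the upper bits from the low PREC bits:

```c
#include <stdint.h>

/* Zero-extend the low PREC bits of RAW (PREC in 1..32), discarding any
   garbage in the padding bits above them.  */
static uint32_t
zero_extend_low_bits (uint32_t raw, unsigned prec)
{
  uint64_t mask = (UINT64_C (1) << prec) - 1;
  return (uint32_t) (raw & mask);
}

/* Sign-extend the low PREC bits of RAW (PREC in 1..32): the padding is
   discarded and bit PREC-1 is treated as the sign bit.  */
static int64_t
sign_extend_low_bits (uint32_t raw, unsigned prec)
{
  int64_t mask = ((int64_t) 1 << prec) - 1;
  int64_t val = (int64_t) raw & mask;
  if (val & ((int64_t) 1 << (prec - 1)))
    val -= (int64_t) 1 << prec;
  return val;
}
```

For a 5-bit value stored in a byte, the raw byte 0xFF and the raw byte 0x1F
both stand for the same value once extended; code that skips the extension
would see two different bytes.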
2024-03-27 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/114469
* c-common.cc (resolve_overloaded_builtin): For _BitInt result
on !extended targets convert result to the _BitInt type before
using VIEW_CONVERT_EXPR.
In some cases combine will "combine" an I2 and I3, but end up putting
exactly the same thing back as I2 as was there before. This is never
progress, so we shouldn't do it; it will lead to oscillating behaviour
and the like.
If we want to canonicalise things, that's fine, but this is not the
way to do it.
2024-03-27 Segher Boessenkool <segher@kernel.crashing.org>
PR rtl-optimization/101523
* combine.cc (try_combine): Don't do a 2-insn combination if
it does not in fact change I2.
We got a question internally about the Spec File syntax, caused by a
misunderstanding of what is literal syntax and what are placeholder
variables in the syntax descriptions.
The following patch attempts to use @var{S} etc. instead of just S
to clarify it stands for any option (or start of option etc.) rather
than literal S, say in %{S:X}. At least in HTML documentation it
then uses italics.
2024-03-27 Jakub Jelinek <jakub@redhat.com>
* doc/invoke.texi (Spec Files): Use @var{S} instead of S,
@var{X} instead of X etc. for other placeholders.
This resolves further failures (-Wreturn-type warnings) and test
failures for where-* tests targeting AVX-512.
Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:
* include/experimental/bits/simd_x86.h (_S_masked_unary):
Cast inputs < 16 bytes to 16 byte vectors before calling the
right subtraction builtin. Before returning, truncate to the
return vector type.
This resolves failures in the "expensive" where-* test of check-simd
when targeting AVX-512.
Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:
* include/experimental/bits/simd_x86.h (_S_masked_unary): Call
the 4- and 8-byte variants of __builtin_ia32_subp[ds] without
rounding direction argument.
libstdc++-v3/ChangeLog:
* include/Makefile.am: Add simd_sve.h.
* include/Makefile.in: Add simd_sve.h.
* include/experimental/bits/simd.h: Add new SveAbi.
* include/experimental/bits/simd_builtin.h: Use
__no_sve_deduce_t to support existing Neon Abi.
* include/experimental/bits/simd_converter.h: Convert
sequentially when sve is available.
* include/experimental/bits/simd_detail.h: Define sve
specific macro.
* include/experimental/bits/simd_math.h: Fallback frexp
to execute sequentially when sve is available, to handle
fixed_size_simd return type that always uses sve.
* include/experimental/simd: Include bits/simd_sve.h.
* testsuite/experimental/simd/tests/bits/main.h: Enable
testing for sve128, sve256, sve512.
* include/experimental/bits/simd_sve.h: New file.
Signed-off-by: Srinivas Yadav Singanaboina <vasu.srinivasvasu.14@gmail.com>
The following makes sure to record the scalars we add to the BB
reduction vectorization result as scalar uses for the purpose of
computing live lanes. This restores vectorization in the
bondfree.c TU of 435.gromacs.
PR tree-optimization/114057
* tree-vect-slp.cc (vect_bb_slp_mark_live_stmts): Mark
BB reduction remain defs as scalar uses.
These tests have been FAILing on i686-linux since July last year,
likely since r14-2628. Since that patch, gcc claims _Float16 and __bf16
support even without -msse2, because some functions could be using
the target attribute.
Later r14-2691 added -msse2 to add_options_for_float16, but didn't do that
for bfloat16, plus ext-floating{3,12}.C tests need the added dg-add-options,
so that float16 and bfloat16 effective targets match the __STDCPP_FLOAT16_T__
or __STDCPP_BFLOAT16_T__ macros.
Fixes
-FAIL: g++.dg/cpp23/ext-floating12.C -std=gnu++23 (test for errors, line 144)
-FAIL: g++.dg/cpp23/ext-floating12.C -std=gnu++23 (test for errors, line 146)
-FAIL: g++.dg/cpp23/ext-floating12.C -std=gnu++23 (test for errors, line 148)
-FAIL: g++.dg/cpp23/ext-floating12.C -std=gnu++23 (test for errors, line 150)
-FAIL: g++.dg/cpp23/ext-floating12.C -std=gnu++23 (test for errors, line 152)
-FAIL: g++.dg/cpp23/ext-floating12.C -std=gnu++23 (test for errors, line 154)
-FAIL: g++.dg/cpp23/ext-floating12.C -std=gnu++26 (test for errors, line 144)
-FAIL: g++.dg/cpp23/ext-floating12.C -std=gnu++26 (test for errors, line 146)
-FAIL: g++.dg/cpp23/ext-floating12.C -std=gnu++26 (test for errors, line 148)
-FAIL: g++.dg/cpp23/ext-floating12.C -std=gnu++26 (test for errors, line 150)
-FAIL: g++.dg/cpp23/ext-floating12.C -std=gnu++26 (test for errors, line 152)
-FAIL: g++.dg/cpp23/ext-floating12.C -std=gnu++26 (test for errors, line 154)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++23 (test for errors, line 107)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++23 (test for errors, line 114)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++23 (test for errors, line 126)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++23 (test for errors, line 79)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++23 (test for errors, line 86)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++23 (test for errors, line 98)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++23 (test for warnings, line 22)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++23 (test for warnings, line 23)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++23 (test for warnings, line 24)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++23 (test for warnings, line 25)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++26 (test for errors, line 107)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++26 (test for errors, line 114)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++26 (test for errors, line 126)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++26 (test for errors, line 79)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++26 (test for errors, line 86)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++26 (test for errors, line 98)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++26 (test for warnings, line 22)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++26 (test for warnings, line 23)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++26 (test for warnings, line 24)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++26 (test for warnings, line 25)
on the latter and changes nothing on the former.
2024-03-27 Jakub Jelinek <jakub@redhat.com>
* lib/target-supports.exp (add_options_for_bfloat16): Add -msse2 on
i?86/x86_64.
* g++.dg/cpp23/ext-floating3.C: Add dg-add-options float16.
* g++.dg/cpp23/ext-floating12.C: Add dg-add-options float16 and
bfloat16.
Due to the Linux kernel exposing the lrcpc3 architectural feature as
"lrcpc3", this patch corrects the relevant FEATURE_STRING entry in the
"rcpc3" AARCH64_OPT_FMV_EXTENSION macro, such that the feature can be
correctly detected when doing native compilation on rcpc3-enabled
targets.
gcc/ChangeLog:
* config/aarch64/aarch64-option-extensions.def (rcpc3):
Fix FEATURE_STRING field to "lrcpc3".
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/cpunative/info_24: New.
* gcc.target/aarch64/cpunative/native_cpu_24.c: Likewise.
Given how, at present, the choice of using LSE128 atomic instructions
by the toolchain is delegated to run-time selection in the form of
Libatomic ifuncs, responsible for querying target support, the
`+lse128' target architecture compile-time flag is absent from GCC.
This, however, contrasts with the Binutils implementation, which gates
LSE128 instructions behind the `+lse128' flag. This can lead to
problems in GCC for certain use-cases. One such example is in the use
of inline assembly, whereby the inability to enable the feature on
the command line prevents the compiler from automatically issuing the
necessary LSE128 `.arch' directive.
This patch therefore brings GCC into alignment with LLVM and Binutils
in adding support for the `+lse128' architectural extension flag.
gcc/ChangeLog:
* config/aarch64/aarch64-option-extensions.def: Add LSE128
AARCH64_OPT_EXTENSION, adding it as a dependency for the D128
feature.
* doc/invoke.texi (AArch64 Options): Document +lse128.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/lse128-flag.c: New.
* gcc.target/aarch64/cpunative/info_23: Likewise.
* gcc.target/aarch64/cpunative/native_cpu_23.c: Likewise.
For targets where LOGICAL_OP_NON_SHORT_CIRCUIT evaluates to false, two
conditional jumps are emitted instead of a combined conditional which
this test is all about. Thus, set it to true.
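As a rough source-level illustration (an assumption for exposition; the
function names are made up and this is not the code of the test), the two
shapes correspond to:

```c
/* Short-circuit shape: B is only examined when A is nonzero, which maps
   naturally onto two conditional jumps.  */
static int
both_short_circuit (int a, int b)
{
  if (a)
    if (b)
      return 1;
  return 0;
}

/* Combined shape: both operands are evaluated unconditionally and merged
   into a single condition.  */
static int
both_combined (int a, int b)
{
  return (a != 0) & (b != 0);
}
```

Both functions return the same result for every input; they differ only in
how many conditional branches the condition suggests.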
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/copy-headers-8.c: Set
LOGICAL_OP_NON_SHORT_CIRCUIT to true.