On some configurations, PowerPC emits -Wpsabi warnings when using IEEE
long doubles on a machine configured with IBM long double by default.
This patch suppresses these warnings for this testcase.
PR testsuite/114320
gcc/testsuite/ChangeLog:
* g++.dg/modules/target-powerpc-1_a.C: Suppress -Wpsabi.
* g++.dg/modules/target-powerpc-1_b.C: Likewise.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
xor insn can handle some more values without the requirement of a
scratch register. This patch adds a new constraint alternative for
such values. The output function avr_out_bitop already handles
these cases, so no change is needed there.
gcc/
* config/avr/constraints.md (CX2, CX3, CX4): New constraints.
* config/avr/avr-protos.h (avr_xor_noclobber_dconst): New proto.
* config/avr/avr.cc (avr_xor_noclobber_dconst): New function.
* config/avr/avr.md (xorhi3, *xorhi3): Add "d,0,CX2,X" alternative.
(xorpsi3, *xorpsi3): Add "d,0,CX3,X" alternative.
(xorsi3, *xorsi3): Add "d,0,CX4,X" alternative.
As the tests assume that strndup() is visible (only part of
POSIX.1-2008) define the guard to ensure that it's visible. Currently,
glibc appears to always have this defined in C++, newlib does not.
Without this patch, fails like this can be seen:
Testing analyzer/strndup-1.c, -std=c++98
.../strndup-1.c: In function 'void test_1(const char*)':
.../strndup-1.c:11:13: error: 'strndup' was not declared in this scope; did you mean 'strncmp'?
.../strndup-1.c: In function 'void test_2(const char*)':
.../strndup-1.c:16:13: error: 'strndup' was not declared in this scope; did you mean 'strncmp'?
.../strndup-1.c: In function 'void test_3(const char*)':
.../strndup-1.c:21:13: error: 'strndup' was not declared in this scope; did you mean 'strncmp'?
Patch has been verified on Linux.
gcc/testsuite/ChangeLog:
* c-c++-common/analyzer/strndup-1.c: Define _POSIX_C_SOURCE.
Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
It will be used by copysignm3/xorsignm3/lroundmn2 expanders.
gcc/ChangeLog:
PR target/114334
* config/i386/i386.md (mode): Add new number V8BF,V16BF,V32BF.
(MODEF248): New mode iterator.
(ssevecmodesuffix): Hanlde BF and HF.
* config/i386/sse.md (andnot<mode>3): Extend to HF/BF.
(<code><mode>3): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr114334.c: New test.
In looking at PR 112415, it became clear that improvements could be
made in the handling of loads and stores using REG+D addresses. A
change in 2002 conflated two issues:
1) We can't generate insns with 14-bit displacements before reload
completes when generating PA 1.x code since floating-point loads and
stores only support 5-bit offsets in PA 1.x.
2) The GNU ELF 32-bit linker lacks relocation support for PA 2.0
floating point instructions with 14-bit displacements. These
relocations affect instructions with symbolic references.
The result of the change was to block creation of PA 2.0 instructions
with 14-bit REG_D displacements for SImode, DImode, SFmode and DFmode
on the GNU linux target before reload. This was unnecessary as these
instructions don't need relocation.
This change revises the INT14_OK_STRICT define to allow creation
of instructions with 14-bit REG+D addresses before reload when
generating PA 2.0 code.
2024-03-17 John David Anglin <danglin@gcc.gnu.org>
gcc/ChangeLog:
PR rtl-optimization/112415
* config/pa/pa.cc (pa_emit_move_sequence): Revise condition
for symbolic memory operands.
(pa_legitimate_address_p): Revise LO_SUM condition.
* config/pa/pa.h (INT14_OK_STRICT): Revise define. Move
comment about GNU linker to predicates.md.
* config/pa/predicates.md (floating_point_store_memory_operand):
Revise condition for symbolic memory operands. Update
comment.
Consider range of value-initialized iterators as valid and empty.
libstdc++-v3/ChangeLog:
PR libstdc++/114316
* include/debug/safe_iterator.tcc (_Safe_iterator<>::_M_valid_range):
First check if both iterators are value-initialized before checking if
singular.
* testsuite/23_containers/set/debug/114316.cc: New test case.
* testsuite/23_containers/vector/debug/114316.cc: New test case.
This patch corrects the virtual token creation for the aggregate constant
and also corrects tokens for constructor components.
gcc/m2/ChangeLog:
PR modula2/114296
* gm2-compiler/M2ALU.mod (ElementsSolved): Add tokenno parameter.
Add constant checks and generate error messages.
(EvalSetValues): Pass tokenno parameter to ElementsSolved.
* gm2-compiler/M2LexBuf.mod (stop): New procedure.
(MakeVirtualTok): Call stop if caret = BadTokenNo.
* gm2-compiler/M2Quads.def (BuildNulExpression): Add tokpos
parameter.
(BuildSetStart): Ditto.
(BuildEmptySet): Ditto.
(BuildConstructorEnd): Add startpos parameter.
(BuildTypeForConstructor): Add tokpos parameter.
* gm2-compiler/M2Quads.mod (BuildNulExpression): Add tokpos
parameter and push tokpos to the quad stack.
(BuildSetStart): Add tokpos parameter and push tokpos.
(BuildSetEnd): Rewrite.
(BuildEmptySet): Add tokpos parameter and push tokpos with
the set type.
(BuildConstructorStart): Pop typepos.
(BuildConstructorEnd): Add startpos parameter.
Create valtok from startpos and cbratokpos.
(BuildTypeForConstructor): Add tokpos parameter.
* gm2-compiler/M2Range.def (InitAssignmentRangeCheck): Rename
d to des and e to expr.
Add destok and exprtok parameters.
* gm2-compiler/M2Range.mod (InitAssignmentRangeCheck): Rename
d to des and e to expr.
Add destok and exprtok parameters.
Save destok and exprtok into range record.
(FoldAssignment): Pass exprtok to TryDeclareConstant.
* gm2-compiler/P3Build.bnf (ComponentValue): Rewrite.
(Constructor): Rewrite.
(ConstSetOrQualidentOrFunction): Rewrite.
(SetOrQualidentOrFunction): Rewrite.
* gm2-compiler/PCBuild.bnf (ConstSetOrQualidentOrFunction): Rewrite.
(SetOrQualidentOrFunction): Rewrite.
* gm2-compiler/PHBuild.bnf (Constructor): Rewrite.
(ConstSetOrQualidentOrFunction): Rewrite.
gcc/testsuite/ChangeLog:
PR modula2/114296
* gm2/pim/fail/badtype2.mod: New test.
Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
The c23-stdarg-6.c testcase I've added recently apparently works fine with
-O0 but aborts with -O1 and higher on x86_64-linux.
The problem is in setup of incoming varargs.
Like function.cc before r14-9249 even ix86_setup_incoming_varargs assumes
that TYPE_NO_NAMED_ARGS_STDARG_P don't have any named arguments and there
is nothing to advance, but that is not the case for (...) functions
returning by hidden reference which have one such artificial argument.
If the setup_incoming_varargs hook is called from the
if (TYPE_NO_NAMED_ARGS_STDARG_P (TREE_TYPE (fndecl))
&& fnargs.is_empty ())
{
struct assign_parm_data_one data = {};
assign_parms_setup_varargs (&all, &data, false);
}
spot, i.e. where there is no hidden return argument passed, arg.type
is always NULL, while when it is called in the
if (cfun->stdarg && !DECL_CHAIN (parm))
assign_parms_setup_varargs (&all, &data, false);
spot, even when it is TYPE_NO_NAMED_ARGS_STDARG_P arg.type will be non-NULL.
The tree-stdarg.cc pass in f in c23-stdarg-6.cc at -O1 or higher determines
that va_arg is used on integral types at most twice (loads 2 words),
and because ix86_setup_incoming_varargs doesn't advance, the code saves
just the %rdi and %rsi registers to the save area. But that isn't correct,
it should save %rsi and %rdx because %rdi is the hidden return argument.
With -O0 tree-stdarg.cc doesn't attempt to optimize and we save all the
registers, so it works fine in that case.
Now, I think we'll need the same fix also on
aarch64, alpha, arc, csky, ia64, loongarch, mips, mmix, nios2, riscv, visium
which have pretty much the similarly looking snippet in their hooks
changed by the r13-3549 commit.
Then arm, epiphany, fr30, frv, ft32, m32r, mcore, nds32, rs6000, sh
have different changes but most likely need something similar too.
I don't have access to most of those, could test aarch64 and rs6000 I guess.
2024-03-16 Jakub Jelinek <jakub@redhat.com>
PR target/114175
* config/i386/i386.cc (ix86_setup_incoming_varargs): Only skip
ix86_function_arg_advance for TYPE_NO_NAMED_ARGS_STDARG_P functions
if arg.type is NULL.
* gcc.dg/c23-stdarg-7.c: New test.
* gcc.dg/c23-stdarg-8.c: New test.
The verifier requires BIT_FIELD_REFs with INTEGRAL_TYPE_P first operand
to have mode precision. In most cases for the large/huge _BitInt bitfield
stores the code uses bitfield representatives, which are typically arrays
of chars, but if the bitfield starts at byte boundary on big endian,
the code uses as nlhs in lower_mergeable_store COMPONENT_REF of the
bitfield FIELD_DECL instead, which is fine for the limb accesses,
but when used for the most significant limb can result in invalid
BIT_FIELD_REF because the first operand then has BITINT_TYPE and
usually VOIDmode.
The following patch adds a helper method for the 4 creatikons of
BIT_FIELD_REF which when needed adds a VIEW_CONVERT_EXPR.
2024-03-16 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/114329
* gimple-lower-bitint.cc (struct bitint_large_huge): Declare
build_bit_field_ref method.
(bitint_large_huge::build_bit_field_ref): New method.
(bitint_large_huge::lower_mergeable_stmt): Use it.
* gcc.dg/bitint-101.c: New test.
Block-scope declarations of functions or extern values are not allowed
when attached to a named module. Similarly, class member functions are
not inline if attached to a named module. However, in both these cases
we currently only check if the declaration is within the module purview;
it is possible for such a declaration to occur within the module purview
but not be attached to a named module (e.g. in an 'extern "C++"' block).
This patch makes the required adjustments.
PR c++/112631
gcc/cp/ChangeLog:
* cp-tree.h (named_module_attach_p): New function.
* decl.cc (start_decl): Check for attachment not purview.
(grokmethod): Likewise.
gcc/testsuite/ChangeLog:
* g++.dg/modules/block-decl-1_a.C: New test.
* g++.dg/modules/block-decl-1_b.C: New test.
* g++.dg/modules/block-decl-2.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Use INCLUDE_VECTOR before including system.h, instead of directly
including <vector>, to avoid running into poisoned identifiers.
Signed-off-by: Dimitry Andric <dimitry@andric.com>
libcc1/ChangeLog:
PR middle-end/111632
* libcc1plugin.cc: Fix include.
* libcp1plugin.cc: Fix include.
While for __mulbitint3 we actually don't negate anything and perform the
multiplication in unsigned style always, for __divmodbitint4 if the operands
aren't unsigned and are negative, we negate them first and then try to
negate them as needed at the end.
quotient is negated if just one of the operands was negated and the other
wasn't or vice versa, and remainder is negated if the first operand was
negated.
The case which doesn't work correctly is if due to limited range of the
operands we perform the division/modulo in some smaller number of limbs
and then extend it to the desired precision of the quotient and/or
remainder results. If they aren't negated, the extension is done with
memset to 0, if they are negated, the extension was done with memset
to -1. The problem is that if the quotient or remainder is zero,
then bitint_negate negates it again to zero (that is ok), but we should
then extend with memset to 0, not memset to -1.
The following patch achieves that by letting bitint_negate also check if
the negated operand is zero and changes the memset argument based on that.
2024-03-15 Jakub Jelinek <jakub@redhat.com>
PR libgcc/114327
* libgcc2.c (bitint_negate): Return UWtype bitwise or of all the limbs
before negation rather than void.
(__divmodbitint4): Determine whether to fill in the upper limbs after
negation based on whether bitint_negate returned 0 or non-zero, rather
then always filling with -1.
* gcc.dg/torture/bitint-63.c: New test.
As mentioned in the PR, the new testcase FAILs on sparc*-* due to
lack of support of misaligned store.
This patch restricts that to vect_hw_misalign targets.
2024-03-15 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/113431
* gcc.dg/vect/pr113431.c: Restrict scan-tree-dump-times to
vect_hw_misalign targets.
When backporting r14-9315 to 13 branch, I've noticed I've missed
one letter in a comment. And grepping for similar issues I found
one word with too many.
2024-03-15 Jakub Jelinek <jakub@redhat.com>
* lower-subreg.cc (resolve_simple_move): Fix comment typo,
betwee -> between.
* edit-context.cc (class line_event): Fix comment typo,
betweeen -> between.
In r13-3803-gfa271afb58 I've added an optimization for LE/LEU/GE/GEU
comparison against CONST_VECTOR. As the comments say:
/* x <= cst can be handled as x < cst + 1 unless there is
wrap around in cst + 1. */
...
/* For LE punt if some element is signed maximum. */
...
/* For LEU punt if some element is unsigned maximum. */
and
/* x >= cst can be handled as x > cst - 1 unless there is
wrap around in cst - 1. */
...
/* For GE punt if some element is signed minimum. */
...
/* For GEU punt if some element is zero. */
Apparently I wrote the GE/GEU (second case) first and then
copied/adjusted it for LE/LEU, most of the adjustments look correct, but
I've left if (code == GE) comparison when testing if it should punt for
signed maximum. That condition is never true, because this is in
switch (code) { ... case LE: case LEU: block and we really meant to
be what the comment says, for LE punt if some element is signed maximum,
as then cst + 1 wraps around.
The following patch fixes the pasto.
2024-03-15 Jakub Jelinek <jakub@redhat.com>
PR target/114339
* config/i386/i386-expand.cc (ix86_expand_int_sse_cmp) <case LE>: Fix
a pasto, compare code against LE rather than GE.
* gcc.target/i386/pr114339.c: New test.
This optimisation does not honour signed zeros, so should not be
enabled except with -fno-signed-zeros.
gcc/ChangeLog:
* match.pd: Fix truncation pattern for -fno-signed-zeroes
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/no_merge_trunc_signed_zero.c: New test.
The x86-64 and aarch64 psABIs (and the unwritten ia64 psABI part) say that
the padding bits of _BitInt are undefined, while the expansion internally
typically assumes that non-mode precision integers are sign/zero extended
and extends after operations. We handle that mismatch with EXTEND_BITINT
done when reading from untrusted sources like function arguments, reading
_BitInt from memory etc. but otherwise keep relying on stuff being extended
internally (say in pseudos).
The return value of a function is an ABI boundary though too and we need
to extend that too.
2024-03-15 Jakub Jelinek <jakub@redhat.com>
PR middle-end/114332
* expr.cc (expand_expr_real_1): EXTEND_BITINT also CALL_EXPR results.
r14-9478 added [[nodiscard]] to various <algorithm> APIs including find_if
the pr104601.C testcase uses. As it is an optimization bug fix testcase,
haven't tried to adjust the testcase to use the find_if result, but instead
have added -Wno-unused-result flag to quiet the warning.
2024-03-15 Jakub Jelinek <jakub@redhat.com>
* g++.dg/torture/pr104601.C: Add -Wno-unused-result to dg-options.
This patch (on top of the just posted gsi_safe_insert* fixes patch)
fixes the instrumentation of large/huge _BitInt SSA_NAME arguments of
returns_twice calls.
In this case it isn't just a matter of using gsi_safe_insert_before instead
of gsi_insert_before, we need to do more.
One thing is that unlike the asan/ubsan instrumentation which does just some
checking, here we want the statement before the call to load into a SSA_NAME
which is passed to the call. With another edge we need to add a PHI,
with one PHI argument the loaded SSA_NAME, another argument an uninitialized
warning free SSA_NAME and a result and arrange for all 3 SSA_NAMEs to be
preserved (i.e. stay as is, be no longer lowered afterwards).
Unfortunately, edge_before_returns_twice_call can create new SSA_NAMEs using
copy_ssa_name and while we can have a reasonable partition for them (same
partition as PHI result correspoding to the PHI argument newly added), adding
SSA_NAMEs into a partition after the partitions are finalized is too ugly.
So, this patch takes a different approach suggested by Richi, just emit
the argument loads before the returns_twice call normally (i.e. temporarily
create invalid IL) and just remember that we did that, and when the bitint
lowering is otherwise done fix this up, gsi_remove those statements,
gsi_safe_insert_before and and create the needed new PHIs.
2024-03-15 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/113466
* gimple-lower-bitint.cc (bitint_large_huge): Add m_returns_twice_calls
member.
(bitint_large_huge::bitint_large_huge): Initialize it.
(bitint_large_huge::~bitint_large_huge): Release it.
(bitint_large_huge::lower_call): Remember ECF_RETURNS_TWICE call stmts
before which at least one statement has been inserted.
(gimple_lower_bitint): Move argument loads before ECF_RETURNS_TWICE
calls to a different block and add corresponding PHIs.
* gcc.dg/bitint-100.c: New test.
2024-03-15 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/87477
PR fortran/89645
PR fortran/99065
PR fortran/114141
PR fortran/114280
* class.cc (gfc_change_class): New function needed for
associate names, when rank changes or a derived type is
produced by resolution
* dump-parse-tree.cc (show_code_node): Make output for SELECT
TYPE more comprehensible.
* expr.cc (find_inquiry_ref): Do not simplify expressions of
an inferred type.
* gfortran.h : Add 'gfc_association_list' to structure
'gfc_association_list'. Add prototypes for
'gfc_find_derived_types', 'gfc_fixup_inferred_type_refs' and
'gfc_change_class'. Add macro IS_INFERRED_TYPE.
* match.cc (copy_ts_from_selector_to_associate): Add bolean arg
'select_type' with default false. If this is a select type name
and the selector is a inferred type, build the class type and
apply it to the associate name.
(build_associate_name): Pass true to 'select_type' in call to
previous.
* parse.cc (parse_associate): If the selector is inferred type
the associate name is too. Make sure that function selector
class and rank, if known, are passed to the associate name. If
a function result exists, pass its typespec to the associate
name.
* primary.cc (resolvable_fcns): New function to check that all
the function references are resolvable.
(gfc_match_varspec): If a scalar derived type select type
temporary has an array reference, match the array reference,
treating this in the same way as an equivalence member. Do not
set 'inquiry' if applied to an unknown type the inquiry name
is ambiguous with the component of an accessible derived type.
Check that resolution of the target expression is OK by testing
if the symbol is declared or is an operator expression, then
using 'resolvable_fcns' recursively. If all is well, resolve
the expression. If this is an inferred type with a component
reference, call 'gfc_find_derived_types' to find a suitable
derived type. If there is an inquiry ref and the symbol either
is of unknown type or is inferred to be a derived type, set the
primary and symbol TKR appropriately.
* resolve.cc (resolve_variable): Call new function below.
(gfc_fixup_inferred_type_refs): New function to ensure that the
expression references for a inferred type are consistent with
the now fixed up selector.
(resolve_assoc_var): Ensure that derived type or class function
selectors transmit the correct arrayspec to the associate name.
(resolve_select_type): If the selector is an associate name of
inferred type and has no component references, the associate
name should have its typespec. Simplify the conversion of a
class array to class scalar by calling 'gfc_change_class'.
Make sure that a class, inferred type selector with an array
ref transfers the typespec from the symbol to the expression.
* symbol.cc (gfc_set_default_type): If an associate name with
unknown type has a selector expression, try resolving the expr.
(find_derived_types, gfc_find_derived_types): New functions
that search for a derived type with a given name.
* trans-expr.cc (gfc_conv_variable): Some inferred type exprs
escape resolution so call 'gfc_fixup_inferred_type_refs'.
* trans-stmt.cc (trans_associate_var): Tidy up expression for
'class_target'. Finalize and free class function results.
Correctly handle selectors that are class functions and class
array references, passed as derived types.
gcc/testsuite/
PR fortran/87477
PR fortran/89645
PR fortran/99065
* gfortran.dg/associate_64.f90 : New test
* gfortran.dg/associate_66.f90 : New test
* gfortran.dg/associate_67.f90 : New test
PR fortran/114141
* gfortran.dg/associate_65.f90 : New test
PR fortran/114280
* gfortran.dg/associate_68.f90 : New test
We support options -m(no-)unaligned-access 2 years ago, while
currently most of other ports prefer -m(no-)strict-align.
Let's support -m(no-)strict-align, and keep -m(no-)unaligned-access
as alias.
gcc
* config/mips/mips.opt: Support -mstrict-align, and use
TARGET_STRICT_ALIGN as the flag; keep -m(no-)unaligned-access
as alias.
* config/mips/mips.h: Use TARGET_STRICT_ALIGN.
* config/mips/mips.opt.urls: Regenerate.
* doc/invoke.texi: Document -m(no-)strict-algin for MIPSr6.
This patch fixes a bug where vect_recog_abd_pattern called vect_convert_output
with the incorrect vecitype for the corresponding pattern_stmt.
vect_convert_output expects vecitype to be the vector form of the scalar type
of the LHS of pattern_stmt, but we were passing in the vector form of the LHS
of the new impending conversion statement. This caused a skew in ABD's
pattern_stmt having the vectype of the following gimple pattern_stmt.
2024-03-06 Tejas Belagod <tejas.belagod@arm.com>
gcc/ChangeLog:
PR middle-end/114108
* tree-vect-patterns.cc (vect_recog_abd_pattern): Call
vect_convert_output with the correct vecitype.
gcc/testsuite/ChangeLog:
* gcc.dg/vect/pr114108.c: New test.
The behavior of non-zero unused bits in xvpermi.q instruction's
third operand is undefined on LoongArch, according to our
discussion (https://github.com/llvm/llvm-project/pull/83540),
we think that keeping original insn operand as unmodified
state is better solution.
This patch partially reverts 7b158e036a.
gcc/ChangeLog:
* config/loongarch/lasx.md (lasx_xvpermi_q_<LASX:mode>):
Remove masking of operand 3.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/vector/lasx/lasx-xvpermi_q.c:
Reposition operand 3's value into instruction's defined accept range.
It came up on the mailing list that OBJECT_BEGIN/END are described as
marking object lifetime, but mark the beginning of the constructor and end
of the destructor, whereas the C++ notion of lifetime is between the end of
the constructor and beginning of the destructor. So let's fix the comments.
gcc/ChangeLog:
* tree-core.h (enum clobber_kind): Clarify CLOBBER_OBJECT_*
comments.
This patch fixes an ICE when encountering an expression:
1 + HIGH (a[0]). The fix was to assign a type to the constant
created by BuildConstHighFromSym in M2Quads.mod.
gcc/m2/ChangeLog:
PR modula2/114294
* gm2-compiler/M2Quads.mod (BuildConstHighFromSym):
Call PutConst to assign the type Cardinal in the result
constant.
gcc/testsuite/ChangeLog:
PR modula2/114294
* gm2/pim/pass/log: Removed.
* gm2/pim/pass/highexp.mod: New test.
Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
When generating PA 1.x code or code for GNU ld, floating-point
accesses only support 5-bit displacements but integer accesses
support 14-bit displacements. I mistakenly assumed reload
could fix an invalid 14-bit displacement in a floating-point
access but this is not the case.
2024-03-14 John David Anglin <danglin@gcc.gnu.org>
gcc/ChangeLog:
PR target/114288
* config/pa/pa.cc (pa_legitimate_address_p): Don't allow
14-bit displacements before reload for modes that may use
a floating-point load or store.
Change the BPF backend to define INT8_TYPE with an explicit sign, rather
than a plain char. This is in line with other targets and removes the
risk of int8_t being affected by the signedness of the plain char type
of the host system.
The motivation for this change is that even if `char' is defined to be
signed in BPF targets, some BPF programs use the (mal)practice of
including internal libc headers, either directly or indirectly via
kernel headers, which in turn may trigger compilation errors regarding
redefinitions of types.
gcc/
* config/bpf/bpf.h (INT8_TYPE): Change to signed char.
After switching to LRA xtensa backend generates the following code for
saving/loading registers:
movi a9, 0x190
add a9, a9, sp
s32i.n a3, a9, 0
instead of the shorter and more efficient
s32i a3, a9, 0x190
E.g. the following code can be used to reproduce it:
int f1(int a, int b, int c, int d, int e, int f, int *p);
int f2(int a, int b, int c, int d, int e, int f, int *p);
int f3(int a, int b, int c, int d, int e, int f, int *p);
int foo(int a, int b, int c, int d, int e, int f)
{
int g[100];
return
f1(a, b, c, d, e, f, g) +
f2(a, b, c, d, e, f, g) +
f3(a, b, c, d, e, f, g);
}
This happens in the LRA pass because s32i.n and l32i.n are listed before
the s32i and l32i in the movsi_internal pattern and alternative
consideration loop stops early.
gcc/
* config/xtensa/xtensa.md (movsi_internal): Move l32i and s32i
patterns ahead of the l32i.n and s32i.n.
The fast path for "{}" format strings has a bug for negative integers
where the length passed to std::to_chars is too long.
libstdc++-v3/ChangeLog:
PR libstdc++/114325
* include/std/format (_Scanner::_M_scan): Pass correct length to
__to_chars_10_impl.
* testsuite/std/format/functions/format.cc: Check negative
integers with empty format-spec.
Add the [[nodiscard]] attribute to several functions in <algorithm>.
These all have no side effects and are only called for their return
value (e.g. std::count) or produce a result that must not be discarded
for correctness (e.g. std::remove).
I was intending to add the attribute to a number of other functions like
std::copy_if, std::unique_copy, std::set_union, and std::set_difference.
I stopped when I noticed that MSVC doesn't use it on those functions,
which I suspect is because they're often used with an insert iterator
(e.g. std::back_insert_iterator). In that case it doesn't matter if
you discard the result, because you have the container to tell you how
many elements were copied to the output range.
libstdc++-v3/ChangeLog:
* include/bits/stl_algo.h (find_end, all_of, none_of, any_of)
(find_if_not, is_partitioned, partition_point, remove)
(remove_if, unique, lower_bound, upper_bound, equal_range)
(binary_search, includes, is_sorted, is_sorted_until, minmax)
(minmax_element, is_permutation, clamp, find_if, find_first_of)
(adjacent_find, count, count_if, search, search_n, min_element)
(max_element): Add nodiscard attribute.
* include/bits/stl_algobase.h (min, max, lower_bound, equal)
(lexicographical_compare, lexicographical_compare_three_way)
(mismatch): Likewise.
* include/bits/stl_heap.h (is_heap, is_heap_until): Likewise.
* testsuite/25_algorithms/equal/debug/1_neg.cc: Add dg-warning.
* testsuite/25_algorithms/equal/debug/2_neg.cc: Likewise.
* testsuite/25_algorithms/equal/debug/3_neg.cc: Likewise.
* testsuite/25_algorithms/find_first_of/concept_check_1.cc:
Likewise.
* testsuite/25_algorithms/is_permutation/2.cc: Likewise.
* testsuite/25_algorithms/lexicographical_compare/71545.cc:
Likewise.
* testsuite/25_algorithms/lower_bound/33613.cc: Likewise.
* testsuite/25_algorithms/lower_bound/debug/irreflexive.cc:
Likewise.
* testsuite/25_algorithms/lower_bound/debug/partitioned_neg.cc:
Likewise.
* testsuite/25_algorithms/lower_bound/debug/partitioned_pred_neg.cc:
Likewise.
* testsuite/25_algorithms/minmax/3.cc: Likewise.
* testsuite/25_algorithms/search/78346.cc: Likewise.
* testsuite/25_algorithms/search_n/58358.cc: Likewise.
* testsuite/25_algorithms/unique/1.cc: Likewise.
* testsuite/25_algorithms/unique/11480.cc: Likewise.
* testsuite/25_algorithms/upper_bound/33613.cc: Likewise.
* testsuite/25_algorithms/upper_bound/debug/partitioned_neg.cc:
Likewise.
* testsuite/25_algorithms/upper_bound/debug/partitioned_pred_neg.cc:
Likewise.
* testsuite/ext/concept_checks.cc: Likewise.
* testsuite/ext/is_heap/47709.cc: Likewise.
* testsuite/ext/is_sorted/cxx0x.cc: Likewise.
AFAIK we have no code in LTO streaming to stream out or in
SSA_NAME_{RANGE,PTR}_INFO, so LTO effectively throws it all away
and let vrp1 and alias analysis after IPA recompute that. There is
just one spot, for IPA VRP and IPA bit CCP we save/restore ranges
and set SSA_NAME_{PTR,RANGE}_INFO e.g. on parameters depending on what
we saved and propagated, but that is after streaming in bodies for the
post IPA optimizations.
Now, without LTO SSA_NAME_{RANGE,PTR}_INFO is already computed from
earlier in many cases (er.g. evrp and early alias analysis but other spots
too), but IPA ICF is ignoring the ranges and points-to details when
comparing the bodies. I think ignoring that is just fine, that is
effectively what we do for LTO where we throw that information away
before the analysis, and not ignoring it could lead to fewer ICF merging
possibilities.
So, the following patch instead verifies that for LTO SSA_NAME_{PTR,RANGE}_INFO
just isn't there on SSA_NAMEs in functions into which other functions have
been ICFed, and for non-LTO throws that information away (which matches the
LTO behavior).
Another possibility would be to remember the SSA_NAME <-> SSA_NAME mapping
vector (just one of the 2) on successful sem_function::equals on the
sem_function which is not the chosen leader (e.g. how SSA_NAMEs in the
leader map to SSA_NAMEs in the other function) and use that vector
to union the ranges in sem_function::merge. I can implement that for
comparison, but wanted to post this first if there is an agreement on
doing that or if Honza thinks we should take SSA_NAME_{RANGE,PTR}_INFO
into account. I think we can compare SSA_NAME_RANGE_INFO, but have
no idea how to try to compare points to info. And I think it will result
in less effective ICF for non-LTO vs. LTO unnecessarily.
2024-03-12 Jakub Jelinek <jakub@redhat.com>
PR middle-end/113907
* ipa-icf.cc (sem_item_optimizer::merge_classes): Reset
SSA_NAME_RANGE_INFO and SSA_NAME_PTR_INFO on successfully ICF merged
functions.
* gcc.dg/pr113907-1.c: New test.
This patch applies the new stricter type checking procedure function to
the remaining 6 comparisons: less, greater, lessequ, greequ, ifin and
ifnotin.
gcc/m2/ChangeLog:
PR modula2/114333
* gm2-compiler/M2GenGCC.mod (CodeStatement): Remove op1, op2 and
op3 parameters to CodeIfLess, CodeIfLessEqu, CodeIfGreEqu, CodeIfGre,
CodeIfIn, CodeIfNotIn.
(CodeIfLess): Rewrite.
(PerformCodeIfLess): New procedure.
(CodeIfLess): Rewrite.
(PerformCodeIfLess): New procedure.
(CodeIfLessEqu): Rewrite.
(PerformCodeIfLessEqu): New procedure.
(CodeIfGreEqu): Rewrite.
(PerformCodeIfGreEqu): New procedure.
(CodeIfGre): Rewrite.
(PerformCodeIfGre): New procedure.
(CodeIfIn): Rewrite.
(PerformCodeIfIn): New procedure.
(CodeIfNotIn): Rewrite.
(PerformCodeIfNotIn): New procedure.
gcc/testsuite/ChangeLog:
PR modula2/114333
* gm2/pim/fail/badset5.mod: New test.
* gm2/pim/fail/badset6.mod: New test.
Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
If this insn is really used, we'll have something like
slti $r4,$r0,$r5
in the code. The assembler will reject it because slti wants 2
register operands and 1 immediate operand. But we've not got any bug
report for this, indicating this define_insn is unused at all.
Note that do_store_flag (in expr.cc) is already converting x >= 1 to
x > 0 unconditionally, so this define_insn is indeed unused and we can
just remove it.
gcc/ChangeLog:
* config/loongarch/loongarch.md (any_ge): Remove.
(sge<u>_<X:mode><GPR:mode>): Remove.
For 80-bit long double we need to clear the padding bits on
construction.
libstdc++-v3/ChangeLog:
* include/bits/atomic_base.h (__atomic_float::__atomic_float(Fp)):
Clear padding.
* testsuite/29_atomics/atomic_float/compare_exchange_padding.cc:
New test.
Signed-off-by: xndcn <xndchn@gmail.com>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Our dear friend '-Wuninitialized' reported:
[...]/libgomp.oacc-fortran/acc-memcpy.f90:18:27:
18 | char(j) = int (j, int8)
| ^
Warning: ‘j’ may be used uninitialized [-Wmaybe-uninitialized]
[...]/libgomp.oacc-fortran/acc-memcpy.f90:14:20:
14 | integer(int8) :: j
| ^
note: ‘j’ was declared here
..., but actually there were other issues.
libgomp/
* testsuite/libgomp.oacc-fortran/acc-memcpy.f90: Fix 'char'
initialization, copy, check.
The following testcase ICEs with LSE atomics.
The problem is that the @atomic_compare_and_swap<mode> expander uses
aarch64_reg_or_zero predicate for the desired operand, which is fine,
given that for most of the modes and even for TImode in some cases
it can handle zero immediate just fine, but the TImode
@aarch64_compare_and_swap<mode>_lse just uses register_operand for
that operand instead, again intentionally so, because the casp,
caspa, caspl and caspal instructions need to use a pair of consecutive
registers for the operand and xzr is just one register and we can't
just store zero into the link register to emulate pair of zeros.
So, the following patch fixes that by forcing the newval operand into
a register for the TImode LSE case.
2024-03-14 Jakub Jelinek <jakub@redhat.com>
PR target/114310
* config/aarch64/aarch64.cc (aarch64_expand_compare_and_swap): For
TImode force newval into a register.
* gcc.dg/pr114310.c: New test.
Transactional and non-transactional stores to the same cache line cause
transactions to abort on newer generations. Add sufficient padding to make
sure another cache line is used.
Tested on s390.
gcc/testsuite/ChangeLog:
* gcc.target/s390/htm-builtins-1.c: Fix.
* gcc.target/s390/htm-builtins-2.c: Fix.
Signed-off-by: Juergen Christ <jchrist@linux.ibm.com>