Clear __eh_globals_init's _S_init in the dtor before deleting the
gthread key.
This ensures that, in case any code involved in deleting the key
interacts with eh_globals, the key that is being deleted won't be
used, and the non-thread-specific eh_globals fallback will.
for libstdc++-v3/ChangeLog
* libsupc++/eh_globals.cc [!_GLIBCXX_HAVE_TLS]
(__eh_globals_init::~__eh_globals_init): Clear _S_init first.
As in the gcc testsuite, systems without preemptive multi-threading
require sched_yield calls to be placed at points in which a context
switch might be needed to enable the test to complete.
for libstdc++-v3/ChangeLog
* testsuite/30_threads/this_thread/60421.cc (test02): Call
sched_yield.
nexttowardl is only expected to be available with C99 math, but
20_util/to_chars/long_double.cc uses it unconditionally.
State the cmath requirement in the test.
for libstdc++-v3/ChangeLog
* testsuite/20_util/to_chars/long_double.cc: Require cmath.
rtems6 declares a global struct bitset in a header file included
indirectly by sys/types.h, that ambiguates the unqualified references
to bitset after "using namespace std" in the testsuite.
Work around the namespace pollution with using declarations of
std::bitset.
for libstdc++-v3/ChangeLog
* testsuite/23_containers/bitset/cons/dr1325-2.cc: Work around
global struct bitset.
* testsuite/23_containers/bitset/ext/15361.cc: Likewise.
* testsuite/23_containers/bitset/input/1.cc: Likewise.
* testsuite/23_containers/bitset/to_string/1.cc: Likewise.
* testsuite/23_containers/bitset/to_string/dr396.cc: Likewise.
Use the just-added dry-run infrastructure to clean up files that may
have been left over by interrupted runs of outputs.exp, which used to
lead to spurious non-repeatable (self-fixing) failures.
for gcc/testsuite/ChangeLog
* gcc.misc-tests/outputs.exp: Clean up left-overs first.
The presence of -I or -L flags in link command lines changes the
driver's, and thus the linker's behavior, WRT naming files with
command-line options. With such flags, the driver creates .args.0 and
.args.1 files, whereas without them it's the linker (collect2, really)
that creates .ld1_args.
I've hit some fails on a target system that doesn't have -I or -L
flags in the board config file, but it does add some of them
implicitly with configured-in driver self specs. Alas, the test in
outputs.exp doesn't catch that, so we proceed to run rather than
skip_atsave tests.
I've reworked the outest procedure to allow dry runs and to return
would-have-been pass/fail results as lists, so we can now test whether
certain files are created and use that to configure the actual test
runs.
for gcc/testsuite/ChangeLog
* gcc.misc-tests/outputs.exp (outest): Introduce quiet mode,
create and return lists of passes and fails. Use it to catch
skip_atsave cases where -L flags are implicitly added by
driver self specs.
Other LTO tests that use -r require the lto_incremental effective
target. I suppose pr90990_0.C is missing it due to an oversight.
This patch arranges for this test to also be skipped on
non-lto_incremental targets.
for gcc/testsuite/ChangeLog
* g++.dg/lto/pr90990_0.C: Require lto_incremental target.
gcc/testsuite/ChangeLog:
* gcc.target/i386/amx-check.h (request_perm_xtile_data):
New function to check if AMX is usable and enable AMX.
(main): Run test if AMX is usable.
Fortify buffer overflow message reported.
(see https://github.com/earlephilhower/esp-quick-toolchain/issues/36)
gcc/ChangeLog:
* config/xtensa/xtensa.md (bswapsi2_internal):
Enlarge the buffer that is obviously smaller than the template
string given to sprintf().
This patch addresses PR target/105991 where a change to prefer representing
shifts and adds at the tree-level as multiplications, causes problems for
the rldimi patterns in the powerpc backend. The issue is that rs6000.md
models this pattern using IOR, and some variants that have the equivalent
PLUS or XOR in the RTL fail to match some *rotl<mode>4_insert patterns.
This is fixed in this patch by adding a define_insn_and_split to locally
canonicalize the PLUS and XOR forms to the backend's preferred IOR form.
An alternative fix might be for the RTL optimizers to define a canonical
form for these plus_xor_ior equivalent expressions, but the logical
choice might be plus (which may appear in an addressing mode), and such
a change may require a number of tweaks to update various backends
(i.e. a more intrusive change than the one proposed here).
Many thanks for Marek Polacek for bootstrapping and regression testing
this change without problems.
2022-06-22 Roger Sayle <roger@nextmovesoftware.com>
Marek Polacek <polacek@redhat.com>
Segher Boessenkool <segher@kernel.crashing.org>
Kewen Lin <linkw@linux.ibm.com>
gcc/ChangeLog
PR target/105991
* config/rs6000/rs6000.md (rotl<mode>3_insert_3): Check that
exact_log2 doesn't return -1 (or zero).
(plus_xor): New code iterator.
(*rotl<mode>3_insert_3_<code>): New define_insn_and_split.
gcc/testsuite/ChangeLog
PR target/105991
* gcc.target/powerpc/pr105991.c: New test case.
The i variable is used inside of the parallel in:
#pragma omp simd safelen(32) private (v)
for (i = 0; i < 64; i++)
{
v = 3 * i;
ll[i] = u1 + v * u2[0] + u2[1] + x + y[0] + y[1] + v + h[0] + u3[i];
}
where i is predetermined linear (so while inside of the body
it is safe, private per SIMD lane var) the final value is written to
the shared variable, and in:
for (i = 0; i < 64; i++)
if (ll[i] != u1 + 3 * i * u2[0] + u2[1] + x + y[0] + y[1] + 3 * i + 13 + 14 + i)
#pragma omp atomic write
err = 1;
which is a normal loop and so it isn't in any way privatized there.
So we have a data race, fixed by adding private (i) clause to the
parallel.
2022-06-21 Jakub Jelinek <jakub@redhat.com>
Paul Iannetta <piannetta@kalrayinc.com>
PR libgomp/106045
* testsuite/libgomp.c/target-31.c: Add private (i) clause.
Expressions of the form "X + CST < Y + CST" where:
* CST is an unsigned integer constant with only the MSB set, and
* X and Y's types have integer conversion ranks <= CST's
can be simplified to "(signed) X < (signed) Y".
This is because, assuming a 32-bit signed numbers,
(unsigned) INT_MIN + 0x80000000 is 0, and
(unsigned) INT_MAX + 0x80000000 is UINT_MAX.
i.e. the result increases monotonically with signed input.
This means:
((signed) X < (signed) Y) iff (X + 0x80000000 < Y + 0x80000000)
gcc/
PR tree-optimization/94899
* match.pd (X + C < Y + C -> (signed) X < (signed) Y, if C is
0x80000000): New simplification.
gcc/testsuite/
* gcc.dg/pr94899.c: New test.
noce_try_sign_mask as documented will optimize
if (c < 0)
x = t;
else
x = 0;
into x = (c >> bitsm1) & t;
The optimization is done if either t is unconditional
(e.g. for
x = t;
if (c >= 0)
x = 0;
) or if it is cheap. We already check that t doesn't have side-effects,
but if t is conditional, we need to punt also if it may trap or fault,
as we make it unconditional.
I've briefly skimmed other noce_try* optimizations and didn't find one that
would suffer from the same problem.
2022-06-21 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/106032
* ifcvt.cc (noce_try_sign_mask): Punt if !t_unconditional, and
t may_trap_or_fault_p, even if it is cheap.
* gcc.c-torture/execute/pr106032.c: New test.
If expand_cond_expr_using_cmove can't find a cmove optab for a particular
mode, it tries to promote the mode and perform the cmove in the promoted
mode.
The testcase in the patch ICEs on arm because in that case we pass temp which
has the promoted mode (SImode) as target to expand_operands where the
operands have the non-promoted mode (QImode).
Later on the function uses paradoxical subregs:
if (GET_MODE (op1) != mode)
op1 = gen_lowpart (mode, op1);
if (GET_MODE (op2) != mode)
op2 = gen_lowpart (mode, op2);
to change the operand modes.
The following patch fixes it by passing NULL_RTX as target if it has
promoted mode.
2022-06-21 Jakub Jelinek <jakub@redhat.com>
PR middle-end/106030
* expr.cc (expand_cond_expr_using_cmove): Pass NULL_RTX instead of
temp to expand_operands if mode has been promoted.
* gcc.c-torture/compile/pr106030.c: New test.
The if condition is at last of first bb, so side effect statement in first BB
doesn't matter, then the first if condition could also be folded to switch
table.
gcc/ChangeLog:
PR target/105740
* gimple-if-to-switch.cc (find_conditions): Don't skip the first
condition bb.
gcc/testsuite/ChangeLog:
PR target/105740
* gcc.dg/tree-ssa/if-to-switch-11.c: New test.
Signed-off-by: Xionghu Luo <xionghuluo@tencent.com>
The addr_expr computation does not check for error_mark_node before
returning the size expression. This used to work in the constant case
because the conversion to uhwi would end up causing it to return
size_unknown, but that won't work for the dynamic case.
Modify the control flow to explicitly return size_unknown if the offset
computation returns an error_mark_node.
gcc/ChangeLog:
PR tree-optimization/105736
* tree-object-size.cc (addr_object_size): Return size_unknown
when object offset computation returns an error.
gcc/testsuite/ChangeLog:
PR tree-optimization/105736
* gcc.dg/builtin-dynamic-object-size-0.c (TV4): New struct.
(val3): New variable.
(test_pr105736): New test.
(main): Call it.
Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>
The presence of the color markers in the some of the asan tests
appears to confuse the dg-output matching (possibly a platform
TCL or termios bug) on some Darwin platforms.
Since the color is not being tested, switch it off (makes the log
files easier to read too). This fixes a large number of spurious
test fails on AVX512 Darwin19.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/testsuite/ChangeLog:
* lib/asan-dg.exp: Do not apply color to asan output when
under test.
Disallow siball when calling ifunc functions with PIC register so that
PIC register can be restored.
gcc/
PR target/105960
* config/i386/i386.cc (ix86_function_ok_for_sibcall): Return
false if PIC register is used when calling ifunc functions.
gcc/testsuite/
PR target/105960
* gcc.target/i386/pr105960.c: New test.
Darwin does not support patchable function entries, skip the test
there.
gcc/testsuite/ChangeLog:
* g++.dg/modules/pr105169_a.C: Skip the test on Darwin.
* g++.dg/modules/pr105169_b.C: Likewise.
For targets without alias support, we emit two essentially identical function
bodies into the gimple (complete and base CTORs). So this test needs to allow
for that when the target does not support aliases. The target support alias
test does not seem to be usable in the context of a single scan-tree-dump so
the fix here uses the target designation.
Note that the array has 10 elements, so that if the test were failing (because
we were emitting 10 inits instead of a loop) the count would be expected to
exceed 2, on Darwin and 1 where there's alias support.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/testsuite/ChangeLog:
* g++.dg/init/array61.C: Allow for two CTOR bodies on Darwin, where
aliases are not currently supported.
Unfortunately, there is more fall-out in the testsuite for my changes
to use MVE move-immediate operations instead of literal pool loads.
Fixed as follows:
gcc/testsuite/ChangeLog:
* gcc.target/arm/simd/mve-vcmp-f32-2.c: Adjust expected output.
* gcc.target/arm/simd/pr100757.c: Likewise.
* gcc.target/arm/simd/pr100757-2.c: Likewise.
* gcc.target/arm/simd/pr100757-3.c: Likewise.
* gcc.target/arm/simd/pr100757-4.c: Likewise.
Fixes this test on Darwin.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/testsuite/ChangeLog:
* g++.dg/modules/init-2_b.C: Add a missing USER_LABEL_PREFIX
to a regex.
The attr-cdtor-1 test fails on targets without init priority since the
diagnostic emitted concerns the absence of support. Disable the test
on such targets.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/testsuite/ChangeLog:
* c-c++-common/attr-cdtor-1.c: Requite init_priority support.
The fold_to_nonsharp_ineq_using_bound folding ends up creating invalid
typed IL which confuses later foldings. The following fixes that.
2022-06-20 Richard Biener <rguenther@suse.de>
PR middle-end/106027
* fold-const.cc (fold_to_nonsharp_ineq_using_bound): Use the
type of the prevailing comparison for the new comparison type.
(fold_binary_loc): Use proper types for the A < X && A + 1 > Y
to A < X && A >= Y folding.
* gcc.dg/pr106027.c: New testcase.
This follows Richi's suggestion in PR105940, it aims to avoid
inconsistent slp decision between when the suggested unroll
factor is worked out and when the suggested unroll factor is
applied.
If the previous slp decision is true when the suggested unroll
factor is worked out, when we are applying unroll factor we
don't need to start over with slp off if the analysis with slp
on fails. On the other hand, if the previous slp decision is
false when the suggested unroll factor is worked out, when we
are applying unroll factor we can skip the slp handlings.
Function vect_is_simple_reduction saves reduction chains for
subsequent slp analyses, we have to disable this early otherwise
there is an ICE in vectorizable_reduction for below:
if (REDUC_GROUP_FIRST_ELEMENT (stmt_info))
gcc_assert (slp_node
&& REDUC_GROUP_FIRST_ELEMENT (stmt_info)
== stmt_info);
PR tree-optimization/105940
gcc/ChangeLog:
* tree-vect-loop.cc (vect_analyze_loop_2): Add new parameter
slp_done_for_suggested_uf and adjust with it accordingly.
(vect_analyze_loop_1): Add new variable slp_done_for_suggested_uf,
pass it down to vect_analyze_loop_2 for the initial analysis and
applying suggested unroll factor.
(vect_is_simple_reduction): Add parameter slp and adjust with it.
(vect_analyze_scalar_cycles_1): Add parameter slp and pass down.
(vect_analyze_scalar_cycles): Likewise.
That supports skipping of an object file (LDPS_NO_SYMS).
lto-plugin/ChangeLog:
* lto-plugin.c (struct plugin_file_info): Add skip_file flag.
(write_resolution): Write resolution only if get_symbols != LDPS_NO_SYMS.
(all_symbols_read_handler): Ignore file if skip_file is true.
(onload): Handle LDPT_GET_SYMBOLS_V3.
We changed builtins format about zicbom and zicboz subextensions and modified test cases.
diff with the previous version:
1.We modified the FUNCTION_TYPE from RISCV_VOID_FTYPE_SI/DI to RISCV_VOID_FTYPE_VOID_PTR.
2.We added a new RISCV_ATYPE_VOID_PTR in riscv-builtins.cc and a new DEF_RISCV_FTYPE (1, (VOID, VOID_PTR)) in riscv-ftypes.def.
3.We deleted DEF_RISCV_FTYPE (1, (VOID, SI/DI)).
4.We modified the input parameters of the test cases.
Thanks, Simon and Kito.
gcc/ChangeLog:
* config/riscv/riscv-builtins.cc (RISCV_ATYPE_VOID_PTR): New.
* config/riscv/riscv-cmo.def (RISCV_BUILTIN): Changed the FUNCTION_TYPE
of RISCV_BUILTIN.
* config/riscv/riscv-ftypes.def (0): Remove unused.
(1): New.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/cmo-zicbom-1.c: modified the input parameters.
* gcc.target/riscv/cmo-zicbom-2.c: modified the input parameters.
* gcc.target/riscv/cmo-zicboz-1.c: modified the input parameters.
* gcc.target/riscv/cmo-zicboz-2.c: modified the input parameters.
These instructions will all be converted to L32R ones with litpool entries
by the assembler.
gcc/ChangeLog:
* config/xtensa/xtensa.cc (xtensa_is_insn_L32R_p):
Consider relaxed MOVI instructions as L32R.
No functional changes.
gcc/ChangeLog:
* config/xtensa/xtensa.cc (xtensa_emit_move_sequence):
Use can_create_pseudo_p(), instead of using individual
reload_in_progress and reload_completed.
(xtensa_expand_block_set_small_loop): Use xtensa_simm8x256(),
the existing predicate function.
(xtensa_is_insn_L32R_p, gen_int_relational, xtensa_emit_sibcall):
Use the standard RTX code predicate macros such as MEM_P,
SYMBOL_REF_P and/or CONST_INT_P.
* config/xtensa/xtensa.md: Avoid using numeric literals to determine
if callee-saved register, at the split patterns for indirect sibcall
fixups.
On Thu, Jun 16, 2022 at 09:32:02PM +0100, Jonathan Wakely wrote:
> It looks like clang has addressed this deficiency now:
>
> https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html#usage
Thanks, that is roughly what I'd implement anyway and apparently they have
it already since 2015, we've added the -fsanitize-undefined-trap-on-error
support back in 2014 and didn't change it since then.
As a small divergence from clang, I chose -fsanitize-undefined-trap-on-error
to be a (deprecated) alias for -fsanitize-trap aka -fsanitize-trap=all
rather thn -fsanitize-trap=undefined which seems to be what clang does,
because for a deprecated option it is IMHO more important backwards
compatibility with what gcc did over the past 8 years rather than clang
compatibility.
Some sanitizers (e.g. asan, lsan, tsan) don't support traps,
-fsanitize-trap=address etc. will be rejected (if enabled at the end of
command line), -fno-sanitize-trap= can be specified even for them.
This is similar behavior to -fsanitize-recover=.
One complication is vptr sanitization, which can't easily trap,
as the whole slow path of the checking is inside of libubsan.
Previously, -fsanitize=vptr -fsanitize-undefined-trap-on-error
silently ignored vptr sanitization.
This patch similarly to what clang does will accept
-fsanitize-trap=all or -fsanitize-trap=undefined which enable
the vptr bit as trapping and again that causes silent disabling
of vptr sanitization, while -fsanitize-trap=vptr is rejected
(already during option processing).
2022-06-18 Jakub Jelinek <jakub@redhat.com>
gcc/
* common.opt (flag_sanitize_trap): New variable.
(fsanitize-trap=, fsanitize-trap): New options.
(fsanitize-undefined-trap-on-error): Change into deprecated alias
for -fsanitize-trap=all.
* opts.h (struct sanitizer_opts_s): Add can_trap member.
* opts.cc (finish_options): Complain about unsupported
-fsanitize-trap= options.
(sanitizer_opts): Add can_trap values to all entries.
(get_closest_sanitizer_option): Ignore -fsanitize-trap=
options which have can_trap false.
(parse_sanitizer_options): Add support for -fsanitize-trap=.
For -fsanitize-trap=all, enable
SANITIZE_UNDEFINED | SANITIZE_UNDEFINED_NONDEFAULT. Disallow
-fsanitize-trap=vptr here.
(common_handle_option): Handle OPT_fsanitize_trap_ and
OPT_fsanitize_trap.
* sanopt.cc (maybe_optimize_ubsan_null_ifn): Check
flag_sanitize_trap & SANITIZE_{NULL,ALIGNMENT} instead of
flag_sanitize_undefined_trap_on_error.
* gcc.cc (sanitize_spec_function): Use
flag_sanitize & ~flag_sanitize_trap instead of flag_sanitize
and drop use of flag_sanitize_undefined_trap_on_error in
"undefined" handling.
* ubsan.cc (ubsan_instrument_unreachable): Use
flag_sanitize_trap & SANITIZE_??? instead of
flag_sanitize_undefined_trap_on_error.
(ubsan_expand_bounds_ifn, ubsan_expand_null_ifn,
ubsan_expand_objsize_ifn, ubsan_expand_ptr_ifn,
ubsan_build_overflow_builtin, instrument_bool_enum_load,
ubsan_instrument_float_cast, instrument_nonnull_arg,
instrument_nonnull_return, instrument_builtin): Likewise.
* doc/invoke.texi (-fsanitize-trap=, -fsanitize-trap): Document.
(-fsanitize-undefined-trap-on-error): Document as deprecated
alias of -fsanitize-trap.
gcc/c-family/
* c-ubsan.cc (ubsan_instrument_division, ubsan_instrument_shift):
Use flag_sanitize_trap & SANITIZE_??? instead of
flag_sanitize_undefined_trap_on_error. If 2 sanitizers are involved
and flag_sanitize_trap differs for them, emit __builtin_trap only
for the comparison where trap is requested.
(ubsan_instrument_vla, ubsan_instrument_return): Use
lag_sanitize_trap & SANITIZE_??? instead of
flag_sanitize_undefined_trap_on_error.
gcc/cp/
* cp-ubsan.cc (cp_ubsan_instrument_vptr_p): Use
flag_sanitize_trap & SANITIZE_VPTR instead of
flag_sanitize_undefined_trap_on_error.
gcc/testsuite/
* c-c++-common/ubsan/nonnull-4.c: Use -fsanitize-trap=all
instead of -fsanitize-undefined-trap-on-error.
* c-c++-common/ubsan/div-by-zero-4.c: Use
-fsanitize-trap=signed-integer-overflow instead of
-fsanitize-undefined-trap-on-error.
* c-c++-common/ubsan/overflow-add-4.c: Use -fsanitize-trap=undefined
instead of -fsanitize-undefined-trap-on-error.
* c-c++-common/ubsan/pr56956.c: Likewise.
* c-c++-common/ubsan/pr68142.c: Likewise.
* c-c++-common/ubsan/pr80932.c: Use
-fno-sanitize-trap=all -fsanitize-trap=shift,undefined
instead of -fsanitize-undefined-trap-on-error.
* c-c++-common/ubsan/align-8.c: Use -fsanitize-trap=alignment
instead of -fsanitize-undefined-trap-on-error.
The following testcase ICEs because there is NON_LVALUE_EXPR (location
wrapper) around a VAR_DECL and has TYPE_MODE V2SImode and
SCALAR_INT_TYPE_MODE on that ICEs. Or for -m32 -march=i386 TYPE_MODE
is DImode, but SCALAR_INT_TYPE_MODE still uses the raw V2SImode and ICEs
too.
2022-06-18 Jakub Jelinek <jakub@redhat.com>
PR middle-end/105998
* varasm.cc (narrowing_initializer_constant_valid_p): Check
SCALAR_INT_MODE_P instead of INTEGRAL_MODE_P, also break on
! INTEGRAL_TYPE_P and do the same check also on op{0,1}'s type.
* c-c++-common/pr105998.c: New test.
This patch resolves PR tree-optimization/105835, which is a code quality
(dead code elimination) regression at -O1 triggered/exposed by a recent
change to canonicalize X&-Y as X*Y. The new (shorter) form exposes some
missed optimization opportunities that can be handled by adding some
extra simplifications to match.pd.
One transformation is to simplify "(short)(x ? 65535 : 0)" into the
equivalent "x ? -1 : 0", or more accurately x ? (short)-1 : (short)0",
as INTEGER_CSTs record their type, and integer conversions can be
pushed inside COND_EXPRs reducing the number of gimple statements.
The other transformation is that (short)(X * 65535), where X is [0,1],
into the equivalent (short)X * -1, (or again (short)-1 where tree's
INTEGER_CSTs encode their type). This is valid because multiplications
where one operand is [0,1] are guaranteed not to overflow, and hence
integer conversions can also be pushed inside these multiplications.
These narrowing conversion optimizations can be identified by range
analyses, such as EVRP, but these are only performed at -O2 and above,
which is why this regression is only visible with -O1.
2022-06-18 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
PR tree-optimization/105835
* match.pd (convert (mult zero_one_valued_p@1 INTEGER_CST@2)):
Narrow integer multiplication by a zero_one_valued_p operand.
(convert (cond @1 INTEGER_CST@2 INTEGER_CST@3)): Push integer
conversions inside COND_EXPR where both data operands are
integer constants.
gcc/testsuite/ChangeLog
PR tree-optimization/105835
* gcc.dg/pr105835.c: New test case.
In this case the STATIC_CAST_EXPR expressions in the call aren't
type nor value dependent, but maybe_constant_value still ICEs on those
when processing_template_decl. Calling fold_non_dependent_expr on it
instead fixes the ICE and folds them to INTEGER_CSTs.
2022-06-17 Jakub Jelinek <jakub@redhat.com>
PR c++/106001
* typeck.cc (build_x_shufflevector): Use fold_non_dependent_expr
instead of maybe_constant_value.
* g++.dg/ext/builtin-shufflevector-4.C: New test.
This patch introduces alpha-specific version of store_data_bypass_p that
ignores TRAP_IF that would result in assertion failure (and internal
compiler error) in the generic store_data_bypass_p function.
While at it, also remove ev4_ist_c reservation, store_data_bypass_p
can handle the patterns with multiple sets since some time ago.
2022-06-17 Uroš Bizjak <ubizjak@gmail.com>
gcc/ChangeLog:
PR target/105209
* config/alpha/alpha-protos.h (alpha_store_data_bypass_p): New.
* config/alpha/alpha.cc (alpha_store_data_bypass_p): New function.
(alpha_store_data_bypass_p_1): Ditto.
* config/alpha/ev4.md: Use alpha_store_data_bypass_p instead
of generic store_data_bypass_p.
(ev4_ist_c): Remove insn reservation.
gcc/testsuite/ChangeLog:
PR target/105209
* gcc.target/alpha/pr105209.c: New test.
The mode of pointer argument should equal ptr_mode, not Pmode.
2022-06-17 Uroš Bizjak <ubizjak@gmail.com>
gcc/ChangeLog:
PR target/105970
* config/i386/i386.cc (ix86_function_arg): Assert that
the mode of pointer argumet is equal to ptr_mode, not Pmode.
gcc/testsuite/ChangeLog:
PR target/105970
* gcc.target/i386/pr105970.c: New test.
REGNO should not be used with register_operand before reload because
subregs of registers or even subregs of memory match the predicate.
The build with RTL checking enabled does not tolerate REGNO with
non-reg operand.
The patch splits the splitter into two related splitters and uses
(match_dup ...) RTXes instead of REGNO comparisons.
2022-06-17 Uroš Bizjak <ubizjak@gmail.com>
gcc/ChangeLog:
PR target/105993
* config/i386/sse.md (vpmov splitter): Use (match_dup ...)
instead of REGNO comparisons in combine splitter.
gcc/testsuite/ChangeLog:
PR target/105993
* gcc.target/i386/pr105993.c: New test.
Sigh, another instance where I incorrectly used XUINT instead of
UINTVAL.
I've also made the code here a little more robust (although I think
this case can't in fact be reached) if the 32-bit clear mask includes
bit 31. This case, if reached, would print out an out-of-range value
based on the size of the compiler's HOST_WIDE_INT type due to
sign-extension. We avoid this by masking the value after inversion.
gcc/ChangeLog:
PR target/106004
* config/arm/arm.cc (arm_print_operand, case 'V'): Use UINTVAL.
Clear bits in the mask above bit 31.