The following patch adds a simple check to prevent slp stmts from
vector constructors being rearranged. vect_attempt_slp_rearrange_stmts
tries to rearrange to avoid a load permutation.
This fixes PR target/96837
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96827
gcc/ChangeLog:
2020-09-29 Joel Hutton <joel.hutton@arm.com>
PR target/96837
* tree-vect-slp.c (vect_analyze_slp): Do not call
vect_attempt_slp_rearrange_stmts for vector constructors.
gcc/testsuite/ChangeLog:
2020-09-29 Joel Hutton <joel.hutton@arm.com>
PR target/96837
* gcc.dg/vect/bb-slp-49.c: New test.
This is a small refactoring which introduces SLP_TREE_REF_COUNT and replaces
the uses of refcnt with it. This for consistency between the other properties.
A similar patch was pre-approved last year but since there are more use now I am
sending it for review anyway.
gcc/ChangeLog:
* tree-vectorizer.h (SLP_TREE_REF_COUNT): New.
* tree-vect-slp.c (_slp_tree::_slp_tree, _slp_tree::~_slp_tree,
vect_free_slp_tree, vect_build_slp_tree, vect_print_slp_tree,
slp_copy_subtree, vect_attempt_slp_rearrange_stmts): Use it.
Now hiddenness is managed by name-lookup, we no longer need DECL_HIDDEN_FRIEND_P.
This removes it. Mainly by deleting its bookkeeping, but there are a couple of uses
1) two name lookups look at it to see if they found a hidden thing.
In one we have the OVERLOAD, so can record OVL_HIDDEN_P. In the other
we're repeating a lookup that failed, but asking for hidden things --
so if that succeeds we know the thing was hidden. (FWIW CWG recently
discussed whether template specializations and instantiations should
see such hidden templates anyway, there is compiler divergence.)
2) We had a confusing setting of KOENIG_P when building a
non-dependent call. We don't repeat that lookup at instantiation time
anyway.
gcc/cp/
* cp-tree.h (struct lang_decl_fn): Remove hidden_friend_p.
(DECL_HIDDEN_FRIEND_P): Delete.
* call.c (add_function_candidate): Drop assert about anticipated
decl.
(build_new_op_1): Drop koenig lookup flagging for hidden friend.
* decl.c (duplicate_decls): Drop HIDDEN_FRIEND_P updating.
* name-lookup.c (do_pushdecl): Likewise.
(set_decl_namespace): Discover hiddenness from OVL_HIDDEN_P.
* pt.c (check_explicit_specialization): Record found_hidden
explicitly.
2020-30-09 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/97045
* trans-array.c (gfc_conv_array_ref): Make sure that the class
decl is passed to build_array_ref in the case of unlimited
polymorphic entities.
* trans-expr.c (gfc_conv_derived_to_class): Ensure that array
refs do not preceed the _len component. Free the _len expr.
* trans-stmt.c (trans_associate_var): Reset 'need_len_assign'
for polymorphic scalars.
* trans.c (gfc_build_array_ref): When the vptr size is used for
span, multiply by the _len field of unlimited polymorphic
entities, when non-zero.
gcc/testsuite/
PR fortran/97045
* gfortran.dg/select_type_50.f90 : New test.
GCC has a target hook TARGET_LIBC_HAS_FUNCTION, which tells the compiler
which functions it can expect to be present in libc.
The default target hook does not include the sincos functions.
The nvptx port of newlib does include sincos and sincosf, but not sincosl.
The target hook TARGET_LIBC_HAS_FUNCTION does not distinguish between sincos,
sincosf and sincosl, so if we enable it for the sincos functions, then for
test.c:
...
long double x, a, b;
int main (void) {
x = 0.5;
a = sinl (x);
b = cosl (x);
printf ("a: %f\n", (double)a);
printf ("b: %f\n", (double)b);
return 0;
}
...
we introduce a regression:
...
$ gcc test.c -lm -O2
unresolved symbol sincosl
collect2: error: ld returned 1 exit status
...
Add a type argument to target hook TARGET_LIBC_HAS_FUNCTION_TYPE, and use it
in nvptx_libc_has_function_type to enable sincos and sincosf, but not sincosl.
Build and reg-tested on x86_64-linux.
Build and tested on nvptx.
gcc/ChangeLog:
2020-09-28 Tobias Burnus <tobias@codesourcery.com>
Tom de Vries <tdevries@suse.de>
* builtins.c (expand_builtin_cexpi, fold_builtin_sincos): Update
targetm.libc_has_function call.
* builtins.def (DEF_C94_BUILTIN, DEF_C99_BUILTIN, DEF_C11_BUILTIN):
(DEF_C2X_BUILTIN, DEF_C99_COMPL_BUILTIN, DEF_C99_C90RES_BUILTIN):
Same.
* config/darwin-protos.h (darwin_libc_has_function): Update prototype.
* config/darwin.c (darwin_libc_has_function): Add arg.
* config/linux-protos.h (linux_libc_has_function): Update prototype.
* config/linux.c (linux_libc_has_function): Add arg.
* config/i386/i386.c (ix86_libc_has_function): Update
targetm.libc_has_function call.
* config/nvptx/nvptx.c (nvptx_libc_has_function): New function.
(TARGET_LIBC_HAS_FUNCTION): Redefine to nvptx_libc_has_function.
* convert.c (convert_to_integer_1): Update targetm.libc_has_function
call.
* match.pd: Same.
* target.def (libc_has_function): Add arg.
* doc/tm.texi: Regenerate.
* targhooks.c (default_libc_has_function, gnu_libc_has_function)
(no_c99_libc_has_function): Add arg.
* targhooks.h (default_libc_has_function, no_c99_libc_has_function)
(gnu_libc_has_function): Update prototype.
* tree-ssa-math-opts.c (pass_cse_sincos::execute): Update
targetm.libc_has_function call.
gcc/fortran/ChangeLog:
2020-09-30 Tom de Vries <tdevries@suse.de>
* f95-lang.c (gfc_init_builtin_functions): Update
targetm.libc_has_function call.
Since MOVDIRI and MOVDIR64B write to memory, similar to UNSPEC_MOVNT,
use SET operation in MOVDIRI and MOVDIR64B patterns with UNSPEC instead
of UNSPECV.
gcc/
PR target/97184
* config/i386/i386.md (UNSPECV_MOVDIRI): Renamed to ...
(UNSPEC_MOVDIRI): This.
(UNSPECV_MOVDIR64B): Renamed to ...
(UNSPEC_MOVDIR64B): This.
(movdiri<mode>): Use SET operation.
(@movdir64b_<mode>): Likewise.
gcc/testsuite/
PR target/97184
* gcc.target/i386/movdir64b.c: New test.
* gcc.target/i386/movdiri32.c: Likewise.
* gcc.target/i386/movdiri64.c: Likewise.
* lib/target-supports.exp (check_effective_target_movdir): New.
Before commit 7e43716200 "[testsuite] Require non_strict_align in
pr94600-{1,3}.c", some tests were failing for nvptx, because volatile stores
were expected, but memcpy's were found instead.
This was traced back to this bit in compute_record_mode:
...
/* If structure's known alignment is less than what the scalar
mode would need, and it matters, then stick with BLKmode. */
if (mode != BLKmode
&& STRICT_ALIGNMENT
&& ! (TYPE_ALIGN (type) >= BIGGEST_ALIGNMENT
|| TYPE_ALIGN (type) >= GET_MODE_ALIGNMENT (mode)))
{
/* If this is the only reason this type is BLKmode, then
don't force containing types to be BLKmode. */
TYPE_NO_FORCE_BLK (type) = 1;
mode = BLKmode;
}
...
which got triggered for nvptx, but not for x86_64.
The commit disabled the tests for non_strict_align effective target, but
that had the effect for the arm target that those tests were disabled, even
though they were passing before.
Further investigation in compute_record_mode shows that the if-condition
evaluates to false for arm because, because TYPE_ALIGN (type) == 32, while
it's 8 for nvptx. This again can be explained by the
PCC_BITFIELD_TYPE_MATTERS setting, which is 1 for arm, but 0 for nvptx.
Re-enable the test for arm by using effective target
(non_strict_align || pcc_bitfield_type_matters).
Tested on arm-eabi and nvptx.
gcc/testsuite/ChangeLog:
2020-09-30 Tom de Vries <tdevries@suse.de>
* gcc.dg/pr94600-1.c: Use effective target
(non_strict_align || pcc_bitfield_type_matters).
* gcc.dg/pr94600-3.c: Same.
These tests were missing dg-requires-effective-targets to ensure they
are UNSUPPORTED if the assembler doesn't have AMX support.
2020-09-30 Jakub Jelinek <jakub@redhat.com>
* gcc.target/i386/amxint8-dpbssd-2.c: Require effective targets
amx_tile and amx_int8.
* gcc.target/i386/amxint8-dpbsud-2.c: Likewise.
* gcc.target/i386/amxint8-dpbusd-2.c: Likewise.
* gcc.target/i386/amxint8-dpbuud-2.c: Likewise.
* gcc.target/i386/amxbf16-dpbf16ps-2.c: Require effective targets
amx_tile and amx_bf16.
* gcc.target/i386/amxtile-2.c: Require effective target amx_tile.
In this PR the second argument to the intrinsics should be signed but we
use an unsigned one erroneously.
The corresponding builtins are already using the correct types so it's
just a matter of correcting the signatures in arm_neon.h
gcc/
PR target/97150
* config/aarch64/arm_neon.h (vqrshlb_u8): Make second argument
signed.
(vqrshlh_u16): Likewise.
(vqrshls_u32): Likewise.
(vqrshld_u64): Likewise.
(vqshlb_u8): Likewise.
(vqshlh_u16): Likewise.
(vqshls_u32): Likewise.
(vqshld_u64): Likewise.
(vshld_u64): Likewise.
gcc/testsuite/
PR target/97150
* gcc.target/aarch64/pr97150.c: New test.
In this PR we have the wrong return type for some intrinsics. It should
be unsigned, but we implement it as signed.
Fix this by adjusting the type qualifiers used when creating the
builtins and fixing the type in the arm_neon.h intrinsic.
With the adjustment in qualifiers we now don't need to cast the result
when returning.
Bootstrapped and tested on aarch64-none-linux-gnu.
gcc/
PR target/96313
* config/aarch64/aarch64-simd-builtins.def (sqmovun): Use UNOPUS
qualifiers.
* config/aarch64/arm_neon.h (vqmovun_s16): Adjust builtin call.
Remove unnecessary result cast.
(vqmovun_s32): Likewise.
(vqmovun_s64): Likewise.
(vqmovunh_s16): Likewise. Fix return type.
(vqmovuns_s32): Likewise.
(vqmovund_s64): Likewise.
gcc/testsuite/
PR target/96313
* gcc.target/aarch64/pr96313.c: New test.
* gcc.target/aarch64/scalar_intrinsics.c (test_vqmovunh_s16):
Adjust return type.
(test_vqmovuns_s32): Likewise.
(test_vqmovund_s64): Likewise.
movti lacked an way of zeroing an FPR, meaning that we'd do:
mov x0, 0
mov x1, 0
fmov d0, x0
fmov v0.d[1], x1
instead of just:
movi v0.2d, #0
movtf had the opposite problem for GPRs: we'd generate:
movi v0.2d, #0
fmov x0, d0
fmov x1, v0.d[1]
instead of just:
mov x0, 0
mov x1, 0
Also, there was an unnecessary earlyclobber on the GPR<-GPR movtf
alternative (but not the movti one). The splitter handles overlap
correctly.
The TF splitter used aarch64_reg_or_imm, but the _imm part only
accepts integer constants, not floating-point ones. The patch
changes it to nonmemory_operand instead.
gcc/
* config/aarch64/aarch64.c (aarch64_split_128bit_move_p): Add a
function comment. Tighten check for FP moves.
* config/aarch64/aarch64.md (*movti_aarch64): Add a w<-Z alternative.
(*movtf_aarch64): Handle r<-Y like r<-r. Remove unnecessary
earlyclobber. Change splitter predicate from aarch64_reg_or_imm
to nonmemory_operand.
gcc/testsuite/
* gcc.target/aarch64/movtf_1.c: New test.
* gcc.target/aarch64/movti_1.c: Likewise.
This patch fixes ICEs when compiling
gcc/testsuite/gcc.target/arm/pure-code/no-literal-pool.c with
-mfp16-format=ieee -mfloat-abi=hard -march=armv8.1-m.main+mve
-mpure-code.
The existing conditions in the movsf/movdf expanders (as well as the
no_literal_pool patterns) were too restrictive, requiring
TARGET_HARD_FLOAT instead of TARGET_VFP_BASE, which caused unrecognised
insns when compiling this testcase with integer MVE and -mpure-code.
gcc/ChangeLog:
PR target/97251
* config/arm/arm.md (movsf): Relax TARGET_HARD_FLOAT to
TARGET_VFP_BASE.
(movdf): Likewise.
* config/arm/vfp.md (no_literal_pool_df_immediate): Likewise.
(no_literal_pool_sf_immediate): Likewise.
We have too many tablejump patterns. Using parameterized names
simplifies the code a bit.
2020-09-29 Segher Boessenkool <segher@kernel.crashing.org>
* config/rs6000/rs6000.md (tablejump): Simplify.
(tablejumpsi): Merge this ...
(tablejumpdi): ... and this ...
(@tablejump<mode>_normal): ... into this.
(tablejumpsi_nospec): Merge this ...
(tablejumpdi_nospec): ... and this ...
(@tablejump<mode>_nospec): ... into this.
(*tablejump<mode>_internal1): Delete, rename to ...
(@tablejump<mode>_insn_normal): ... this.
(*tablejump<mode>_internal1_nospec): Delete, rename to ...
(@tablejump<mode>_insn_nospec): ... this.
Resolves:
PR middle-end/97188 - ICE passing a null VLA to a function expecting at least one element
gcc/ChangeLog:
PR middle-end/97188
* calls.c (maybe_warn_rdwr_sizes): Simplify warning messages.
Correct handling of VLA argumments.
gcc/testsuite/ChangeLog:
PR middle-end/97188
* gcc.dg/Wstringop-overflow-23.c: Adjust text of expected warnings.
* gcc.dg/Wnonnull-4.c: New test.
This new warning can be used to prevent expensive copies inside range-based
for-loops, for instance:
struct S { char arr[128]; };
void fn () {
S arr[5];
for (const auto x : arr) { }
}
where auto deduces to S and then we copy the big S in every iteration.
Using "const auto &x" would not incur such a copy. With this patch the
compiler will warn:
q.C:4:19: warning: loop variable 'x' creates a copy from type 'const S' [-Wrange-loop-construct]
4 | for (const auto x : arr) { }
| ^
q.C:4:19: note: use reference type 'const S&' to prevent copying
4 | for (const auto x : arr) { }
| ^
| &
As per Clang, this warning is suppressed for trivially copyable types
whose size does not exceed 64B. The tricky part of the patch was how
to figure out if using a reference would have prevented a copy. To
that point, I'm using the new function called ref_conv_binds_directly_p.
This warning is enabled by -Wall. Further warnings of similar nature
should follow soon.
gcc/c-family/ChangeLog:
PR c++/94695
* c.opt (Wrange-loop-construct): New option.
gcc/cp/ChangeLog:
PR c++/94695
* call.c (ref_conv_binds_directly_p): New function.
* cp-tree.h (ref_conv_binds_directly_p): Declare.
* parser.c (warn_for_range_copy): New function.
(cp_convert_range_for): Call it.
gcc/ChangeLog:
PR c++/94695
* doc/invoke.texi: Document -Wrange-loop-construct.
gcc/testsuite/ChangeLog:
PR c++/94695
* g++.dg/warn/Wrange-loop-construct.C: New test.
PR analyzer/95188 reports that diagnostics from
-Wanalyzer-unsafe-call-within-signal-handler use the wrong
source location when reporting the signal-handler registration
event in the diagnostic_path. The diagnostics erroneously use the
location of the first stmt in the basic block containing the call
to "signal", rather than that of the call itself.
Fixed thusly.
gcc/analyzer/ChangeLog:
PR analyzer/95188
* engine.cc (stmt_requires_new_enode_p): Split enodes before
"signal" calls.
gcc/testsuite/ChangeLog:
PR analyzer/95188
* gcc.dg/analyzer/signal-registration-loc.c: New test.
Extends the configure check for zstd.h to also verify the zstd version,
since gcc requires features that only exist in 1.3.0 and newer. Without
this patch we get a build error for lto-compress.c when using an old zstd
version.
gcc/
PR bootstrap/97183
* configure.ac (gcc_cv_header_zstd_h): Check ZSTD_VERISON_NUMBER.
* configure: Regenerated.
This adds support for the Arm Cortex-X1 CPU. For more information about this
processor, see [0].
[0] : https://www.arm.com/products/cortex-x
gcc/ChangeLog:
* config/arm/arm-cpus.in: Add Cortex-X1 core.
* config/arm/arm-tables.opt: Regenerate.
* config/arm/arm-tune.md: Regenerate.
* doc/invoke.texi: Update docs.
This adds support for the Arm Cortex-X1 CPU in AArch64 GCC. For more
information about this processor, see [0].
[0] : https://www.arm.com/products/cortex-x
gcc/ChangeLog:
* config/aarch64/aarch64-cores.def: Add Cortex-X1 Arm core.
* config/aarch64/aarch64-tune.md: Regenerate.
* doc/invoke.texi: Add -mtune=cortex-x1 docs.
This patch moves the handling of decl-hiddenness entirely into the
name lookup machinery, where it belongs. We need a few new flags,
because pressing the existing OVL_HIDDEN_P into play for non-function
decls doesn't work well. For a local binding we only need one marker,
as there cannot be both a hidden implicit typedef and a hidden
function. That's not true for namespace-scope, where they could both
be hidden.
The name-lookup machinery maintains the existing decl_hidden and co
flags, and asserts have been sprinkled around to make sure they are
consistent. The next series of patches will remove those old markers.
(we'll need to keep one, as there are some special restrictions on
redeclaring friend functions with in-class definitions or default args.)
gcc/cp/
* cp-tree.h (ovl_insert): Change final parm to hidden-or-using
indicator.
* name-lookup.h (HIDDEN_TYPE_BINDING_P): New.
(struct cxx_binding): Add type_is_hidden flag.
* tree.c (ovl_insert): Change using_p parm to using_or_hidden,
adjust.
(ovl_skip_hidden): Assert we never see a naked hidden decl.
* decl.c (xref_tag_1): Delete unhiding friend from here (moved to
lookup_elaborated_type_1).
* name-lookup.c (STAT_TYPE_HIDDEN_P, STAT_DECL_HIDDEN_P): New.
(name_lookup::search_namespace_only): Check new hidden markers.
(cxx_binding_make): Clear HIDDEN_TYPE_BINDING_P.
(update_binding): Update new hidden markers.
(lookup_name_1): Check HIDDEN_TYPE_BINDING_P and simplify friend
ignoring.
(lookup_elaborated_type_1): Use new hidden markers. Reveal the
decl here.
Fix 2 typos in config/i386/enqcmdintrin.h by replacing <enqcmdntrin.h>
with <enqcmdintrin.h>:
[hjl@gnu-cfl-2 x86-gcc]$ echo "#include <enqcmdintrin.h>" | gcc -S -o /dev/null -x c -
In file included from <stdin>:1:
/usr/lib/gcc/x86_64-redhat-linux/10/include/enqcmdintrin.h:25:3: error: #error "Never use <enqcmdntrin.h> directly; include <x86intrin.h> instead."
25 | # error "Never use <enqcmdntrin.h> directly; include <x86intrin.h> instead."
| ^~~~~
[hjl@gnu-cfl-2 x86-gcc]$
and _ENQCMDINTRIN_H_INCLUDED with _ENQCMDINTRIN_H_INCLUDED.
gcc/
PR target/97247
* config/i386/enqcmdintrin.h: Replace <enqcmdntrin.h> with
<enqcmdintrin.h>. Replace _ENQCMDNTRIN_H_INCLUDED with
_ENQCMDINTRIN_H_INCLUDED.
Here are a few cleanups, prior to landing the hidden decl changes.
1) Clear cxx_binding flags in the allocator, not at each user of the allocator.
2) Refactor update_binding. The logic was getting too convoluted.
3) Set friendliness and anticipatedness before pushing a template decl (not after).
gcc/cp/
* name-lookup.c (create_local_binding): Do not clear
INHERITED_VALUE_BINDING_P here.
(name_lookup::process_binding): Move done hidden-decl triage to ...
(name_lookup::search_namespace_only): ... here, its only caller.
(cxx_binding_make): Clear flags here.
(push_binding): Not here.
(pop_local_binding): RAII.
(update_binding): Refactor.
(do_pushdecl): Assert we're never revealing a local binding.
(do_pushdecl_with_scope): Directly call do_pushdecl.
(get_class_binding): Do not clear LOCAL_BINDING_P here.
* pt.c (push_template_decl): Set friend & anticipated before
pushing.
AIX stdio.h implicitly includes sys/types.h, which implicitly includes
inttypes.h. With a recent AIX header fixincludes change to unilaterally
define STDC Macros, the GCC testsuite uses of inttypes now fails.
This patch explicitly defines the _STD_TYPES_T macro when the test is
run on AIX so that the inttypes.h header behaves as the testcase requires.
gcc/testsuite/ChangeLog:
2020-09-29 David Edelsohn <dje.gcc@gmail.com>
* g++.dg/spellcheck-inttypes.C: Define _STD_TYPES_T on AIX.
* gcc.dg/spellcheck-inttypes.c: Same.
This simplification removes some unneeded behaviour in
set_identifier_type_value_with_scope, which was updating the namespace
binding. And causing update_binding to have to deal with meeting two
implicit typedefs. But the typedef is already there, and there's no
other way to have two such typedef's collide (we'll already have dealt
with that in lookup_elaborated_type).
So, let's kill this crufty code.
gcc/cp/
* name-lookup.c (update_binding): We never meet two implicit
typedefs.
(do_pushdecl): Adjust set_identifier_type_value_with_scope calls.
(set_identifier_type_value_with_scope): Do not update binding in
the namespace-case. Assert it is already there.
The following moves an ad-hoc attempt at discovering the SLP node
for a stmt to the place where we can find it in lock-step when
we find the stmt itself.
2020-09-29 Richard Biener <rguenther@suse.de>
PR tree-optimization/97241
* tree-vect-loop.c (vectorizable_reduction): Move finding
the SLP node for the reduction stmt to a better place.
* gcc.dg/vect/pr97241.c: New testcase.
This fixes a typo causing a NULL dereference.
2020-09-29 Richard Biener <rguenther@suse.de>
PR tree-optimization/97238
* tree-ssa-reassoc.c (ovce_extract_ops): Fix typo.
* gcc.dg/pr97238.c: New testcase.
Both GCN and NVPTX allow nested parallel regions, but the barrier
implementation did not allow the nested teams to run independently of each
other (due to hardware limitations). This patch fixes that, under the
assumption that each thread will create a new subteam of one thread, by
simply not using barriers when there's no other thread to synchronise.
libgomp/ChangeLog:
* config/gcn/bar.c (gomp_barrier_wait_end): Skip the barrier if the
total number of threads is one.
(gomp_team_barrier_wake): Likewise.
(gomp_team_barrier_wait_end): Likewise.
(gomp_team_barrier_wait_cancel_end): Likewise.
* config/nvptx/bar.c (gomp_barrier_wait_end): Likewise.
(gomp_team_barrier_wake): Likewise.
(gomp_team_barrier_wait_end): Likewise.
(gomp_team_barrier_wait_cancel_end): Likewise.
* testsuite/libgomp.c-c++-common/nested-parallel-unbalanced.c: New test.
The AArch32 port now has three vector extensions: iwMMXt, Neon
and MVE. We already have some named expanders that are shared
by all three, and soon we'll need more.
One way of handling this would be to use define_mode_iterators
that specify the condition for each mode. For example,
(V16QI "TARGET_NEON || TARGET_HAVE_MVE")
(V8QI "TARGET_NEON || TARGET_REALLY_IWMXXT")
...
(V2SF "TARGET_NEON && flag_unsafe_math_optimizations")
etc. However, we'll need several mode iterators, and it would
be repetitive to specify the mode condition every time.
This patch therefore introduces per-mode macros that say whether
we can perform general arithmetic on the mode. Initially there are
two sets of macros:
ARM_HAVE_NEON_<MODE>_ARITH
true if Neon can handle general arithmetic on <MODE>
ARM_HAVE_<MODE>_ARITH
true if any vector extension can handle general arithmetic on <MODE>
The macro definitions themselves are undeniably ugly, but hopefully
they're justified by the simplifications they allow.
The patch converts the addition patterns to use this scheme.
Previously there were three copies of the V8HF and V4HF addition
patterns for Neon:
(1) *add<VDQ:mode>3_neon, which provided plus:VnHF even without
TARGET_NEON_FP16INST. This was probably harmless since all the
named patterns had an appropriate guard, but it is possible that
something could have tried to generate the plus directly, such as
by using a REG_EQUAL note to generate a new pattern.
(2) addv8hf3_neon and addv4hf3, which had the correct
TARGET_NEON_FP16INST target condition, but unnecessarily required
flag_unsafe_math_optimizations. Unlike VnSF operations, VnHF
operations do not force flush to zero.
(3) add<VH:mode>3_fp16, which provided plus:VnHF with the
correct conditions (TARGET_NEON_FP16INST, with no
flag_unsafe_math_optimizations test).
The patch in essence renames add<VH:mode>3_fp16 to *add<VH:mode>3_neon
(part of *add<VDQ:mode>3_neon) and removes the other two patterns.
gcc/
* config/arm/arm.h (ARM_HAVE_NEON_V8QI_ARITH, ARM_HAVE_NEON_V4HI_ARITH)
(ARM_HAVE_NEON_V2SI_ARITH, ARM_HAVE_NEON_V16QI_ARITH): New macros.
(ARM_HAVE_NEON_V8HI_ARITH, ARM_HAVE_NEON_V4SI_ARITH): Likewise.
(ARM_HAVE_NEON_V2DI_ARITH, ARM_HAVE_NEON_V4HF_ARITH): Likewise.
(ARM_HAVE_NEON_V8HF_ARITH, ARM_HAVE_NEON_V2SF_ARITH): Likewise.
(ARM_HAVE_NEON_V4SF_ARITH, ARM_HAVE_V8QI_ARITH, ARM_HAVE_V4HI_ARITH)
(ARM_HAVE_V2SI_ARITH, ARM_HAVE_V16QI_ARITH, ARM_HAVE_V8HI_ARITH)
(ARM_HAVE_V4SI_ARITH, ARM_HAVE_V2DI_ARITH, ARM_HAVE_V4HF_ARITH)
(ARM_HAVE_V2SF_ARITH, ARM_HAVE_V8HF_ARITH, ARM_HAVE_V4SF_ARITH):
Likewise.
* config/arm/iterators.md (VNIM, VNINOTM): Delete.
* config/arm/vec-common.md (add<VNIM:mode>3, addv8hf3)
(add<VNINOTM:mode>3): Replace with...
(add<VDQ:mode>3): ...this new expander.
* config/arm/neon.md (*add<VDQ:mode>3_neon): Use the new
ARM_HAVE_NEON_<MODE>_ARITH macros as the C condition.
(addv8hf3_neon, addv4hf3, add<VFH:mode>3_fp16): Delete in
favor of the above.
(neon_vadd<VH:mode>): Use gen_add<mode>3 instead of
gen_add<mode>3_fp16.
gcc/testsuite/
* gcc.target/arm/armv8_2-fp16-arith-2.c: Expect FP16 vectorization
even without -ffast-math.
- According the conclusion in RISC-V C API document, we decide to deprecat
the __riscv_cmodel_pic marco
- __riscv_cmodel_pic is deprecated and will removed in next GCC
release.
[1] https://github.com/riscv/riscv-c-api-doc/pull/11
gcc/ChangeLog:
* config/riscv/riscv-c.c (riscv_cpu_cpp_builtins): Define
__riscv_cmodel_medany when PIC mode.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/predef-3.c: Update testcase.
* gcc.target/riscv/predef-6.c: Ditto.
This patch moves the entry for Neoverse N2 (an Armv8.5-A CPU) after
Saphira (an Armv8.4-A CPU) to preserve the overall ordering in the file.
Committing as obvious.
gcc/ChangeLog:
* config/aarch64/aarch64-cores.def: Move neoverse-n2 after saphira.
* config/aarch64/aarch64-tune.md: Regenerate.
gcc/ChangeLog:
PR tree-optimization/96979
* tree-switch-conversion.c (jump_table_cluster::can_be_handled):
Make a fast bail out.
(bit_test_cluster::can_be_handled): Likewise here.
* tree-switch-conversion.h (get_range): Use wi::to_wide instead
of a folding.
gcc/testsuite/ChangeLog:
PR tree-optimization/96979
* g++.dg/tree-ssa/pr96979.C: New test.
symver1.c only is valid on ELF targets. Add AIX to the skip list.
gcc/testsuite/ChangeLog
2020-09-28 David Edelsohn <dje.gcc@gmail.com>
* gcc.dg/ipa/symver1.c: Skip on AIX.
Use `-fasynchronous-unwind-tables' rather than `-fexceptions
-fnon-call-exceptions' in LIB2_DIVMOD_FUNCS compilation flags so as to
provide unwind tables for the affected functions while not pulling the
unwinder proper, which is not required here.
Beyond saving program space it fixes a RISC-V glibc build error due to
unsatisfied `malloc' and `free' references from the unwinder causing
link errors with `ld.so' where libgcc has been built at -O0.
libgcc/
* config/riscv/t-elf (LIB2_DIVMOD_EXCEPTION_FLAGS): New
variable.
I added this field (and the struct itself) in the rewrite of region and
value-handling (808f4dfeb3), but the field
was never used.
Found by cppcheck.
gcc/analyzer/ChangeLog:
* diagnostic-manager.cc (null_assignment_sm_context::m_visitor):
Remove unused field.
gcc/analyzer/ChangeLog:
PR analyzer/97233
* analyzer.cc (is_longjmp_call_p): Require the initial argument
to be a pointer.
* engine.cc (exploded_node::on_longjmp): Likewise.
gcc/testsuite/ChangeLog:
PR analyzer/97233
* gcc.dg/analyzer/pr97233.c: New test.
In 10fc42a839 I converted state_t from
unsigned to const state *, but missed this comparison against 0.
gcc/analyzer/ChangeLog:
* program-state.cc (sm_state_map::print): Update check
for m_global_state being the start state.