Another patch in the series to make the SVE FP patterns use unspecs,
so that they can accurately describe cases in which the predicate
isn't a PTRUE.
2019-08-14 Richard Sandiford <richard.sandiford@arm.com>
Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
gcc/
* config/aarch64/aarch64-sve.md (add<mode>3, *add<mode>3)
(sub<mode>3, *sub<mode>3, *fabd<mode>3, mul<mode>3, *mul<mode>3)
(div<mode>3, *div<mode>3): Use SVE_COND_FP_* unspecs instead of
rtx codes.
(cond_<optab><mode>, *cond_<optab><mode>_2, *cond_<optab><mode>_3)
(*cond_<optab><mode>_any): Add the predicate to the SVE_COND_FP_*
unspecs.
Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org>
From-SVN: r274417
This patch generalises the SVE BIC pattern so that it doesn't
rely on REG_EQUAL notes. The danger with relying on the notes
is that an optimisation could for example replace the original
(not ...) note with an (unspec ... UNSPEC_MERGE_PTRUE) in which
the predicate is a constant. That's a legitimate change and
could even be useful in some situations.
The patch also makes the operand order match the SVE operand order in
both the vector and predicate BIC patterns, which makes things easier
for the ACLE.
2019-08-14 Richard Sandiford <richard.sandiford@arm.com>
Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
gcc/
* config/aarch64/aarch64-sve.md (bic<mode>3): Rename to...
(*bic<SVE_I:mode>3): ...this. Match the form that an SVE inverse
actually has, rather than relying on REG_EQUAL notes.
Make the insn operand order match the SVE operand order.
(*<nlogical><PRED_ALL:mode>3): Make the insn operand order match
the SVE operand order.
Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org>
From-SVN: r274416
This patch makes sure that we build all SVE predicate constants as
VNx16BI before RA, to encourage similar constants to be reused
between modes. This is also useful for the ACLE, where the single
predicate type svbool_t is always a VNx16BI.
Also, and again to encourage reuse, the patch makes us use a .B PTRUE
for all ptrue-predicated operations, rather than (for example) using
a .S PTRUE for 32-bit operations and a .D PTRUE for 64-bit operations.
The only current case in which a .H, .S or .D operation needs to be
predicated by a "strict" .H/.S/.D PTRUE is the PTEST in a conditional
branch, which an earlier patch fixed to use an appropriate VNx16BI
constant.
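The reuse argument above can be illustrated with a small model. This is only a sketch, assuming a 128-bit vector (so 16 predicate bits, one per vector byte) and assuming that only the lowest predicate bit of each element is significant for a predicated operation. The names demo_ptrue and demo_all_true_for are hypothetical and are not part of the patch. The "strict" PTEST case mentioned above is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model of a predicate for a 128-bit vector: one bit per vector byte,
   so 16 bits, matching the VNx16BI view.  Purely illustrative.  */
#define DEMO_PRED_BITS 16u

/* PTRUE for elements of ELT_BYTES bytes (1, 2, 4 or 8): set the lowest
   bit of each element's group of predicate bits.  */
static uint16_t demo_ptrue(unsigned elt_bytes)
{
    assert(elt_bytes == 1 || elt_bytes == 2 || elt_bytes == 4 || elt_bytes == 8);
    uint16_t pred = 0;
    for (unsigned bit = 0; bit < DEMO_PRED_BITS; bit += elt_bytes)
        pred |= (uint16_t)(1u << bit);
    return pred;
}

/* True if every bit that is significant for ELT_BYTES-sized elements
   is set in PRED.  Insignificant bits are ignored.  */
static bool demo_all_true_for(uint16_t pred, unsigned elt_bytes)
{
    uint16_t significant = demo_ptrue(elt_bytes);
    return (pred & significant) == significant;
}
```

In this model a .B PTRUE (demo_ptrue(1)) is all-true for every element size, whereas a .S PTRUE (demo_ptrue(4)) is not all-true for .H or .B operations, which is why a single .B constant is the easiest one to share.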
2019-08-14 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* config/aarch64/aarch64.c (aarch64_target_reg): New function.
(aarch64_emit_set_immediate): Likewise.
(aarch64_ptrue_reg): Build a VNx16BI constant and then bitcast it.
(aarch64_pfalse_reg): Likewise.
(aarch64_convert_sve_data_to_pred): New function.
(aarch64_sve_move_pred_via_while): Take an optional target register
and the required register mode.
(aarch64_expand_sve_const_pred_1): New function.
(aarch64_expand_sve_const_pred): Likewise.
(aarch64_expand_mov_immediate): Build an all-true predicate
if the significant bits of the immediate are all true. Use
aarch64_expand_sve_const_pred for all compile-time predicate constants.
(aarch64_mov_operand_p): Force predicate constants to be VNx16BI
before register allocation.
* config/aarch64/aarch64-sve.md (*vec_duplicate<mode>_reg): Use
a VNx16BI PTRUE when splitting the memory alternative.
(vec_duplicate<mode>): Update accordingly.
(*pred_cmp<cmp_op><mode>): Rename to...
(@aarch64_pred_cmp<cmp_op><mode>): ...this.
gcc/testsuite/
* gcc.target/aarch64/sve/spill_4.c: Expect all ptrues to be .Bs.
* gcc.target/aarch64/sve/single_1.c: Likewise.
* gcc.target/aarch64/sve/single_2.c: Likewise.
* gcc.target/aarch64/sve/single_3.c: Likewise.
* gcc.target/aarch64/sve/single_4.c: Likewise.
From-SVN: r274415
This patch reworks the rtl representation of the SVE PTEST operation
so that:
- the governing predicate is always VNx16BI (and so all bits are defined)
- it is still possible to pattern-match the governing predicate in the
mode that it had previously
- a new hint operand says whether the governing predicate is known to be
all true for the element size of interest, rather than this being part
of the unspec name.
These changes make it easier to handle more flag-setting instructions
as part of the ACLE work.
See the comment in aarch64-sve.md for more details.
2019-08-14 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* config/aarch64/aarch64-protos.h (aarch64_ptrue_all): Declare.
* config/aarch64/aarch64.c (aarch64_ptrue_all): New function.
* config/aarch64/aarch64.md (UNSPEC_PTEST_PTRUE): Delete.
(UNSPEC_PTEST): New unspec.
(SVE_MAYBE_NOT_PTRUE, SVE_KNOWN_PTRUE): New constants.
* config/aarch64/iterators.md (data_bytes): New mode attribute.
* config/aarch64/predicates.md (aarch64_sve_ptrue_flag): New predicate.
* config/aarch64/aarch64-sve.md: Add a new section describing the
handling of UNSPEC_PTEST.
(pred_<LOGICAL:optab><PRED_ALL:mode>3): Rename to...
(@aarch64_pred_<LOGICAL:optab><PRED_ALL:mode>_z): ...this.
(ptest_ptrue<mode>): Replace with...
(aarch64_ptest<mode>): ...this new pattern.
(cbranch<mode>4): Update after above changes.
(*<LOGICAL:optab><PRED_ALL:mode>3_cc): Use UNSPEC_PTEST instead of
UNSPEC_PTEST_PTRUE.
(*cmp<SVE_INT_CMP:cmp_op><SVE_I:mode>_cc): Likewise.
(*cmp<SVE_INT_CMP:cmp_op><SVE_I:mode>_ptest): Likewise.
(*while_ult<GPI:mode><PRED_ALL:mode>_cc): Likewise.
From-SVN: r274414
2019-08-13 Steven G. Kargl <kargl@gcc.gnu.org>
PR fortran/87991
* resolve.c (check_data_variable): data-stmt-object with pointer
attribute requires a data-stmt-value with the target attribute.
2019-08-13 Steven G. Kargl <kargl@gcc.gnu.org>
PR fortran/87991
* gfortran.dg/pr87991.f90: New test.
From-SVN: r274412
In LTO mode, if a static library and a dynamic library contain the same
function and both libraries are passed as arguments, the linker will link
the function from the dynamic library regardless of their order. This patch
outputs the LTO symbol node as UNDEF if the BUILT_IN_NORMAL function FNDECL
is a math function, so that the function in the static library is linked
first if it is listed ahead of the dynamic library.
gcc/ChangeLog
2019-08-14 Xiong Hu Luo <luoxhu@linux.ibm.com>
PR lto/91287
* builtins.c (builtin_with_linkage_p): New function.
* builtins.h (builtin_with_linkage_p): New function.
* symtab.c (write_symbol): Remove redundant assert.
* lto-streamer-out.c (symtab_node::output_to_lto_symbol_table_p):
Remove FIXME and use builtin_with_linkage_p.
From-SVN: r274411
We were shoe-horning all built-in enumerations (including frontend
and target-specific ones) into a field of type built_in_function. This
was accessed as either an lvalue or an rvalue using DECL_FUNCTION_CODE.
The obvious danger with this (as was noted by several ??? comments)
is that the ranges have nothing to do with each other, and targets can
easily have more built-in functions than generic code. But my patch to
make the field bigger was the straw that finally made the problem visible.
This patch therefore:
- replaces the field with a plain unsigned int
- turns DECL_FUNCTION_CODE into an rvalue-only accessor that checks
that the function really is BUILT_IN_NORMAL
- adds corresponding DECL_MD_FUNCTION_CODE and DECL_FE_FUNCTION_CODE
accessors for BUILT_IN_MD and BUILT_IN_FRONTEND respectively
- adds DECL_UNCHECKED_FUNCTION_CODE for places that need to access the
underlying field (should be low-level code only)
- adds new helpers for setting the built-in class and function code
- makes DECL_BUILT_IN_CLASS an rvalue-only accessor too, since all
assignments should go through the new helpers
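The accessor scheme listed above can be sketched roughly as follows. This is a simplified model and not the real tree.h code: the demo_ types and functions are hypothetical stand-ins for function_decl, set_decl_built_in_function, DECL_UNCHECKED_FUNCTION_CODE, DECL_FUNCTION_CODE and DECL_MD_FUNCTION_CODE.

```c
#include <assert.h>

/* Hypothetical stand-ins for GCC's types; not the real definitions.  */
enum demo_built_in_class {
    DEMO_NOT_BUILT_IN,
    DEMO_BUILT_IN_FRONTEND,
    DEMO_BUILT_IN_MD,
    DEMO_BUILT_IN_NORMAL
};

struct demo_function_decl {
    enum demo_built_in_class built_in_class;
    unsigned int function_code;  /* plain integer shared by all classes */
};

/* Single helper through which the class and code are assigned.  */
static void demo_set_built_in(struct demo_function_decl *decl,
                              enum demo_built_in_class cls,
                              unsigned int code)
{
    decl->built_in_class = cls;
    decl->function_code = code;
}

/* Unchecked access to the raw field, for low-level code only.  */
static unsigned int demo_unchecked_code(const struct demo_function_decl *decl)
{
    return decl->function_code;
}

/* Checked rvalue accessor for generic (BUILT_IN_NORMAL) built-ins.  */
static unsigned int demo_normal_code(const struct demo_function_decl *decl)
{
    assert(decl->built_in_class == DEMO_BUILT_IN_NORMAL);
    return decl->function_code;
}

/* Checked rvalue accessor for target-specific (BUILT_IN_MD) built-ins.  */
static unsigned int demo_md_code(const struct demo_function_decl *decl)
{
    assert(decl->built_in_class == DEMO_BUILT_IN_MD);
    return decl->function_code;
}
```

Because the checked accessors return by value, they cannot be used as assignment targets, and reading a target built-in's code through the generic accessor trips the assertion instead of silently comparing unrelated ranges.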
2019-08-13 Richard Sandiford <richard.sandiford@arm.com>
gcc/
PR middle-end/91421
* tree-core.h (function_decl::function_code): Change type to
unsigned int.
* tree.h (DECL_FUNCTION_CODE): Rename old definition to...
(DECL_UNCHECKED_FUNCTION_CODE): ...this.
(DECL_BUILT_IN_CLASS): Make an rvalue macro only.
(DECL_FUNCTION_CODE): New function. Assert that the built-in class
is BUILT_IN_NORMAL.
(DECL_MD_FUNCTION_CODE, DECL_FE_FUNCTION_CODE): New functions.
(set_decl_built_in_function, copy_decl_built_in_function): Likewise.
(fndecl_built_in_p): Change the type of the "name" argument to
unsigned int.
* builtins.c (expand_builtin): Move DECL_FUNCTION_CODE use
after check for DECL_BUILT_IN_CLASS.
* cgraphclones.c (build_function_decl_skip_args): Use
set_decl_built_in_function.
* ipa-param-manipulation.c (ipa_modify_formal_parameters): Likewise.
* ipa-split.c (split_function): Likewise.
* langhooks.c (add_builtin_function_common): Likewise.
* omp-simd-clone.c (simd_clone_create): Likewise.
* tree-streamer-in.c (unpack_ts_function_decl_value_fields): Likewise.
* config/darwin.c (darwin_init_cfstring_builtins): Likewise.
(darwin_fold_builtin): Use DECL_MD_FUNCTION_CODE instead of
DECL_FUNCTION_CODE.
* fold-const.c (operand_equal_p): Compare DECL_UNCHECKED_FUNCTION_CODE
instead of DECL_FUNCTION_CODE.
* lto-streamer-out.c (hash_tree): Use DECL_UNCHECKED_FUNCTION_CODE
instead of DECL_FUNCTION_CODE.
* tree-streamer-out.c (pack_ts_function_decl_value_fields): Likewise.
* print-tree.c (print_node): Use DECL_MD_FUNCTION_CODE when
printing DECL_BUILT_IN_MD. Handle DECL_BUILT_IN_FRONTEND.
* config/aarch64/aarch64-builtins.c (aarch64_expand_builtin)
(aarch64_fold_builtin, aarch64_gimple_fold_builtin): Use
DECL_MD_FUNCTION_CODE instead of DECL_FUNCTION_CODE.
* config/aarch64/aarch64.c (aarch64_builtin_reciprocal): Likewise.
* config/alpha/alpha.c (alpha_expand_builtin, alpha_fold_builtin)
(alpha_gimple_fold_builtin): Likewise.
* config/arc/arc.c (arc_expand_builtin): Likewise.
* config/arm/arm-builtins.c (arm_expand_builtin): Likewise.
* config/avr/avr-c.c (avr_resolve_overloaded_builtin): Likewise.
* config/avr/avr.c (avr_expand_builtin, avr_fold_builtin): Likewise.
* config/bfin/bfin.c (bfin_expand_builtin): Likewise.
* config/c6x/c6x.c (c6x_expand_builtin): Likewise.
* config/frv/frv.c (frv_expand_builtin): Likewise.
* config/gcn/gcn.c (gcn_expand_builtin_1): Likewise.
(gcn_expand_builtin): Likewise.
* config/i386/i386-builtins.c (ix86_builtin_reciprocal): Likewise.
(fold_builtin_cpu): Likewise.
* config/i386/i386-expand.c (ix86_expand_builtin): Likewise.
* config/i386/i386.c (ix86_fold_builtin): Likewise.
(ix86_gimple_fold_builtin): Likewise.
* config/ia64/ia64.c (ia64_fold_builtin): Likewise.
(ia64_expand_builtin): Likewise.
* config/iq2000/iq2000.c (iq2000_expand_builtin): Likewise.
* config/mips/mips.c (mips_expand_builtin): Likewise.
* config/msp430/msp430.c (msp430_expand_builtin): Likewise.
* config/nds32/nds32-intrinsic.c (nds32_expand_builtin_impl): Likewise.
* config/nios2/nios2.c (nios2_expand_builtin): Likewise.
* config/nvptx/nvptx.c (nvptx_expand_builtin): Likewise.
* config/pa/pa.c (pa_expand_builtin): Likewise.
* config/pru/pru.c (pru_expand_builtin): Likewise.
* config/riscv/riscv-builtins.c (riscv_expand_builtin): Likewise.
* config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin):
Likewise.
* config/rs6000/rs6000-call.c (htm_expand_builtin): Likewise.
(altivec_expand_dst_builtin, altivec_expand_builtin): Likewise.
(rs6000_gimple_fold_builtin, rs6000_expand_builtin): Likewise.
* config/rs6000/rs6000.c (rs6000_builtin_md_vectorized_function)
(rs6000_builtin_reciprocal): Likewise.
* config/rx/rx.c (rx_expand_builtin): Likewise.
* config/s390/s390-c.c (s390_resolve_overloaded_builtin): Likewise.
* config/s390/s390.c (s390_expand_builtin): Likewise.
* config/sh/sh.c (sh_expand_builtin): Likewise.
* config/sparc/sparc.c (sparc_expand_builtin): Likewise.
(sparc_fold_builtin): Likewise.
* config/spu/spu-c.c (spu_resolve_overloaded_builtin): Likewise.
* config/spu/spu.c (spu_expand_builtin): Likewise.
* config/stormy16/stormy16.c (xstormy16_expand_builtin): Likewise.
* config/tilegx/tilegx.c (tilegx_expand_builtin): Likewise.
* config/tilepro/tilepro.c (tilepro_expand_builtin): Likewise.
* config/xtensa/xtensa.c (xtensa_fold_builtin): Likewise.
(xtensa_expand_builtin): Likewise.
gcc/ada/
PR middle-end/91421
* gcc-interface/trans.c (gigi): Call set_decl_built_in_function.
(Call_to_gnu): Use DECL_FE_FUNCTION_CODE instead of DECL_FUNCTION_CODE.
gcc/c/
PR middle-end/91421
* c-decl.c (merge_decls): Use copy_decl_built_in_function.
gcc/c-family/
PR middle-end/91421
* c-common.c (resolve_overloaded_builtin): Use
copy_decl_built_in_function.
gcc/cp/
PR middle-end/91421
* decl.c (duplicate_decls): Use copy_decl_built_in_function.
* pt.c (declare_integer_pack): Use set_decl_built_in_function.
gcc/d/
PR middle-end/91421
* intrinsics.cc (maybe_set_intrinsic): Use set_decl_built_in_function.
gcc/jit/
PR middle-end/91421
* jit-playback.c (new_function): Use set_decl_built_in_function.
gcc/lto/
PR middle-end/91421
* lto-common.c (compare_tree_sccs_1): Use DECL_UNCHECKED_FUNCTION_CODE
instead of DECL_FUNCTION_CODE.
* lto-symtab.c (lto_symtab_merge_p): Likewise.
From-SVN: r274404
This patch protects various uses of DECL_FUNCTION_CODE that didn't
obviously check for BUILT_IN_NORMAL first (either directly or in callers).
They could therefore trigger for functions that either aren't built-ins
or are a different kind of built-in.
Also, the patch removes a redundant GIMPLE_CALL check from
optimize_stdarg_builtin, since it gave the impression that the stmt
was less well checked than it actually is.
2019-08-13 Richard Sandiford <richard.sandiford@arm.com>
gcc/
PR middle-end/91421
* attribs.c (decl_attributes): Check the DECL_BUILT_IN_CLASS
before the DECL_FUNCTION_CODE.
* calls.c (maybe_warn_alloc_args_overflow): Use fndecl_built_in_p
to check for a BUILT_IN_ALLOCA call.
* ipa-cp.c (ipa_get_indirect_edge_target_1): Likewise for
BUILT_IN_UNREACHABLE. Don't check for a FUNCTION_TYPE.
* ipa-devirt.c (possible_polymorphic_call_target_p): Likewise.
* ipa-prop.c (try_make_edge_direct_virtual_call): Likewise.
* gimple-ssa-isolate-paths.c (is_addr_local): Check specifically
for BUILT_IN_NORMAL functions.
* trans-mem.c (expand_block_edges): Use gimple_call_builtin_p to
test for BUILT_IN_TM_ABORT.
* tree-ssa-ccp.c (optimize_stack_restore): Use fndecl_built_in_p
to check for a BUILT_IN_STACK_RESTORE call.
(optimize_stdarg_builtin): Remove redundant check for GIMPLE_CALL.
* tree-ssa-threadedge.c
(record_temporary_equivalences_from_stmts_at_dest): Check for a
BUILT_IN_NORMAL decl before checking its DECL_FUNCTION_CODE.
* tree-vect-patterns.c (vect_recog_pow_pattern): Use a positive
test for a BUILT_IN_NORMAL call instead of a negative test for
an internal function call.
gcc/c/
PR middle-end/91421
* c-decl.c (header_for_builtin_fn): Take a FUNCTION_DECL instead
of a built_in_function.
(diagnose_mismatched_decls, implicitly_declare): Update accordingly.
From-SVN: r274403
This patch is a combination of two changes that have to be
committed as a single unit:
(1) Try to fold IFN_WHILE_ULTs with constant arguments to a VECTOR_CST
(which is always possible for fixed-length vectors but is not
necessarily so for variable-length vectors)
(2) Make the SVE port recognise constants that map to PTRUE VLn,
which includes those generated by the new fold.
(2) can't be tested without (1) and (1) would be a significant
pessimisation without (2).
The target-specific parts also start moving towards doing predicate
manipulation in a canonical VNx16BImode form, using rtx_vector_builders.
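For the fixed-length case, change (1) amounts to computing a mask of leading ones followed by zeros. The following is a minimal sketch of that idea only, assuming that lane i of WHILE_ULT (start, limit) is active when start + i < limit as an unsigned comparison. demo_fold_while_ult is a hypothetical name, not the fold_while_ult added by the patch, and the variable-length case is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Fill MASK[0..NLANES) with the constant result of a WHILE_ULT whose
   arguments START and LIMIT are known: lane I is true iff
   START + I < LIMIT.  The result is always some number of leading
   trues followed by falses.  Returns the number of active lanes.  */
static size_t demo_fold_while_ult(uint64_t start, uint64_t limit,
                                  bool *mask, size_t nlanes)
{
    assert(mask != NULL || nlanes == 0);

    size_t active = 0;
    if (start < limit) {
        /* Work with the difference so that START + I cannot wrap.  */
        uint64_t remaining = limit - start;
        active = remaining < nlanes ? (size_t) remaining : nlanes;
    }
    for (size_t i = 0; i < nlanes; i++)
        mask[i] = i < active;
    return active;
}
```

With four lanes, arguments (0, 3) give three active lanes followed by one inactive lane, which is the kind of "N ones followed by M zeros" constant that change (2) then has to recognise on the target side.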
2019-08-13 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree.h (build_vector_a_then_b): Declare.
* tree.c (build_vector_a_then_b): New function.
* fold-const-call.c (fold_while_ult): Likewise.
(fold_const_call): Use it to handle IFN_WHILE_ULT.
* config/aarch64/aarch64-protos.h (AARCH64_FOR_SVPATTERN): New macro.
(aarch64_svpattern): New enum.
* config/aarch64/aarch64-sve.md (mov<PRED_ALL:mode>): Pass
constants through aarch64_expand_mov_immediate.
(*aarch64_sve_mov<PRED_ALL:mode>): Use aarch64_mov_operand rather
than general_operand as the predicate for operand 1.
(while_ult<GPI:mode><PRED_ALL:mode>): Add a '@' marker.
* config/aarch64/aarch64.c (simd_immediate_info::PTRUE): New
insn_type.
(simd_immediate_info::simd_immediate_info): New overload that
takes a scalar_int_mode and an svpattern.
(simd_immediate_info::u): Add a "pattern" field.
(svpattern_token): New function.
(aarch64_get_sve_pred_bits, aarch64_widest_sve_pred_elt_size)
(aarch64_partial_ptrue_length, aarch64_svpattern_for_vl)
(aarch64_sve_move_pred_via_while): New functions.
(aarch64_expand_mov_immediate): Try using
aarch64_sve_move_pred_via_while for predicates that contain N ones
followed by M zeros but that do not correspond to a VLnnn pattern.
(aarch64_sve_pred_valid_immediate): New function.
(aarch64_simd_valid_immediate): Use it instead of dealing directly
with PTRUE and PFALSE.
(aarch64_output_sve_mov_immediate): Handle new simd_immediate_info
forms.
gcc/testsuite/
* gcc.target/aarch64/sve/spill_2.c: Increase iteration counts
beyond the range of a PTRUE.
* gcc.target/aarch64/sve/while_6.c: New test.
* gcc.target/aarch64/sve/while_7.c: Likewise.
* gcc.target/aarch64/sve/while_8.c: Likewise.
* gcc.target/aarch64/sve/while_9.c: Likewise.
* gcc.target/aarch64/sve/while_10.c: Likewise.
From-SVN: r274402
2019-08-13 Steven G. Kargl <kargl@gcc.gnu.org>
PR fortran/88072
* gfortran.dg/unlimited_polymorphic_28.f90: Fix error message. Left
out of previous commit!
From-SVN: r274400
2019-08-13 Steven G. Kargl <kargl@gcc.gnu.org>
PR fortran/88072
* misc.c (gfc_typename): Do not point to something that ought not to
be pointed at.
2019-08-13 Steven G. Kargl <kargl@gcc.gnu.org>
PR fortran/88072
* gfortran.dg/pr88072.f90: New test.
* gfortran.dg/unlimited_polymorphic_28.f90: Fix error message.
From-SVN: r274399
So we can use a single flag for both, and rename this now, before a confusing
name gets into the wild.
gcc/
2019-08-13 Iain Sandoe <iain@sandoe.co.uk>
* config/darwin.c (machopic_indirect_call_target): Rename symbol stub
flag.
(darwin_override_options): Likewise.
* config/darwin.h: Likewise.
* config/darwin.opt: Likewise.
* config/i386/i386.c (output_pic_addr_const): Likewise.
* config/rs6000/darwin.h: Likewise.
* config/rs6000/rs6000.c (rs6000_call_darwin_1): Likewise.
* config/i386/darwin.h (TARGET_MACHO_PICSYM_STUBS): Rename to ...
... this TARGET_MACHO_SYMBOL_STUBS.
(FUNCTION_PROFILER): Likewise.
* config/i386/i386.h: Likewise.
gcc/testsuite/
2019-08-13 Iain Sandoe <iain@sandoe.co.uk>
* obj-c++.dg/stubify-1.mm: Rename symbol stub option.
* obj-c++.dg/stubify-2.mm: Likewise.
* objc.dg/stubify-1.m: Likewise.
* objc.dg/stubify-2.m: Likewise.
From-SVN: r274397
2019-08-13 Steven G. Kargl <kargl@gcc.gnu.org>
PR fortran/89647
* resolve.c (resolve_typebound_procedure): Allow host associated
procedure to be a binding target. While here, wrap long line.
2019-08-13 Steven G. Kargl <kargl@gcc.gnu.org>
PR fortran/89647
* gfortran.dg/pr89647.f90: New test.
From-SVN: r274393
2019-08-13 Steven G. Kargl <kargl@gcc.gnu.org>
PR fortran/87993
* expr.c (gfc_simplify_expr): Simplification of an array with a kind
type inquiry suffix yields a constant expression.
2019-08-13 Steven G. Kargl <kargl@gcc.gnu.org>
PR fortran/87993
* gfortran.dg/pr87993.f90: New test.
From-SVN: r274388
Fix a bug where linking with -fvtable-verify and
-static causes the linker to complain about multiple definitions of
things in the vtv_end*.o files (once from the .o file and once from
libvtv.a).
2019-08-12 Caroline Tice <cmtice@google.com>
PR other/91396
* config/gnu-user.h (GNU_USER_TARGET_ENDFILE_SPEC): Only add the
vtv_end.o or vtv_end_preinit.o files if !static.
From-SVN: r274386
2019-08-13 Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/90561
* trans.h (gfc_evaluate_now_function_scope): New function.
* trans.c (gfc_evaluate_now_function_scope): New function.
* trans-expr.c (gfc_trans_assignment): Use it.
2019-08-13 Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/90561
* gfortran.dg/deferred_character_34.f90: New test.
From-SVN: r274383
* call.c (null_ptr_cst_p): Update quote from the standard.
* decl.c (check_default_argument): Don't return nullptr when the arg
has side-effects.
* g++.dg/cpp0x/nullptr42.C: New test.
From-SVN: r274382
* rtlanal.c (tablejump_casesi_pattern): New function, to
determine if a tablejump insn is a casesi dispatcher. Extracted
from patch_jump_insn.
* rtl.h (tablejump_casesi_pattern): Declare.
* cfgrtl.c (patch_jump_insn): Use it.
* dwarf2cfi.c (create_trace_edges): Use it.
testsuite/
* gnat.dg/casesi.ad[bs], test_casesi.adb: New test.
From-SVN: r274377
PR81800 is about the lrint inline giving spurious FE_INEXACT exceptions.
The previous change for PR81800 didn't fix this: when lrint is disabled
in the backend, the midend will simply use llrint. This actually makes
things worse since llrint now also ignores FE_INVALID exceptions!
The fix is to disable lrint/llrint on double if the size of a long is
smaller (i.e. ilp32).
gcc/
PR target/81800
* gcc/config/aarch64/aarch64.md (lrint): Disable lrint pattern if GPF
operand is larger than a long int.
testsuite/
PR target/81800
* gcc.target/aarch64/no-inline-lrint_3.c: New test.
From-SVN: r274376
If there's no SVE instruction to load a given constant directly, this
patch instead tries to use an Advanced SIMD constant move and then
duplicates the constant to fill an SVE vector. The main use of this
is to support constants in which each byte is in { 0, 0xff }.
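As an illustration of the "each byte is in { 0, 0xff }" property, the check below tests a 64-bit value and packs it into one bit per byte. This is a sketch under stated assumptions rather than the code in aarch64.c: the function names are hypothetical, and the one-bit-per-byte packing simply mirrors the way a byte-mask immediate can be described in eight bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return true if every byte of VALUE is either 0x00 or 0xff.  */
static bool demo_is_byte_mask(uint64_t value)
{
    for (int i = 0; i < 8; i++) {
        uint8_t byte = (uint8_t) (value >> (8 * i));
        if (byte != 0x00 && byte != 0xff)
            return false;
    }
    return true;
}

/* Pack a byte-mask constant into 8 bits: bit I of the result is set
   iff byte I of VALUE (counting from the least significant byte)
   is 0xff.  */
static uint8_t demo_byte_mask_imm8(uint64_t value)
{
    assert(demo_is_byte_mask(value));
    uint8_t imm8 = 0;
    for (int i = 0; i < 8; i++)
        if ((uint8_t) (value >> (8 * i)) == 0xff)
            imm8 |= (uint8_t) (1u << i);
    return imm8;
}
```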
Also, the patch prefers a simple integer move followed by a duplicate
over a load from memory, like we already do for Advanced SIMD. This is
a useful option to have and would be easy to turn off via a tuning
parameter if necessary.
The patch also extends the handling of wide LD1Rs to big endian,
whereas previously we punted to a full LD1RQ.
2019-08-13 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* machmode.h (opt_mode::else_mode): New function.
(opt_mode::else_blk): Use it.
* config/aarch64/aarch64-protos.h (aarch64_vq_mode): Declare.
(aarch64_full_sve_mode, aarch64_sve_ld1rq_operand_p): Likewise.
(aarch64_gen_stepped_int_parallel): Likewise.
(aarch64_stepped_int_parallel_p): Likewise.
(aarch64_expand_mov_immediate): Remove the optional gen_vec_duplicate
argument.
* config/aarch64/aarch64.c
(aarch64_expand_sve_widened_duplicate): Delete.
(aarch64_expand_sve_dupq, aarch64_expand_sve_ld1rq): New functions.
(aarch64_expand_sve_const_vector): Rewrite to handle more cases.
(aarch64_expand_mov_immediate): Remove the optional gen_vec_duplicate
argument. Use early returns in the !CONST_INT_P handling.
Pass all SVE data vectors to aarch64_expand_sve_const_vector rather
than handling some inline.
(aarch64_full_sve_mode, aarch64_vq_mode): New functions, split out
from...
(aarch64_simd_container_mode): ...here.
(aarch64_gen_stepped_int_parallel, aarch64_stepped_int_parallel_p)
(aarch64_sve_ld1rq_operand_p): New functions.
* config/aarch64/predicates.md (descending_int_parallel)
(aarch64_sve_ld1rq_operand): New predicates.
* config/aarch64/constraints.md (UtQ): New constraint.
* config/aarch64/aarch64.md (UNSPEC_REINTERPRET): New unspec.
* config/aarch64/aarch64-sve.md (mov<SVE_ALL:mode>): Remove the
gen_vec_duplicate from call to aarch64_expand_mov_immediate.
(@aarch64_sve_reinterpret<mode>): New expander.
(*aarch64_sve_reinterpret<mode>): New pattern.
(@aarch64_vec_duplicate_vq<mode>_le): New pattern.
(@aarch64_vec_duplicate_vq<mode>_be): Likewise.
(*sve_ld1rq<Vesize>): Replace with...
(@aarch64_sve_ld1rq<mode>): ...this new pattern.
gcc/testsuite/
* gcc.target/aarch64/sve/init_2.c: Expect ld1rd to be used
instead of a full vector load.
* gcc.target/aarch64/sve/init_4.c: Likewise.
* gcc.target/aarch64/sve/ld1r_2.c: Remove constants that no longer
need to be loaded from memory.
* gcc.target/aarch64/sve/slp_2.c: Expect the same output for
big and little endian.
* gcc.target/aarch64/sve/slp_3.c: Likewise. Expect 3 of the
doubles to be moved via integer registers rather than loaded
from memory.
* gcc.target/aarch64/sve/slp_4.c: Likewise but for 4 doubles.
* gcc.target/aarch64/sve/spill_4.c: Expect 16-bit constants to be
loaded via an integer register rather than from memory.
* gcc.target/aarch64/sve/const_1.c: New test.
* gcc.target/aarch64/sve/const_2.c: Likewise.
* gcc.target/aarch64/sve/const_3.c: Likewise.
From-SVN: r274375
With -mcpu=generic the function alignment is currently 8, however almost all
supported cores prefer 16 or higher, so increase the default to 16:12.
This gives ~0.2% performance increase on SPECINT2017, while codesize is 0.12%
larger.
gcc/
* config/aarch64/aarch64.c (generic_tunings): Set function alignment to
16:12.
From-SVN: r274374
This patch makes predicate constants use the normal simd_immediate_info
machinery, rather than treating PFALSE and PTRUE as special cases.
This makes it easier to add other types of predicate constant later.
2019-08-13 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* config/aarch64/aarch64-protos.h (aarch64_output_ptrue): Delete.
* config/aarch64/aarch64-sve.md (*aarch64_sve_mov<PRED_ALL:mode>):
Use a single Dn alternative instead of separate Dz and Dm
alternatives. Use aarch64_output_sve_move_immediate.
* config/aarch64/aarch64.c (aarch64_sve_element_int_mode): New
function.
(aarch64_simd_valid_immediate): Fill in the simd_immediate_info
for predicates too.
(aarch64_output_sve_mov_immediate): Handle predicate modes.
(aarch64_output_ptrue): Delete.
From-SVN: r274372
This patch tweaks the representation of SVE INDEX instructions in
simd_immediate_info so that it's easier to add new types of
constant on top.
2019-08-13 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* config/aarch64/aarch64.c (simd_immediate_info::insn_type): Add
INDEX.
(simd_immediate_info::value, simd_immediate_info::step)
(simd_immediate_info::modifier, simd_immediate_info::shift): Replace
with...
(simd_immediate_info::u): ...this new union.
(simd_immediate_info::simd_immediate_info): Update accordingly.
(aarch64_output_simd_mov_immediate): Likewise.
(aarch64_output_sve_mov_immediate): Likewise.
From-SVN: r274371
aarch64_classify_vector_mode used properties of a mode to test whether
the mode was a single Advanced SIMD vector, a single SVE vector, or a
tuple of SVE vectors. That works well for current trunk and is simpler
than checking for modes by name.
However, for the ACLE and for planned autovec improvements, we also
need partial SVE vector modes that hold:
- half of the available 32-bit elements
- a half or quarter of the available 16-bit elements
- a half, quarter, or eighth of the available 8-bit elements
These should be packed in memory and unpacked in registers. E.g.
VNx2SI has half the number of elements of VNx4SI, and so is half the
size in memory. When stored in registers, each VNx2SI element occupies
the low 32 bits of a VNx2DI element, with the upper bits being undefined.
The upshot is that:
GET_MODE_SIZE (VNx4SImode) == 2 * GET_MODE_SIZE (VNx2SImode)
since GET_MODE_SIZE must always be the memory size. This in turn means
that for fixed-length SVE, some partial modes can have the same size as
Advanced SIMD modes. We then need to be specific about which mode we're
dealing with.
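The size relationship can be checked with simple arithmetic. The sketch below assumes the usual naming convention in which VNx<N> means N elements per 128 bits of vector length, and uses a hypothetical helper, demo_mode_bytes, that returns the memory size of a mode for a vector length of VQ 128-bit quadwords. It models memory sizes only, not the unpacked register layout.

```c
#include <assert.h>

/* Memory size in bytes of a mode that holds ELTS_PER_128 elements of
   ELT_BYTES bytes for each 128 bits of vector length, when the vector
   length is VQ * 128 bits.  Hypothetical helper for illustration.  */
static unsigned demo_mode_bytes(unsigned elts_per_128, unsigned elt_bytes,
                                unsigned vq)
{
    assert(vq >= 1);
    return elts_per_128 * elt_bytes * vq;
}
```

For example, demo_mode_bytes (4, 4, vq) models VNx4SI and demo_mode_bytes (2, 4, vq) models VNx2SI, so the first is always twice the second. With fixed-length 256-bit SVE (vq == 2), the VNx2SI model is 16 bytes, the same size as a 128-bit Advanced SIMD vector, which is the ambiguity described above.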
This patch prepares for that by switching based on the mode instead
of querying properties.
A later patch makes sure that Advanced SIMD modes always win over
partial SVE vector modes in normal queries.
2019-08-13 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* config/aarch64/aarch64.c (aarch64_classify_vector_mode): Switch
based on the mode instead of testing properties of it.
From-SVN: r274368
Some indexed SVE FCMLA operations have a 3-bit register field that
requires one of Z0-Z7. This patch adds a public "y" constraint for that.
The patch also documents "x", which is again intended to be a public
constraint.
2019-08-13 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* doc/md.texi: Document the x and y constraints for AArch64.
* config/aarch64/aarch64.h (FP_LO8_REGNUM_P): New macro.
(FP_LO8_REGS): New reg_class.
(REG_CLASS_NAMES, REG_CLASS_CONTENTS): Add an entry for FP_LO8_REGS.
* config/aarch64/aarch64.c (aarch64_hard_regno_nregs)
(aarch64_regno_regclass, aarch64_class_max_nregs): Handle FP_LO8_REGS.
* config/aarch64/predicates.md (aarch64_simd_register): Use
FP_REGNUM_P instead of checking the classes manually.
* config/aarch64/constraints.md (y): New constraint.
gcc/testsuite/
* gcc.target/aarch64/asm-x-constraint-1.c: New test.
* gcc.target/aarch64/asm-y-constraint-1.c: Likewise.
From-SVN: r274367
The Advanced SIMD and SVE permute patterns both split the permute
operation into a base name and a hilo suffix. That works well, but it
means that for "@" patterns, we need to pass the permute code twice,
once for the base name and once for the suffix.
Having a unified name avoids that and also makes the definitions
slightly simpler.
2019-08-13 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* config/aarch64/iterators.md (perm_insn): Include the "1"/"2" suffix.
(perm_hilo): Remove UNSPEC_ZIP*, UNSPEC_TRN* and UNSPEC_UZP*.
* config/aarch64/aarch64-simd.md
(aarch64_<PERMUTE:perm_insn><PERMUTE:perm_hilo><mode>): Rename to...
(aarch64_<PERMUTE:perm_insn><mode>): ...this and remove perm_hilo
from the asm template.
* config/aarch64/aarch64-sve.md
(aarch64_<perm_insn><perm_hilo><PRED_ALL:mode>): Rename to...
(aarch64_<perm_insn><PRED_ALL:mode>): ...this and remove perm_hilo
from the asm template.
(aarch64_<perm_insn><perm_hilo><SVE_ALL:mode>): Rename to...
(aarch64_<perm_insn><SVE_ALL:mode>): ...this and remove perm_hilo
from the asm template.
* config/aarch64/aarch64-simd-builtins.def: Update comment.
From-SVN: r274366
Update the PRNG from xorshift1024* to xoshiro256** by the same
author. For details see
http://prng.di.unimi.it/
and the paper at
https://arxiv.org/abs/1805.01407
Also the seeding is slightly improved, by reading only 8 bytes from
the operating system and using the simple splitmix64 PRNG to fill in
the rest of the PRNG state (as recommended by the xoshiro author),
instead of reading the entire state from the OS.
Regtested on x86_64-pc-linux-gnu, Ok for trunk?
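For reference, the generator and the seeding scheme described above can be sketched as follows. This follows the public reference algorithms for xoshiro256** and splitmix64 rather than the libgfortran sources, so the demo_ names and the struct layout are hypothetical, and the 64-bit seed passed to demo_seed stands in for the 8 bytes read from the operating system.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate X left by K bits; K must be in the range 1..63.  */
static inline uint64_t rotl(uint64_t x, int k)
{
    assert(k > 0 && k < 64);
    return (x << k) | (x >> (64 - k));
}

/* splitmix64: advance *STATE and return the next 64-bit output.  */
static uint64_t splitmix64(uint64_t *state)
{
    uint64_t z = (*state += 0x9e3779b97f4a7c15ULL);
    z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9ULL;
    z = (z ^ (z >> 27)) * 0x94d049bb133111ebULL;
    return z ^ (z >> 31);
}

/* xoshiro256** keeps 256 bits of state, which must not be all zero.  */
typedef struct {
    uint64_t s[4];
} demo_prng_state;

/* Expand a single 64-bit seed into the full state with splitmix64.  */
static void demo_seed(demo_prng_state *st, uint64_t seed)
{
    for (int i = 0; i < 4; i++)
        st->s[i] = splitmix64(&seed);
}

/* Return the next 64-bit value from xoshiro256** and advance the state.  */
static uint64_t demo_next(demo_prng_state *st)
{
    uint64_t *s = st->s;
    const uint64_t result = rotl(s[1] * 5, 7) * 9;
    const uint64_t t = s[1] << 17;

    s[2] ^= s[0];
    s[3] ^= s[1];
    s[1] ^= s[2];
    s[0] ^= s[3];
    s[2] ^= t;
    s[3] = rotl(s[3], 45);

    return result;
}
```

Two states seeded with the same 64-bit value produce the same sequence, which is the property the smaller seed relies on.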
gcc/fortran/ChangeLog:
2019-08-13 Janne Blomqvist <jb@gcc.gnu.org>
PR fortran/91414
* check.c (gfc_check_random_seed): Reduce seed_size.
* intrinsic.texi (RANDOM_NUMBER): Update to match new PRNG.
gcc/testsuite/ChangeLog:
2019-08-13 Janne Blomqvist <jb@gcc.gnu.org>
PR fortran/91414
* gfortran.dg/random_seed_1.f90: Update to match new seed size.
libgfortran/ChangeLog:
2019-08-13 Janne Blomqvist <jb@gcc.gnu.org>
PR fortran/91414
* intrinsics/random.c (prng_state): Update state struct.
(master_state): Update to match new size.
(get_rand_state): Update to match new PRNG.
(rotl): New function.
(xorshift1024star): Replace with prng_next.
(prng_next): New function.
(jump): Update for new PRNG.
(lcg_parkmiller): Replace with splitmix64.
(splitmix64): New function.
(getosrandom): Fix return value, simplify.
(init_rand_state): Use getosrandom only to get 8 bytes, splitmix64
to fill rest of state.
(random_r4): Update to new function and struct names.
(random_r8): Likewise.
(random_r10): Likewise.
(random_r16): Likewise.
(arandom_r4): Likewise.
(arandom_r8): Likewise.
(arandom_r10): Likewise.
(arandom_r16): Likewise.
(xor_keys): Reduce size to match new PRNG.
(random_seed_i4): Update to new function and struct names, remove
special handling of variable p used in previous PRNG.
(random_seed_i8): Likewise.
From-SVN: r274361
The component has been unused for a while. No functional changes.
2019-08-13 Eric Botcazou <ebotcazou@adacore.com>
gcc/ada/
* ali.ads (Linker_Option_Record): Remove Original_Pos component.
* ali.adb (Scan_ALI): Do not set it.
From-SVN: r274360
This extends the processing done for the derivation of private
discriminated types to concurrent types, which is now required because
this derivation is no longer redone when a subtype of the derived
concurrent type is built.
This increases the number of entities generated internally in the
compiler but this case is sufficiently rare as not to be a real concern.
2019-08-13 Eric Botcazou <ebotcazou@adacore.com>
gcc/ada/
* sem_ch3.adb (Build_Derived_Concurrent_Type): Add a couple of
local variables and use them. When the derived type fully
constrains the parent type, rewrite it as a subtype of an
implicit (unconstrained) derived type instead of the other way
around.
(Copy_And_Build): Deal with concurrent types and use predicates.
(Build_Derived_Private_Type): Build the full derivation if
needed for concurrent types too.
(Build_Derived_Record_Type): Add marker comment.
(Complete_Private_Subtype): Use predicates.
gcc/testsuite/
* gnat.dg/discr56.adb, gnat.dg/discr56.ads,
gnat.dg/discr56_pkg1.adb, gnat.dg/discr56_pkg1.ads,
gnat.dg/discr56_pkg2.ads: New testcase.
From-SVN: r274359
This patch adds an RM reference for the rule that in a generic body a
type extension cannot have ancestors that are generic formal types. The
patch also extends the check to interface progenitors that may appear in
a derived type declaration or private extension declaration.
2019-08-13 Ed Schonberg <schonberg@adacore.com>
gcc/ada/
* sem_ch3.adb (Check_Generic_Ancestor): New subprogram,
subsidiary to Build_Derived_Record_Type, to enforce the rule
that a type extension declared in a generic body cannot have an
ancestor that is a generic formal (RM 3.9.1 (4/2)). The rule
applies to all ancestors of the type, including interface
progenitors.
gcc/testsuite/
* gnat.dg/tagged4.adb: New testcase.
From-SVN: r274358
This change was initially aimed at fixing a spurious instantiation error
due to a disambiguation issue which happens when a generic unit with two
formal type parameters is instantiated on a single actual type that is
private.
The compiler internally sets the Is_Generic_Actual_Type flag on the
actual subtypes built for the instantiation in order to ease the
disambiguation, but it would fail to set it on the full view if the
subtypes are private. The change makes it so that the flag is properly
set and reset on the full view in this case.
But this uncovered an issue in Subtypes_Statically_Match, which was
relying on a stale Is_Generic_Actual_Type flag set on a full view
outside of the instantiation to return a positive answer. This bypass
was meant to solve an issue arising with a private discriminated record
type whose completion is a discriminated record type itself derived from
a private discriminated record type, which is used as actual type in an
instantiation in another unit, and the instantiation is used in a child
unit of the original unit. In this case, the private and full views of
the generic actual type are swapped in the child unit, but there was a
mismatch between the chain of full and underlying full views of the
private discriminated record type and that of the generic actual type.
This secondary issue is solved by no longer skipping the full view in the
preparation of the completion of the private subtype and by directly
constraining the underlying full view of the full view of the base type
instead of building an underlying full view from scratch.
2019-08-13 Eric Botcazou <ebotcazou@adacore.com>
gcc/ada/
* sem_ch3.adb (Build_Underlying_Full_View): Delete.
(Complete_Private_Subtype): Do not set the full view on the
private subtype here. If the full base is itself derived from
private, do not re-derive the parent type but instead constrain
an existing underlying full view.
(Prepare_Private_Subtype_Completion): Do not get to the
underlying full view, if any. Set the full view on the private
subtype here.
(Process_Full_View): Likewise.
* sem_ch12.adb (Check_Generic_Actuals): Also set
Is_Generic_Actual_Type on the full view if the type of the
actual is private.
(Restore_Private_Views): Also reset Is_Generic_Actual_Type on
the full view if the type of the actual is private.
* sem_eval.adb (Subtypes_Statically_Match): Remove bypass for
generic actual types.
gcc/testsuite/
* gnat.dg/generic_inst10.adb, gnat.dg/generic_inst10_pkg.ads:
New testcase.
From-SVN: r274357
When a record type with an access to class-wide type discriminant
has aspect Implicit_Dereference, and the discriminant is used as the
controlling argument of a dispatching call, the compiler may generate
wrong code to dispatch the call.
2019-08-13 Javier Miranda <miranda@adacore.com>
gcc/ada/
* sem_res.adb (Resolve_Selected_Component): When the type of the
component is an access to a class-wide type and the type of the
context is an access to a tagged type, the relevant type is that
of the component (since in such a case we may need to generate
implicit type conversions or dispatching calls).
gcc/testsuite/
* gnat.dg/tagged3.adb, gnat.dg/tagged3_pkg.adb,
gnat.dg/tagged3_pkg.ads: New testcase.
From-SVN: r274356
An aggregate can be handled by the backend if it consists of static
constants of an elementary type, or null. If a component is a type
conversion we must preanalyze and resolve it to determine whether the
ultimate value is in one of these categories. Previously we did a full
analysis and resolution of the expression for the component, which could
lead to a removal of side-effects, which is semantically incorrect if
the expression includes functions with side-effects (e.g. a call to a
random generator).
2019-08-13 Ed Schonberg <schonberg@adacore.com>
gcc/ada/
* exp_aggr.adb (Aggr_Assignment_OK_For_Backend): Preanalyze
expression, rather than doing a full analysis, to prevent unwanted
removal of side effects, which masks the intent of the expression.
gcc/testsuite/
* gnat.dg/aggr27.adb: New testcase.
From-SVN: r274355
This is a small cleanup in the inlining machinery of the front-end
dealing with back-end inlining. It should save a few cycles at -O0 by
stopping it from doing useless work. No functional changes.
2019-08-13 Eric Botcazou <ebotcazou@adacore.com>
gcc/ada/
* exp_ch6.adb: Remove with and use clauses for Sem_Ch12.
(Expand_Call_Helper): Swap the back-end inlining case and the
special front-end expansion case. In back-end inlining mode, do
not invoke Add_Inlined_Body unless the call may be inlined.
* inline.ads (Add_Pending_Instantiation): New function moved
from...
* inline.adb (Add_Inlined_Body): Simplify comment. Turn test on
the enclosing unit into assertion.
(Add_Pending_Instantiation): New function moved from...
* sem_ch12.ads (Add_Pending_Instantiation): ...here.
* sem_ch12.adb (Add_Pending_Instantiation): ...here.
From-SVN: r274353
This fixes a bogus style check failure for long lines in rare cases
where the compiler is invoked with a -gnatyX switch, where X is neither
'm' nor 'M', on a unit which contains "with" clauses for other units
that contain a pragma Style_Checks (Off).
2019-08-13 Eric Botcazou <ebotcazou@adacore.com>
gcc/ada/
* sem.adb (Do_Analyze): Recompute Style_Check_Max_Line_Length
after restoring Style_Max_Line_Length.
From-SVN: r274352