ARC processors can work with a reduced register set (i.e. registers
r4-r9 and r16-r25 are not available). This option can be enabled
passing -mrf16 option to the compiler, or by using -mcpu=em_mini CPU
configuration. Using RF16 config requires all the hand-made assembly
files used in libgcc to have the corresponding RF16 object attribute
set.
This patch qualifies the relevant hand-made assembly files to
RF16 config, and also adds generic c-functions for the one which are
not.
libgcc/
xxxx-xx-xx Claudiu Zissulescu <claziss@synopsys.com>
* config/arc/crti.S: Add RF16 object attribute.
* config/arc/crtn.S: Likewise.
* config/arc/crttls.S: Likewise.
* config/arc/lib1funcs.S: Likewise.
* config/arc/fp-hack.h (ARC_OPTFPE): Define.
* config/arc/lib2funcs.c: New file.
* config/arc/t-arc: Add lib2funcs to LIB2ADD.
The deduction guide from an iterator and sentinel used the wrong alias
template and so didn't work.
PR libstdc++/93426
* include/std/span (span): Fix deduction guide.
* testsuite/23_containers/span/deduction.cc: New test.
lra_assign has an assert to make sure that no pseudo is allocated
to a conflicting hard register. It used to be restricted to
!flag_ipa_ra, but in g:a1e6ee38e708ef2bdef4 I'd enabled it for
flag_ipa_ra too. It then tripped while building libstdc++
for mips-mti-linux.
The failure was due to code at the end of process_bb_lives. For an
abnormal/EH edge, we need to make sure that all pseudos that are live
on entry to the destination conflict with all hard registers that are
clobbered by an abnormal call return. The usual way to do this would
be to simulate a clobber of the hard registers, by making them live and
them making them dead again. Making the registers live creates the
conflict; making them dead again restores the correct live set for
whatever follows.
However, process_bb_lives skips the second step (making the registers
dead again) at the start of a BB, presumably on the basis that there's
no further code that needs a correct live set. The problem for the PR
is that that wasn't quite true in practice. There was code further
down process_bb_lives that updated the live-in set of the BB for some
registers, and this live set was "contaminated" by registers that
weren't live but that created conflicts. This information then got
propagated to other blocks, so that registers that were made live
purely to create a conflict at the start of the EH receiver then became
needlessly live throughout preceding blocks. This in turn created a
fake conflict with pseudos in those blocks, invalidating the choices
made by IRA.
The easiest fix seems to be to update the live-in set *before* adding
the fake live registers. An alternative would be to simulate the full
clobber, but that seems a bit wasteful.
2020-01-27 Richard Sandiford <richard.sandiford@arm.com>
gcc/
PR rtl-optimization/92989
* lra-lives.c (process_bb_lives): Update the live-in set before
processing additional clobbers.
g:3bd2918594dae34ae84f mishandled the case in which only the
tail end of a multireg hard register is invalidated by the call.
Walking all the entries should be both safer and more precise.
Avoiding cselib_invalidate_regno also means that we no longer
walk the same list multiple times (which is something we did
before g:3bd2918594dae34ae84f too).
2020-01-27 Richard Sandiford <richard.sandiford@arm.com>
gcc/
PR rtl-optimization/93170
* cselib.c (cselib_invalidate_regno_val): New function, split out
from...
(cselib_invalidate_regno): ...here.
(cselib_invalidated_by_call_p): New function.
(cselib_process_insn): Iterate over all the hard-register entries in
REG_VALUES and invalidate any that cross call-clobbered registers.
gcc/testsuite/
* gcc.dg/torture/pr93170.c: New test.
PR91323 was fixed for x86 and sparc in target code, but aarch64
instead relies on the target-independent comparison splitters.
Since LTGT is an unordered-signalling operation, we should split
it into unordered-signalling operations for any input that could
be NaN, not just inputs that could be signalling NaNs.
2020-01-27 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* dojump.c (split_comparison): Use HONOR_NANS rather than
HONOR_SNANS when splitting LTGT.
PR target/93274
* config/i386/i386-features.c (make_resolver_func):
Align the code with ppc64 target implementation.
Do not generate a unique name for resolver function.
PR target/93274
* gcc.target/i386/pr81213.c: Adjust to not expect
a globally unique name.
The following delays adjusting the SLP graph for converted reduction
chains to a point where the SLP build no longer can fail since we
otherwise fail to undo marking the conversion as a group.
2020-01-27 Richard Biener <rguenther@suse.de>
PR tree-optimization/93397
* tree-vect-slp.c (vect_analyze_slp_instance): Delay
converted reduction chain SLP graph adjustment.
* gcc.dg/torture/pr93397.c: New testcase.
The immediate issue here was that the second warning didn't depend on the
first one, so if the first location was in a system header, we'd
mysteriously give the second by itself.
It's also the case that the thing we care about being in a system header is
the function that we want to suggest adding 'noexcept' to, not the
noexcept-expression; it's useful to suggest adding noexcept to a user
function to satisfy a noexcept-expression in a system header.
PR c++/90992
* except.c (maybe_noexcept_warning): Check DECL_IN_SYSTEM_HEADER and
temporarily enable -Wsystem-headers. Change second warning to
conditional inform.
Here we crash when using -fsanitize=address -fdump-tree-sanopt because
the dumping code uses IDENTIFIER_POINTER on a null DECL_NAME. Instead,
we can print "<anonymous>" in such a case. Or we could avoid printing
that diagnostic altogether.
2020-01-26 Marek Polacek <polacek@redhat.com>
PR tree-optimization/93436
* sanopt.c (sanitize_rewrite_addressable_params): Avoid crash on
null DECL_NAME.
This amends the cases where inline comments in function calls were
followed by a space. It also fixes some uses of C++ style and wrongly
wrapped comment end markers.
gcc/cp/ChangeLog:
2020-01-26 Iain Sandoe <iain@sandoe.co.uk>
* coroutines.cc: Amend whitespace after inline comments
throughout. Ensure use of C-style comment markers.
The new gcc.target/i386/pr91298-?.c testcases FAIL on Solaris/x86 with the
native assembler:
FAIL: gcc.target/i386/pr91298-1.c (test for excess errors)
Excess errors:
Assembler: pr91298-1.c
"/var/tmp//ccE6r3xb.s", line 5 : Syntax error
Near line: " .globl $quux"
"/var/tmp//ccE6r3xb.s", line 6 : Syntax error
Near line: " .type $quux, @function"
"/var/tmp//ccE6r3xb.s", line 7 : Syntax error
Near line: "$quux:"
"/var/tmp//ccE6r3xb.s", line 15 : Syntax error
Near line: " .size $quux, .-$quux"
"/var/tmp//ccE6r3xb.s", line 24 : Syntax error
Near line: " movl $($a), %eax"
"/var/tmp//ccE6r3xb.s", line 38 : Syntax error
Near line: " leal ($a)(,%eax,4), %eax"
"/var/tmp//ccE6r3xb.s", line 51 : Syntax error
Near line: " movl ($a), %eax"
"/var/tmp//ccE6r3xb.s", line 63 : Syntax error
Near line: " movl ($a)+16, %eax"
"/var/tmp//ccE6r3xb.s", line 97 : Syntax error
Near line: " movl $($quux), %eax"
"/var/tmp//ccE6r3xb.s", line 101 : Syntax error
Near line: " .globl $a"
"/var/tmp//ccE6r3xb.s", line 104 : Syntax error
Near line: " .type $a, @object"
"/var/tmp//ccE6r3xb.s", line 105 : Syntax error
Near line: " .size $a, 72"
"/var/tmp//ccE6r3xb.s", line 106 : Syntax error
Near line: "$a:"
"/var/tmp//ccE6r3xb.s", line 228 : Syntax error
Near line: " .long ($a)"
FAIL: gcc.target/i386/pr91298-2.c (test for excess errors)
It only allows letters, digits, '_' and '.' in identifiers:
https://docs.oracle.com/cd/E37838_01/html/E61064/eqbsx.html#XALRMeoqjw
For lack of an effective-target keyword matching -fdollars-in-identifiers,
this patch fixes this by xfailing them on *-*-solaris2.* && !gas.
Tested on i386-pc-solaris2.11 with as and gas and x86_64-pc-linux-gnu.
* gcc.target/i386/pr91298-1.c: xfail on Solaris/x86 with native
assembler.
* gcc.target/i386/pr91298-2.c: Likewise.
Here, we end up calling gen_type_die_with_usage for a type that's in the
middle of finish_struct_1, after we set TYPE_NEEDS_CONSTRUCTING on it but
before we copy all the flags to the variants--and, significantly, before we
set its TYPE_SIZE. It seems reasonable to only look at
TYPE_NEEDS_CONSTRUCTING on complete types, since we aren't going to try to
create an object of an incomplete type any other way.
PR c++/92601
* tree.c (verify_type_variant): Only verify TYPE_NEEDS_CONSTRUCTING
of complete types.
In the *{add,sub}v<dwi>4_doubleword patterns, we don't really want to see a
VOIDmode last operand, because it then means invalid RTL
(sign_extend:{TI,POI} (const_int ...)) or so, and therefore something we
don't really handle in the splitter either. We have
*{add,sub}v<dwi>4_doubleword_1 pattern for those and that is what combine
will match, the problem in this testcase is just that it was only RA that
propagated the constant into the instruction.
In the similar *{add,sub}v<mode>4 patterns, we make sure not to accept
VOIDmode operand and similarly to these have _1 suffixed variant that allows
constants.
2020-01-26 Jakub Jelinek <jakub@redhat.com>
PR target/93412
* config/i386/i386.md (*addv<dwi>4_doubleword, *subv<dwi>4_doubleword):
Use nonimmediate_operand instead of x86_64_hilo_general_operand and
drop <di> from constraint of last operand.
* gcc.dg/pr93412.c: New test.
Apparently my recent patch which moved the *avx_vperm_broadcast* and
*vpermil* patterns before vpermpd broke the following testcase, the
define_insn_and_split matched always but the splitter condition only split
it if not -mavx2 for V4DFmode, basically relying on the vpermpd pattern to
come first.
The following patch fixes it by moving that part of SPLIT-CONDITION into
CONDITION, so that when it is not met, we just don't match the pattern
and thus match the later vpermpd pattern in that case.
Except, for { 0, 0, 0, 0 } permutation, there is actually no reason to do
that, vbroadcastsd from memory seems to be slightly cheaper than vpermpd $0.
2020-01-26 Jakub Jelinek <jakub@redhat.com>
PR target/93430
* config/i386/sse.md (*avx_vperm_broadcast_<mode>): Disallow for
TARGET_AVX2 and V4DFmode not in the split condition, but in the
pattern condition, though allow { 0, 0, 0, 0 } broadcast always.
* gcc.dg/pr93430.c: New test.
* gcc.target/i386/avx2-pr93430.c: New test.
warn_for_memset calls fold_for_warn, which calls fold_non_dependent_expr, so
also calling instantiate_non_dependent_expr here is undesirable.
PR c++/90997
* semantics.c (finish_call_expr): Don't call
instantiate_non_dependent_expr before warn_for_memset.
2020-01-26 Jakub Jelinek <jakub@redhat.com>
PR ipa/93166
* g++.dg/pr93166.C: Move to ...
* g++.dg/pr93166_0.C: ... here. Turn it into a proper lto test.
2020-01-26 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/92788
* g++.dg/pr92788.C: Move to ...
* g++.target/i386/pr92788.C: ... here. Remove target from dg-do line.
Change type of operator new's first parameter to __SIZE_TYPE__.
I neglected to add a proper diagnostic for the reference dynamic_cast
case when the operand of a dynamic_cast doesn't refer to a public base
of Derived, resulting in suboptimal error message
error: call to non-'constexpr' function 'void* __cxa_bad_cast()'
2020-01-25 Marek Polacek <polacek@redhat.com>
PR c++/93414 - poor diagnostic for dynamic_cast in constexpr context.
* constexpr.c (cxx_eval_dynamic_cast_fn): Add a reference
dynamic_cast diagnostic.
* g++.dg/cpp2a/constexpr-dynamic18.C: New test.
vec_zeroextend.c fails on big-endian as it assumes
0 index is the lower part but it is not for
big-endian case. This fixes the problem by
using the correct index for the lower part
for big-endian.
Committed as obvious after a test on aarch64_be-linux-gnu.
ChangeLog:
* gcc.target/aarch64/vec_zeroextend.c: Fix for big-endian.
Here, the problem was that tsubst_friend_function was modifying the
CONSTRAINT_INFO for the friend template to have the constraints for one
instantiation, which fell down when we went to adjust it for another
instantiation. Fixed by deferring substitution of trailing requirements
until we try to check declaration matching.
PR c++/93400 - ICE with constrained friend.
* constraint.cc (maybe_substitute_reqs_for): New.
* decl.c (function_requirements_equivalent_p): Call it.
* pt.c (tsubst_friend_function): Only substitute
TEMPLATE_PARMS_CONSTRAINTS.
(tsubst_template_parms): Copy constraints.
Here the problem was that we were remembering the lookup in template scope,
and then trying to reuse that lookup in the instantiation without
substituting into it at all. The simplest solution is to not try to
remember a lookup that finds a class-scope declaration, as in that case
doing the normal lookup again at instantiation time will always find the
right declarations.
PR c++/93279 - ICE with lambda in member operator.
* name-lookup.c (maybe_save_operator_binding): Don't remember
class-scope bindings.
any_template_parm_r was looking at the args of an alias template-id, but we
need to look at all args of a member alias/typedef, including implicit ones
from the enclosing class.
PR c++/93377 - ICE with member alias in constraint.
* pt.c (any_template_parm_r): Look at template arguments for all
aliases, not only alias templates.
In Agner Fog's tables, vpermilp[sd] with immediates seem to be
much faster than vpermpd with immediate, for a good reason,
the former only permute something within the lanes and don't do anything
intra-lane, while vpermpd can. So, functionality-wise, vpermilpd
is more efficient subset of vpermpd. We use the same RTL for those
though (and also for certain broadcast).
Now, the problem was that the vpermpd pattern appeared first in sse.md,
followed by the broadcast patterns, followed by the vpermilp[sd].
Which means unless -mavx -mno-avx2, we'd emit vpermpd instead of the
more efficient alternatives.
The following patch reorders them, so that vpermpd comes last, if we
can match a broadcast, we do, if we can match a vpermilp[sd] that is not a
broadcast, we will, otherwise fall back (of course only if -mavx2) to
vpermpd.
2020-01-24 Jakub Jelinek <jakub@redhat.com>
PR target/93395
* config/i386/sse.md (*avx_vperm_broadcast_v4sf,
*avx_vperm_broadcast_<mode>,
<sse2_avx_avx512f>_vpermil<mode><mask_name>,
*<sse2_avx_avx512f>_vpermilp<mode><mask_name>):
Move before avx2_perm<mode>/avx512f_perm<mode>.
* gcc.target/i386/pr93395.c: New test.
* gcc.target/i386/avx512vl-vpermilpdi-1.c: Remove xfail.
The following patch makes sure we punt in the 3 spots if precision is above
MAX_BITSIZE_MODE_ANY_INT.
2020-01-24 Jakub Jelinek <jakub@redhat.com>
PR target/93376
* simplify-rtx.c (simplify_const_unary_operation,
simplify_const_binary_operation): Punt for mode precision above
MAX_BITSIZE_MODE_ANY_INT.
Like I mentioned in https://gcc.gnu.org/ml/gcc/2020-01/msg00157.html,
The shift by a register should be just COSTS_N_INSNS (1) rather than
COSTS_N_INSNS (2). This allows lshift_cheap_p to return true now
and converting switches to be using shift and other like
structures. I noticed this difference when I was working
through PR 93131 and understanding what reassoc could handle.
ChangeLog:
* config/arm/aarch-cost-tables.h (cortexa57_extra_costs): Change
alu.shift_reg to 0.
Since e4511ca2e9ecdb51d41b64452398f8e2df575668 force_paren_expr can create
a VIEW_CONVERT_EXPR so that we have something to set REF_PARENTHESIZED_P
on, while not making the expression dependent. But tsubst_copy can't cope
with such a VIEW_CONVERT_EXPR, because it's not location_wrapper_p, or
a TEMPLATE_PARM_INDEX wrapped in a VIEW_CONVERT_EXPR.
I think we need to teach tsubst_copy how to handle it. Setting
EXPR_LOCATION_WRAPPER_P in force_paren_expr would make the ICE go away
too, but tsubst_copy would lose the REF_PARENTHESIZED_P flag.
2020-01-24 Marek Polacek <polacek@redhat.com>
PR c++/93299 - ICE in tsubst_copy with parenthesized expression.
* pt.c (tsubst_copy): Handle a REF_PARENTHESIZED_P VIEW_CONVERT_EXPR.
* g++.dg/cpp1y/paren5.C: New test.
Another place we need to unshare cached expressions.
PR c++/92852 - ICE with generic lambda and reference var.
* constexpr.c (maybe_constant_value): Likewise.
The _Eq and _Ord enumerations can be combined into one, reducing the
number of constructors needed for the comparison category types. The
redundant equal enumerator can be removed and equivalent used in its
place. The _Less and _Greater enumerators can be renamed because 'less'
and 'greater' are already reserved names anyway.
* libsupc++/compare (__cmp_cat::_Eq): Remove enumeration type.
(__cmp_cat::_Ord::equivalent): Add enumerator.
(__cmp_cat::_Ord::_Less, __cmp_cat::_Ord::_Greater): Rename to less
and greater.
(partial_ordering, weak_ordering, strong_ordering): Remove
constructors taking __cmp_cat::_Eq parameters. Use renamed
enumerators.
PR target/13721
* config/h8300/h8300.c (h8300_print_operand): Only call byte_reg
for REGs. Call output_operand_lossage to get more reasonable
diagnostics.
PR target/13721
* gcc.target/h8300/pr13721.c: New test.
2020-01-24 Andrew Stubbs <ams@codesourcery.com>
gcc/
* config/gcn/gcn-valu.md (vec_cmp<mode>di): Use
gcn_fp_compare_operator.
(vec_cmpu<mode>di): Use gcn_compare_operator.
(vec_cmp<u>v64qidi): Use gcn_compare_operator.
(vec_cmp<mode>di_exec): Use gcn_fp_compare_operator.
(vec_cmpu<mode>di_exec): Use gcn_compare_operator.
(vec_cmp<u>v64qidi_exec): Use gcn_compare_operator.
(vec_cmp<mode>di_dup): Use gcn_fp_compare_operator.
(vec_cmp<mode>di_dup_exec): Use gcn_fp_compare_operator.
(vcond<VEC_ALLREG_MODE:mode><VEC_ALLREG_ALT:mode>): Use
gcn_fp_compare_operator.
(vcond<VEC_ALLREG_MODE:mode><VEC_ALLREG_ALT:mode>_exec): Use
gcn_fp_compare_operator.
(vcondu<VEC_ALLREG_MODE:mode><VEC_ALLREG_INT_MODE:mode>): Use
gcn_fp_compare_operator.
(vcondu<VEC_ALLREG_MODE:mode><VEC_ALLREG_INT_MODE:mode>_exec): Use
gcn_fp_compare_operator.
Whilst trying to convert the add vendor branch script to work with
personal branches I encountered a minor issue where git would report
ambiguous refs when checking out the new branch.
It turns out that this is because git considers <me>/<topic> to be
ambiguous if both
refs/heads/<me>/<topic>
and
refs/remotes/<me>/<topic>
exist in the list of known branches.
Having thought about this a bit, I think the best solution is to adopt
something more like the vendors space and call the remote users/<me>
(this also works better if you want to set up remotes to track other
users branches as well).
So this patch has two parts.
1) It updates gcc-git-customization.sh to set up the new 'remote' and
converts any existing remote and branches tracking that to the new
format
2) It adds a new script to set up a personal branch on the gcc git repository.
* gcc-git-customization.sh: Use users/<pfx> for the personal remote
rather than just <pfx>. Convert any existing personal branches to the
new remote.
* git-add-user-branch.sh: New file.
I noticed, but ignored this code when addressing p80005, but having
fixed up defined(X) on the modules branch, I could see where it came
from, and it's obviously wrong as we've just pulled out a string
contant from the token.
* expr.c (parse_has_include): Remove bogus controlling macro code.