After committing it I noticed I'd also accidentally added a change to
__synth3way as well, which I meant to do in a separate commit. I've
updated the changelog entry to reflect that additional change.
* libsupc++/compare (__detail::__synth3way): Add noexcept-specifier.
With P1614R2 fully implemented (except for the <chrono> types which we
don't support at all) we can define the feature test macro to the new
value.
* include/std/version (__cpp_lib_three_way_comparison): Update value.
* libsupc++/compare (__cpp_lib_three_way_comparison): Likewise.
This patch fixes a large lmbench performance regression with
128-bit SVE, compiled in length-agnostic mode.
vect_better_loop_vinfo_p (new in GCC 10) tries to estimate whether
a new loop_vinfo is cheaper than a previous one, with an in-built
preference for the old one. For variable VF it prefers the old
loop_vinfo if it is cheaper for at least one VF. However, we have
no idea how likely that VF is in practice.
Another extreme would be to do what most of the rest of the
vectoriser does, and rely solely on the constant estimated VF.
But as noted in the comment, this means that a one-unit cost
difference would be enough to pick the new loop_vinfo,
despite the target generally preferring the old loop_vinfo
where possible. The cost model just isn't accurate enough
for that to produce good results as things stand: there might
not be any practical benefit to the new loop_vinfo at the
estimated VF, and it would be significantly worse for higher VFs.
The patch instead goes for a hacky compromise: make sure that the new
loop_vinfo is also no worse than the old loop_vinfo at double the
estimated VF. For all but trivial loops, this ensures that the
new loop_vinfo is only chosen if it is better than the old one
by a non-trivial amount at the estimated VF. It also avoids
putting too much faith in the VF estimate.
I realise this isn't great, but it's supposed to be a conservative fix
suitable for stage 4. The only affected testcases are the ones for
pr89007-*.c, where Advanced SIMD is indeed preferred for 128-bit SVE
and is no worse for 256-bit SVE.
Part of the problem here is that if the new loop_vinfo is better,
we discard the old one and never consider using it even as an
epilogue loop. This means that if we choose Advanced SIMD over SVE,
we're much more likely to have left-over scalar elements.
Another is that the estimate provided by estimated_poly_value might have
different probabilities attached. E.g. when tuning for a particular core,
the estimate is probably accurate, but when tuning for generic code,
the estimate is more of a guess. Relying solely on the estimate is
probably correct for the former but not for the latter.
Hopefully those are things that we could tackle in GCC 11.
2020-04-20 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vect-loop.c (vect_better_loop_vinfo_p): If old_loop_vinfo
has a variable VF, prefer new_loop_vinfo if it is cheaper for the
estimated VF and is no worse at double the estimated VF.
gcc/testsuite/
* gcc.target/aarch64/sve/cost_model_8.c: New test.
* gcc.target/aarch64/sve/cost_model_9.c: Likewise.
* gcc.target/aarch64/sve/pr89007-1.c: Add -msve-vector-bits=512.
* gcc.target/aarch64/sve/pr89007-2.c: Likewise.
This testcase triggered an ICE in rtx_vector_builder::step because
we were trying to use a stepped representation for floating-point
constants. The underlying problem was that the arguments to
rtx_vector_builder were the wrong way around, meaning that some
variations were likely to be incorrectly encoded for integers
(but probably as a silent failure).
Also, aarch64_sve_expand_vector_init_handle_trailing_constants
tries to extend the trailing constant elements to a full vector
by following the "natural" pattern of the original vector, which
should generally lead to nicer constants. However, for the testcase,
we'd then end up picking a variable for some elements. Fixed by
stubbing out all variable elements with zeros.
That fix involved testing valid_for_const_vector_p. For consistency,
the patch uses the same test when finding trailing constants, instead
of the previous aarch64_legitimate_constant_p.
2020-04-20 Richard Sandiford <richard.sandiford@arm.com>
gcc/
PR target/94668
* config/aarch64/aarch64.c (aarch64_sve_expand_vector_init): Fix
order of arguments to rtx_vector_builder.
(aarch64_sve_expand_vector_init_handle_trailing_constants): Likewise.
When extending the trailing constants to a full vector, replace any
variables with zeros.
gcc/testsuite/
PR target/94668
* gcc.target/aarch64/sve/pr94668.c: New test.
If extra_tool_flags starts with a dash, an error like 'ERROR: verbose:
illegal argument: -march=native -O2 -std=c++17' is printed. This is
easily fixed by inserting a double dash before the variable.
2020-04-20 Matthias Kretz <kretz@kde.org>
* testsuite/lib/libstdc++.exp: Avoid illegal argument to verbose.
We treat tpl-tpl-parms as types. They're not; bound-tpl-tpl-parms
are. We can get away with them being type-like. Unfortunately we
give the original level==orig_level case a canonical type, but the
reduced cases of level<orig_level get structural equality. This patch
gives them structural type always.
* pt.c (canonical_type_parameter): Assert not a tpl-tpl-parm.
(process_template_parm): tpl-tpl-parms are structural.
(rewrite_template_parm): Propagate structuralness.
We were not comparing expression pack expansions correctly. We could
consider distinct expansions equal and creating two, apparently equal,
specializations that would sometimes collide. cp_tree_operand_length
says a pack has 1 operand (for mangling), whereas it actually has 3,
but only two of which are significant for equality. We must special
case that in cp_tree_equal. That new code matches the hasher and the
type_pack_expansion case in structural_comp_types.
* tree.c (cp_tree_equal): [TEMPLATE_ID_EXPR, default] Refactor.
[EXPR_PACK_EXPANSION]: Add.
One of the problems hit by pr94454 was that the argument hasher was
not skipping nodes that template_args_equal would. Fixed by replacing
the STRIP_NOPS invocation by a bespoke loop. We also confuse the
canonical type machinery by treating tpl-tpl-parms as types. They're
not; bound-tpl-tpl-parms are. We can get away with them being
type-like. Unfortunately we give the original level==orig_level case
a canonical type, but the reduced cases of level<orig_level get
structural equality. That breaks the hasher because we'll use
TYPE_HASH (CANONICAL_TYPE ()) when we can. There's a note in
tsubst[TEMPLATE_TEMPLATE_PARM] about why the reduced ones cannot have
a canonical type. (I didn't feel like questioning that assertion at
this point.)
* pt.c (iterative_hash_template_arg): Strip nodes as
template_args_equal does.
[ARGUMENT_PACK_SELECT, TREE_VEC, CONSTRUCTOR]: Refactor.
[node_class:TEMPLATE_TEMPLATE_PARM]: Hash by level & index.
[node_class:default]: Refactor.
Add missing check in gfc_set_array_spec for sum of rank and corank to not
exceed GFC_MAX_DIMENSIONS.
2020-04-20 Harald Anlauf <anlauf@gmx.de>
PR fortran/93364
* array.c (gfc_set_array_spec): Check for sum of rank and corank
not exceeding GFC_MAX_DIMENSIONS.
2020-04-20 Harald Anlauf <anlauf@gmx.de>
PR fortran/93364
* gfortran.dg/pr93364.f90: New test.
2020-04-20 Steve Kargl <kargl@gcc.gnu.org>
Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/91800
* decl.c (variable_decl): Reject Hollerith constants as type
initializer.
2020-04-20 Steve Kargl <kargl@gcc.gnu.org>
Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/91800
* gfortran.dg/hollerith_9.f90: New test.
Testing on the host does not make sense for 'declare copyout' for
a same-scope stack-allocated variable. Once the copyout is done,
the variable is gone. Hence, test the variable on the device. This
can be revisit after the OpenACC semantic has been fixed; but with
that fix, the test PASSes again with devices.
PR middle-end/94120
* testsuite/libgomp.oacc-c++/declare-pr94120.C: Fix 'declare copy(out)'
test case.
Some more C++20 changes from P1614R2, "The Mothership has Landed".
* include/bits/stl_queue.h (queue): Define operator<=> for C++20.
* include/bits/stl_stack.h (stack): Likewise.
* testsuite/23_containers/queue/cmp_c++20.cc: New test.
* testsuite/23_containers/stack/cmp_c++20.cc: New test.
This appears to be a copy&paste error, which cppcheck diagnoses.
PR other/94629
* include/debug/formatter.h (_Error_formatter::_Parameter): Fix
redundant assignment in constructor.
std.array.Appender and RefAppender: use .opSlice() instead of data()
Previously, Appender.data() was used to extract a slice of the Appender's array.
Now use the [] slice operator instead. The same goes for RefAppender.
Fixes: PR d/94455
Reviewed-on: https://github.com/dlang/phobos/pull/7450
According to "Intel 64 and IA32 Arch SDM, Vol. 3:
"Because SIMD floating-point exceptions are precise and occur immediately,
the situation does not arise where an x87 FPU instruction, a WAIT/FWAIT
instruction, or another SSE/SSE2/SSE3 instruction will catch a pending
unmasked SIMD floating-point exception."
Remove unneeded assignments to volatile memory.
libgcc/ChangeLog:
* config/i386/sfp-exceptions.c (__sfp_handle_exceptions) [__SSE_MATH__]:
Remove unneeded assignments to volatile memory.
libatomic/ChangeLog:
* config/x86/fenv.c (__atomic_feraiseexcept) [__SSE_MATH__]:
Remove unneeded assignments to volatile memory.
libgfortran/ChangeLog:
* config/fpu-387.h (local_feraiseexcept) [__SSE_MATH__]:
Remove unneeded assignments to volatile memory.
While the coroutines implementation, and most of the coroutines
tests, will operate with C++14 or newer, these tests require
facilities introduced in C++17. Add the target requirement.
gcc/testsuite/
2020-04-19 Iain Sandoe <iain@sandoe.co.uk>
* g++.dg/coroutines/torture/co-await-17-capture-comp-ref.C: Require
C++17.
* g++.dg/coroutines/torture/co-ret-15-default-return_void.C: Likewise.
Returning &gfc_bad_expr when simplifying bounds after a divisin by zero
happened results in the division by zero error actually reaching the user.
2020-04-19 Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/93500
* resolve.c (resolve_operator): If both operands are
NULL, return false.
* simplify.c (simplify_bound): If a division by zero
was seen during bound simplification, free the
corresponcing expression and return &gfc_bad_expr.
2020-04-19 Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/93500
* arith_divide_3.f90: New test.
Similarly to inline asm, :: (or any other number of consecutive colons) can
appear in ObjC @selector argument and with the introduction of CPP_SCOPE
into the C FE, we need to trat CPP_SCOPE as two CPP_COLON tokens.
The C++ FE does that already that way.
2020-04-19 Jakub Jelinek <jakub@redhat.com>
PR objc/94637
* c-parser.c (c_parser_objc_selector_arg): Handle CPP_SCOPE like
two CPP_COLON tokens.
* objc.dg/pr94637.m: New test.
Patch fixes test failure seen on X32 where a nested struct was passed in
registers, rather than via invisible reference. Now, all non-POD
structs are passed by invisible reference, not just those with a
user-defined copy constructor/destructor.
gcc/d/ChangeLog:
PR d/94609
* d-codegen.cc (argument_reference_p): Don't check TREE_ADDRESSABLE.
(type_passed_as): Build reference type if TREE_ADDRESSABLE.
* d-convert.cc (convert_for_argument): Build explicit TARGET_EXPR if
needed for arguments passed by invisible reference.
* types.cc (TypeVisitor::visit (TypeStruct *)): Mark all structs that
are not POD as TREE_ADDRESSABLE.
The intended purpose of the option is both for targets that don't
support phobos yet, and for gdc itself to support bootstrapping itself
as a self-hosted D compiler.
The libphobos testsuite has been updated to only add libphobos to the
search paths if it's being built. A new D2 testsuite directive
RUNNABLE_PHOBOS_TEST has also been patched in to disable some runnable
tests that have phobos dependencies, of which is a temporary measure
until upstream DMD fixes or removes these tests entirely.
gcc/testsuite/ChangeLog:
* lib/gdc-utils.exp (gdc-convert-test): Add dg-skip-if for tests that
depending on the phobos standard library.
libphobos/ChangeLog:
* configure: Regenerate.
* configure.ac: Add --with-libphobos-druntime-only option and the
conditional ENABLE_LIBDRUNTIME_ONLY.
* configure.tgt: Define LIBDRUNTIME_ONLY.
* src/Makefile.am: Add phobos sources if not ENABLE_LIBDRUNTIME_ONLY.
* src/Makefile.in: Regenerate.
* testsuite/testsuite_flags.in: Add phobos path if compiling phobos.
The current check_effective_target_d_runtime procedure returns false if
the target is built without any core runtime library for D being
available (--disable-libphobos). This additional procedure is for
targets where the core runtime library exists, but without the higher
level standard library.
gcc/ChangeLog:
* doc/sourcebuild.texi (Effective-Target Keywords, Environment
attributes): Document d_runtime_has_std_library.
gcc/testsuite/ChangeLog:
* gdc.dg/link.d: Use d_runtime_has_std_library effective target.
* gdc.dg/runnable.d: Move phobos tests to...
* gdc.dg/runnable2.d: ...here. New test.
* lib/target-supports.exp
(check_effective_target_d_runtime_has_std_library): New.
libphobos/ChangeLog:
* testsuite/libphobos.phobos/phobos.exp: Skip if effective target is
not d_runtime_has_std_library.
* testsuite/libphobos.phobos_shared/phobos_shared.exp: Likewise.
In the testcase below, during specialization of c<int>::d, we build two
identical specializations of the parameter type b<decltype(e)::k> -- one when
substituting into c<int>::d's TYPE_ARG_TYPES and another when substituting into
c<int>::d's DECL_ARGUMENTS.
We don't reuse the first specialization the second time around as a consequence
of the fix for PR c++/56247 which made PARM_DECLs always compare different from
one another during spec_hasher::equal. As a result, when looking up existing
specializations of 'b', spec_hasher::equal considers the template argument
decltype(e')::k to be different from decltype(e'')::k, where e' and e'' are the
result of two calls to tsubst_copy on the PARM_DECL e.
Since the two specializations are considered different due to the mentioned fix,
their TYPE_CANONICAL points to themselves even though they are otherwise
identical types, and this triggers an ICE in maybe_rebuild_function_decl_type
when comparing the TYPE_ARG_TYPES of c<int>::d to its DECL_ARGUMENTS.
This patch fixes this issue at the spec_hasher::equal level by ignoring the
'comparing_specializations' flag in cp_tree_equal whenever the DECL_CONTEXTs of
the two parameters are identical. This seems to be a sufficient condition to be
able to correctly compare PARM_DECLs structurally. (This also subsumes the
CONSTRAINT_VAR_P check since constraint variables all have empty, and therefore
identical, DECL_CONTEXTs.)
gcc/cp/ChangeLog:
PR c++/94632
* tree.c (cp_tree_equal) <case PARM_DECL>: Ignore
comparing_specializations if the parameters' contexts are identical.
gcc/testsuite/ChangeLog:
PR c++/94632
* g++.dg/template/canon-type-14.C: New test.
When updating an auto return type of an abbreviated function template in
splice_late_return_type, we should also propagate PLACEHOLDER_TYPE_CONSTRAINTS
(and cv-qualifiers) of the original auto node.
gcc/cp/ChangeLog:
PR c++/92187
* pt.c (splice_late_return_type): Propagate cv-qualifiers and
PLACEHOLDER_TYPE_CONSTRAINTS from the original auto node to the new one.
gcc/testsuite/ChangeLog:
PR c++/92187
* g++.dg/concepts/abbrev5.C: New test.
* g++.dg/concepts/abbrev6.C: New test.
Some more C++20 changes from P1614R2, "The Mothership has Landed".
* include/std/chrono (duration, time_point): Define operator<=> and
remove redundant operator!= for C++20.
* testsuite/20_util/duration/comparison_operators/three_way.cc: New
test.
* testsuite/20_util/time_point/comparison_operators/three_way.cc: New
test.
In C++20 the rebind and const_reference members of std::allocator are
gone, so this testsuite utility stopped working, causing
ext/pb_ds/regression/priority_queue_rand_debug.cc to FAIL.
* testsuite/util/native_type/native_priority_queue.hpp: Use
allocator_traits to rebind allocator.
This time instead of having a NOP copy insn that we can completely ignore and
ultimately remove, we have a NOP set within a multi-set PARALLEL. It triggers,
the same failure when the source of such a set is a hard register for the same
reasons as we've already noted in the BZ and patches-to-date.
For prior cases we've been able to mark the insn as a nop set and ignore it for
the rest of cse_insn, ultimately removing it. That's not really an option here
as there are other sets that we have to preserve.
We might be able to fix this instance by splitting the multi-set insn, but I'm
not keen to introduce splitting into cse. Furthermore, the target may not be
able to split the insn. So I considered this is non-starter.
What I finally settled on was to use the existing do_not_record machinery to
ignore the nop set within the parallel (and only that set within the parallel).
One might argue that we should always ignore a REG_UNUSED set. But I rejected
that idea -- we could have cse-able divmod insns where the first had a
REG_UNUSED note for a destination, but the second did not.
One might also argue that we could have a nop set without a REG_UNUSED in a
multi-set parallel and thus we could trigger yet another insert_regs ICE at
some point. I tend to think this is a possibility. If we see this happen,
we'll have to revisit.
PR rtl-optimization/90275
* cse.c (cse_insn): Avoid recording nop sets in multi-set parallels
when the destination has a REG_UNUSED note.
In this PR, we're ICEing on a use of an 'int... a' template parameter pack as
part of the variadic lambda init-capture [...z=a].
The unexpected thing about this variadic init-capture is that it is not
type-dependent, and so the call to do_auto_deduction from
lambda_capture_field_type actually resolves its type to 'int' instead of exiting
early like it does for a type-dependent variadic initializer. This later
confuses add_capture which, according to one of its comments, assumes that
'type' is always 'auto' for a variadic init-capture.
The simplest fix (and the approach that this patch takes) seems to be to avoid
doing auto deduction in lambda_capture_field_type when the initializer uses
parameter packs, so that we always return 'auto' even in the non-type-dependent
case.
gcc/cp/ChangeLog:
PR c++/94483
* lambda.c (lambda_capture_field_type): Avoid doing auto deduction if
the explicit initializer has parameter packs.
gcc/testsuite/ChangeLog:
PR c++/94483
* g++.dg/cpp2a/lambda-pack-init5.C: New test.
In the testcase for this PR, we try to parse the statement
A(value<0>());
first tentatively as a declaration (with a parenthesized declarator), and during
this tentative parse we end up issuing a hard error from
cp_parser_check_template_parameters about its invalidness as a declaration.
Rather than issuing a hard error, it seems we should instead simulate an error
since we're parsing tentatively. This would then allow cp_parser_statement to
recover and successfully parse the statement as an expression-statement instead.
gcc/cp/ChangeLog:
PR c++/88754
* parser.c (cp_parser_check_template_parameters): Before issuing a hard
error, first try simulating an error instead.
gcc/testsuite/ChangeLog:
PR c++/88754
* g++.dg/parse/ambig10.C: New test.
The attached patch fixes an ICE on invalid: When the return type of
a function was misdeclared with a wrong rank, we issued a warning,
but not an error (unless with -pedantic); later on, an ICE ensued.
Nothing good can come from wrongly declaring a function type
(considering the ABI), so I changed that into a hard error.
2020-04-17 Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/94090
* gfortran.dg (gfc_compare_interfaces): Add
optional argument bad_result_characteristics.
* interface.c (gfc_check_result_characteristics): Fix
whitespace.
(gfc_compare_interfaces): Handle new argument; return
true if function return values are wrong.
* resolve.c (resolve_global_procedure): Hard error if
the return value of a function is wrong.
2020-04-17 Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/94090
* gfortran.dg/interface_46.f90: New test.