The following patch adds some improvements for __builtin_mul_overflow
expansion.
One optimization is for the u1 * u2 -> sr case, as documented we normally
do:
u1 * u2 -> sr
res = (S) (u1 * u2)
ovf = res < 0 || main_ovf (true)
where main_ovf (true) stands for jump on unsigned multiplication overflow.
If we know that the most significant bits of both operands are clear (such
as when they are zero extended from something smaller), we can
emit better code by handling it like s1 * s2 -> sr, i.e. just jump on
overflow after signed multiplication.
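As a rough illustration of the u1 * u2 -> sr case (not the testcase from
the patch), the sketch below zero extends two 16-bit values to 32 bits and
multiplies them into a signed 32-bit result. The helper names and the
choice of 16/32-bit widths are assumptions made for the example; only
__builtin_mul_overflow itself comes from the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: u1 * u2 -> sr where both unsigned operands are
   zero extended from 16 bits, so bit 31 of each is known to be clear.  */
static bool
mul_zext16_to_s32 (uint16_t a, uint16_t b, int32_t *res)
{
  assert (res);
  uint32_t u1 = a;	/* zero extension */
  uint32_t u2 = b;
  return __builtin_mul_overflow (u1, u2, res);
}

/* Reference check done in 64 bits: with both MSBs clear the operands are
   also valid non-negative signed values, so overflow simply means the
   product does not fit in int32_t.  */
static bool
mul_zext16_to_s32_ref (uint16_t a, uint16_t b)
{
  int64_t wide = (int64_t) a * (int64_t) b;
  return wide > INT32_MAX;
}
```

Both helpers should agree for every pair of inputs; the second one is only
there to spell out why a signed overflow check is sufficient here.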
The other two cases are s1 * s2 -> ur and s1 * u2 -> ur. If we know that the
minimum precisions needed to encode all values of the two arguments, summed
together, are smaller than or equal to the destination precision (such as when
the two arguments are sign (or zero) extended from half precision types), we
know that overflow happens iff one argument is negative and the other argument
is positive (not zero). Even if both have their maximum possible values, the
product is still representable (e.g. for char * char -> unsigned short
0x7f * 0x7f = 0x3f01 and for char * unsigned char -> unsigned short
0x7f * 0xffU = 0x7e81). As the result is unsigned, all negative results
do overflow, but they are also representable if we consider the result signed:
all of them have the MSB set. So it is more efficient to just do the normal
multiplication in that case and compare the result, considered as a signed
value, against 0; if it is smaller, overflow happened.
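The claim can be checked exhaustively for the 8-bit examples above. The
following is a sketch, not part of the patch: the function names are made
up, it assumes 8-bit char and 16-bit short, and it relies on GCC's
wrapping behaviour for the implementation-defined conversion from
unsigned short to short.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Count operand pairs where the builtin's overflow flag differs from
   the sign bit of the 16-bit result, for s1 * s2 -> ur.  */
static int
mismatches_s8_times_s8 (void)
{
  assert (CHAR_BIT == 8 && sizeof (short) == 2);
  int bad = 0;
  for (int a = SCHAR_MIN; a <= SCHAR_MAX; a++)
    for (int b = SCHAR_MIN; b <= SCHAR_MAX; b++)
      {
	signed char s1 = (signed char) a;
	signed char s2 = (signed char) b;
	unsigned short res;
	bool ovf = __builtin_mul_overflow (s1, s2, &res);
	/* Reinterpret the result as signed and test its sign.  */
	bad += ovf != ((short) res < 0);
      }
  return bad;
}

/* Same check for s1 * u2 -> ur.  */
static int
mismatches_s8_times_u8 (void)
{
  assert (CHAR_BIT == 8 && sizeof (short) == 2);
  int bad = 0;
  for (int a = SCHAR_MIN; a <= SCHAR_MAX; a++)
    for (int b = 0; b <= UCHAR_MAX; b++)
      {
	signed char s1 = (signed char) a;
	unsigned char u2 = (unsigned char) b;
	unsigned short res;
	bool ovf = __builtin_mul_overflow (s1, u2, &res);
	bad += ovf != ((short) res < 0);
      }
  return bad;
}
```

Both functions are expected to return 0, i.e. the sign bit of the plain
product is a complete overflow test for these operand ranges.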
The get_min_precision change improves the char to short handling. In the IL
we have
_2 = (int) arg_1(D);
which is the promotion to int, from the C integer promotions, of a char or
unsigned char arg, and the caller adds a NOP_EXPR cast to short or unsigned
short. get_min_precision punted on the narrowing cast though; it handled only
widening casts. But we can handle narrowing casts fine too, by recursing on
the narrowing cast's operand and using the result only if it ends up with a
smaller minimal precision, which means the sign bits (or zero bits) are
duplicated into both the bits above the narrowing conversion and at least one
bit below it.
2020-10-25 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/95862
* internal-fn.c (get_min_precision): For narrowing conversion, recurse
on the operand and if the operand precision is smaller than the
current one, return that smaller precision.
(expand_mul_overflow): For s1 * u2 -> ur and s1 * s2 -> ur cases
if the sum of minimum precisions of both operands is smaller or equal
to the result precision, just perform normal multiplication and
set overflow to the sign bit of the multiplication result. For
u1 * u2 -> sr if both arguments have the MSB known zero, use
normal s1 * s2 -> sr expansion.
* gcc.dg/builtin-arith-overflow-5.c: New test.
* cfg.c (free_block): New function.
(clear_edges): Rename to ....
(free_cfg): ... this one; also free BBs and vectors.
(expunge_block): Update comment.
* cfg.h (clear_edges): Rename to ...
(free_cfg): ... this one.
* cgraph.c (release_function_body): Use free_cfg.
This makes sure to lower VECTOR_BOOLEAN_TYPE_P typed non-vector
mode VEC_COND_EXPRs so we don't try to use vcond to expand those.
That's required for x86 and gcn integer mode boolean vectors.
2020-11-25 Richard Biener <rguenther@suse.de>
PR middle-end/97579
* gimple-isel.cc (gimple_expand_vec_cond_expr): Lower
VECTOR_BOOLEAN_TYPE_P, non-vector mode VEC_COND_EXPRs.
* gcc.dg/pr97579.c: New testcase.
The testcase had invalid assumptions about which loop iterations would run
first and last.
libgomp/ChangeLog
* testsuite/libgomp.oacc-fortran/atomic_capture-1.f90 (main): Adjust
expected results.
gcc/ada/
* freeze.adb (Is_Uninitialized_Aggregate): Move...
* exp_util.adb (Is_Uninitialized_Aggregate): ... here.
(Expand_Subtype_From_Expr): If the expression is an
uninitialized aggregate, capture subtype for declared object and
remove expression to suppress further superfluous expansion.
gcc/ada/
* sem_eval.adb (Subtypes_Statically_Compatible): Scalar types
with compatible static bounds are statically compatible if
predicates are compatible, even if they are not static subtypes.
Same for private types without discriminants.
gcc/ada/
* exp_ch11.adb (Expand_N_Raise_Statement): Use Is_Entity_Name
consistently in tests on the name of the statement.
* exp_prag.adb (Expand_Pragma_Check): In the local propagation
case, wrap the raise statement in a block statement.
gcc/ada/
* exp_ch8.adb (Expand_N_Exception_Renaming_Declaration): Move
"Nam" constant after the body of a nested subprogram; change "T"
from variable to constant.
gcc/ada/
* doc/gnat_rm/implementation_defined_attributes.rst
(Has_Tagged_Values): Document based on the existing description
of Has_Access_Type and the comment for Has_Tagged_Component,
which is where frontend evaluates this attribute.
* gnat_rm.texi: Regenerate.
* sem_attr.adb (Analyze_Attribute): Merge processing of
Has_Access_Type and Has_Tagged_Component attributes.
* sem_util.adb (Has_Access_Type): Fix casing in comment.
* sem_util.ads (Has_Tagged_Component): Remove wrong (or
outdated) comment about the use of this routine to implement the
equality operator.
gcc/ada/
* sem_ch13.adb (Analyze_One_Aspect): Detect aspect identifiers
with membership tests.
(Check_Aspect_At_End_Of_Declarations): Likewise.
(Freeze_Entity_Checks): Likewise; a local constant is no longer
needed.
(Is_Operational_Item): Similar simplification for attribute
identifiers.
(Is_Type_Related_Rep_Item): Likewise.
(Resolve_Iterable_Operation): Detect names with a membership
test.
(Validate_Independence): Replace repeated Ekind with a
membership test.
gcc/ada/
* exp_ch2.adb (Expand_Entity_Reference): A new local predicate
Is_Object_Renaming_Name indicates whether a given expression
occurs (after looking through qualified expressions and type
conversions) as the name of an object renaming declaration. If
Current_Value is available but this new predicate is True, then
ignore the availability of Current_Value.
gcc/ada/
* doc/gnat_rm/intrinsic_subprograms.rst (Shifts and Rotates):
Document behavior on negative numbers.
* gnat_rm.texi: Regenerate.
* sem_eval.adb (Fold_Shift): Set modulus to be based on the RM
size for non-modular integer types.
gcc/ada/
* exp_util.adb (Remove_Side_Effects): Only remove side-effects
in GNATprove mode when this is useful.
* sem_res.adb (Set_Slice_Subtype): Make sure in GNATprove mode
to define the Itype when needed, so that run-time errors can be
analyzed.
* sem_util.adb (Enclosing_Declaration): Correctly take into
account renaming declarations.
gcc/ada/
* libgnat/g-rannum.ads (Random): New functions returning 128-bit.
* libgnat/g-rannum.adb (Random): Implement them and alphabetize.
(To_Signed): New unchecked conversion function for 128-bit.
gcc/ada/
* exp_util.adb (Attribute_Constrained_Static_Value): Fix body
box.
* sem_attr.adb (Eval_Attribute): Replace repeated calls to
Attribute_Name with a captured value of the Attribute_Id; also,
remove extra parens around Is_Generic_Type.
gcc/ada/
* exp_attr.adb (Expand_N_Attribute_Reference): A variable that
is only incremented in the code has now type Nat; conversion is
now unnecessary.
A while back I submitted GCC10 commit:
44f77a6dea2f312ee1743f3dde465c1b8453ee13
for PR91816.
Turns out I was an idiot and forgot to include the test in the actual git commit.
Tested that the test still passes on a cross arm-none-eabi and also in a
Cortex-A15 bootstrap with no regressions.
gcc/testsuite/ChangeLog:
PR target/91816
* gcc.target/arm/pr91816.c: New test.
libstdc++-v3/ChangeLog:
PR libstdc++/97936
* include/bits/atomic_wait.h (__platform_wait): Check errno,
not just the value of EAGAIN.
(__waiters::__waiters()): Fix name of data member.
The __platform_wait function is supposed to wait until *addr != old.
The futex syscall checks the initial value and returns EAGAIN if *addr
!= old is already true, which should cause __platform_wait to return.
Instead it loops and keeps doing a futex wait, which keeps returning
EAGAIN.
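The intended loop shape can be sketched in C as follows. This is not the
libstdc++ code: the futex syscall is replaced by a hypothetical function
pointer so the example stays portable, all names are invented, and
retrying on EINTR is an assumption of the sketch rather than something
stated above.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the futex wait syscall: returns 0 on a
   wakeup, or -1 with errno set.  */
typedef int (*wait_fn) (const int *addr, int old);

static int stub_calls;

/* Stub modelling a futex wait whose initial check already sees
   *addr != old, so it fails with EAGAIN every time.  */
static int
stub_wait_eagain (const int *addr, int old)
{
  (void) addr;
  (void) old;
  stub_calls++;
  errno = EAGAIN;
  return -1;
}

/* Wait loop that returns on EAGAIN instead of waiting again.  Returns 0
   once the wait is over, or the errno value for any other failure.  */
static int
platform_wait_sketch (const int *addr, int old, wait_fn wait)
{
  assert (addr && wait);
  for (;;)
    {
      if (wait (addr, old) == 0 || errno == EAGAIN)
	return 0;
      if (errno != EINTR)	/* assumption: only EINTR is retried */
	return errno;
    }
}
```

With the stub above the loop performs exactly one wait call and returns,
whereas a loop that ignored EAGAIN would spin forever.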
libstdc++-v3/ChangeLog:
PR libstdc++/97936
* include/bits/atomic_wait.h (__platform_wait): Return if futex
sets EAGAIN.
* testsuite/30_threads/latch/3.cc: Re-enable test.
* testsuite/30_threads/semaphore/try_acquire_until.cc: Likewise.
As mentioned in the PR, we currently ICE on flexible array members in
structs and unions during __builtin_clear_padding processing.
Jason said in the PR he'd prefer an error in these cases over forcefully
handling them as [0] arrays (everything is padding then) or considering the
arrays to have as many whole elements as would fit into the tail padding.
So, this patch implements that.
2020-11-25 Jakub Jelinek <jakub@redhat.com>
PR middle-end/97943
* gimple-fold.c (clear_padding_union, clear_padding_type): Error on and
ignore flexible array member fields. Ignore fields with
error_mark_node type.
* c-c++-common/builtin-clear-padding-2.c: New test.
* c-c++-common/builtin-clear-padding-3.c: New test.
* g++.dg/ext/builtin-clear-padding-1.C: New test.
* gcc.dg/builtin-clear-padding-2.c: New test.
These tests are unstable and causing failures due to timeouts. Disable
them until the cause can be found, so that testing doesn't have to wait
for them to time out.
libstdc++-v3/ChangeLog:
PR libstdc++/97936
PR libstdc++/97944
* testsuite/29_atomics/atomic_integral/wait_notify.cc: Disable.
Do not require pthreads, but add -pthread when appropriate.
* testsuite/30_threads/jthread/95989.cc: Likewise.
* testsuite/30_threads/latch/3.cc: Likewise.
* testsuite/30_threads/semaphore/try_acquire_until.cc: Likewise.
split_nonconstant_init_1 was confused by a CONSTRUCTOR with non-aggregate
type, which (with COMPOUND_LITERAL_P set) we use in a template to represent
a braced functional cast. It seems to me that there's no good reason to do
split_nonconstant_init at all in a template.
gcc/cp/ChangeLog:
PR c++/97899
* typeck2.c (store_init_value): Don't split_nonconstant_init in a
template.
gcc/testsuite/ChangeLog:
PR c++/97899
* g++.dg/cpp0x/initlist-template3.C: New test.
This reverts commit c4fa3728ab4f78984a549894e0e8c4d6a253e540,
which caused a regression in the default for flag_excess_precision.
2020-11-24 Ulrich Weigand <uweigand@de.ibm.com>
gcc/
PR tree-optimization/97970
* doc/invoke.texi (-ffast-math): Revert last change.
* opts.c: Revert last change.
gcc/
2020-11-24 Vladimir Makarov <vmakarov@redhat.com>
PR bootstrap/97933
* lra.c (lra_process_new_insns): Stop on the first real insn after
head of e->dest.
arm_split_atomic_op handles subtracting a constant by converting it
into addition of the negated constant. But if the type of the operand
is int and the constant is INT_MIN we currently end up generating invalid
RTL which can lead to an abort later on.
The problem is that in a HOST_WIDE_INT, INT_MIN is represented as
0xffffffff80000000 and the negation of this is 0x0000000080000000, but
that's not a valid constant for use in SImode operations.
The fix is straightforward: use gen_int_mode rather than
simply GEN_INT. This knows how to correctly sign-extend the negated
constant when this is needed.
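The arithmetic can be reproduced outside the compiler. The sketch below
models a HOST_WIDE_INT with int64_t and uses a hypothetical helper,
sext_for_width, as a stand-in for the sign extension that gen_int_mode
performs for a 32-bit mode; it is not GCC's implementation, and it relies
on GCC's wrapping behaviour when converting an out-of-range uint64_t to
int64_t.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: keep the low BITS bits of VALUE and sign-extend
   from bit BITS - 1.  */
static int64_t
sext_for_width (int64_t value, unsigned bits)
{
  assert (bits >= 1 && bits <= 64);
  if (bits == 64)
    return value;
  uint64_t mask = (UINT64_C (1) << bits) - 1;
  uint64_t low = (uint64_t) value & mask;
  uint64_t sign = UINT64_C (1) << (bits - 1);
  /* (low ^ sign) - sign sign-extends; unsigned math avoids overflow.  */
  return (int64_t) ((low ^ sign) - sign);
}

/* Plain negation in the host-wide type, as with GEN_INT (-c).  */
static int64_t
negate_host_wide (int64_t c)
{
  return (int64_t) (0 - (uint64_t) c);
}

/* Negation followed by canonicalization for a BITS-wide mode.  */
static int64_t
negate_for_width (int64_t c, unsigned bits)
{
  return sext_for_width (negate_host_wide (c), bits);
}
```

For INT32_MIN, negate_host_wide yields 0x0000000080000000, which is not
sign-extended from bit 31, while negate_for_width (c, 32) yields
0xffffffff80000000 again, the canonical form for a 32-bit constant.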
gcc/
PR target/97534
* config/arm/arm.c (arm_split_atomic_op): Use gen_int_mode when
negating a const_int.
gcc/testsuite
* gcc.dg/pr97534.c: New test.
Deferred macros are needed for C++ modules. Header units may export
macro definitions and undefinitions. These are resolved lazily at the
point of (potential) use. (The language specifies that; it's not just
a useful optimization.) Thus, identifier nodes grow a 'deferred'
field, which fortunately doesn't expand the structure on 64-bit
systems as there was padding there. This is non-zero on NT_MACRO
nodes, if the macro is deferred. When such an identifier is lexed, it
is resolved via a callback that I added recently. That will either
provide the macro definition, or discover that there was an overriding
undef. Either way the identifier is no longer a deferred macro.
Notice it is now possible for NT_MACRO nodes to have a NULL macro
expansion.
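A toy model of the lazy resolution might look like the following. None
of these types or functions are libcpp's; struct ident_node, resolve_fn
and lex_identifier are hypothetical names used only to illustrate the
"resolve on first use, then clear the deferred marker" idea described
above.

```c
#include <assert.h>
#include <stddef.h>

struct macro_def
{
  const char *expansion;
};

struct ident_node
{
  int is_macro;			  /* stands in for the NT_MACRO node type */
  unsigned deferred;		  /* non-zero while the macro is deferred */
  const struct macro_def *macro;  /* may be NULL after resolution */
};

/* Hypothetical callback: returns the definition, or NULL if an
   overriding undef was found.  */
typedef const struct macro_def *(*resolve_fn) (unsigned deferred);

/* Resolve NODE the first time it is lexed; later calls see the cached
   result because the deferred marker has been cleared.  */
static const struct macro_def *
lex_identifier (struct ident_node *node, resolve_fn resolve)
{
  assert (node && resolve);
  if (node->is_macro && node->deferred)
    {
      node->macro = resolve (node->deferred);
      node->deferred = 0;
    }
  return node->is_macro ? node->macro : NULL;
}

static const struct macro_def answer = { "42" };

static const struct macro_def *
resolve_defined (unsigned deferred)
{
  (void) deferred;
  return &answer;
}

static const struct macro_def *
resolve_undefined (unsigned deferred)
{
  (void) deferred;
  return NULL;
}
```

In this model a node resolved through resolve_undefined ends up as a
macro node with a NULL definition, mirroring the note above.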
libcpp/
* include/cpplib.h (struct cpp_hashnode): Add deferred field.
(cpp_set_deferred_macro): Define.
(cpp_get_deferred_macro): Declare.
(cpp_macro_definition): Reformat, add overload.
(cpp_macro_definition_location): Deal with deferred macro.
(cpp_alloc_token_string, cpp_compare_macro): Declare.
* internal.h (_cpp_notify_macro_use): Return bool.
(_cpp_maybe_notify_macro_use): Likewise.
* directives.c (do_undef): Check macro is not undef before
warning.
(do_ifdef, do_ifndef): Deal with deferred macro.
* expr.c (parse_defined): Likewise.
* lex.c (cpp_allocate_token_string): Break out of ...
(create_literal): ... here. Call it.
(cpp_maybe_module_directive): Deal with deferred macro.
* macro.c (cpp_get_token_1): Deal with deferred macro.
(warn_of_redefinition): Deal with deferred macro.
(compare_macros): Rename to ...
(cpp_compare_macro): ... here. Make extern.
(cpp_get_deferred_macro): New.
(_cpp_notify_macro_use): Deal with deferred macro, return bool
indicating definedness.
(cpp_macro_definition): Deal with deferred macro.
Various aapcs64 tests were failing at -O1 and above because
the assignments to testfunc_ptr were being deleted as dead.
That in turn happened because FUNC_VAL_CHECK hid the tail call
to myfunc using an LR asm trick:
asm volatile ("mov %0, x30" : "=r" (saved_return_address));
asm volatile ("mov x30, %0" : : "r" ((unsigned long long) myfunc));
and so the compiler couldn't see any calls that might read
testfunc_ptr.
That in itself could be fixed by adding a memory clobber to the
second asm above, forcing the compiler to keep both the testfunc_ptr
and the saved_return_address assignments. But since this is an ABI
test, it seems better to make sure that we don't do any IPA at all.
The fact that doing IPA caused a problem was kind-of helpful and
so it might be better to avoid making the test “work” in the
presence of IPA.
The patch therefore just replaces “noinline” with “noipa”.
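For reference, a minimal sketch of the attribute placement, assuming
GCC 8 or later (where noipa is available). This is not the abitest.h
macro: record, dispatch and run are invented names, and only
testfunc_ptr is borrowed from the description above.

```c
#include <assert.h>

static void (*testfunc_ptr) (int);
static int seen;

static void
record (int v)
{
  seen = v;
}

/* noipa disables inlining, cloning and all other interprocedural
   analysis of this function, so callers must treat it as opaque.  */
__attribute__ ((noipa)) static void
dispatch (int v)
{
  assert (testfunc_ptr != 0);
  testfunc_ptr (v);
}

static int
run (int v)
{
  testfunc_ptr = record;	/* this store must survive optimization */
  dispatch (v);
  return seen;
}
```

With noinline alone the compiler may still use what it learns about the
callee's body; noipa removes that knowledge as well.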
gcc/testsuite/
* gcc.target/aarch64/aapcs64/abitest.h (FUNC_VAL_CHECK): Use
noipa rather than noinline.
* gcc.target/aarch64/aapcs64/abitest-2.h (FUNC_VAL_CHECK): Likewise.