174785 Commits

Author SHA1 Message Date
Dennis Zhang
40f6483780 aarch64: ACLE I8MM multiply-accumulate intrinsics
This patch adds intrinsics for 8-bit integer matrix multiply-accumulate
operations including vmmlaq_s32, vmmlaq_u32, and vusmmlaq_s32.

gcc/ChangeLog:

2020-02-07  Dennis Zhang  <dennis.zhang@arm.com>

	* config/aarch64/aarch64-simd-builtins.def (simd_smmla): New entry.
	(simd_ummla, simd_usmmla): Likewise.
	* config/aarch64/aarch64-simd.md (aarch64_simd_<sur>mmlav16qi): New.
	* config/aarch64/arm_neon.h (vmmlaq_s32, vmmlaq_u32): New.
	(vusmmlaq_s32): New.

gcc/testsuite/ChangeLog:

2020-02-07  Dennis Zhang  <dennis.zhang@arm.com>

	* gcc.target/aarch64/simd/vmmla.c: New test.
2020-02-07 15:04:23 +00:00
Patrick Palka
b7903d9f5b libstdc++: Add [range.istream]
This patch adds ranges::basic_istream_view and ranges::istream_view.  This seems
to be the last missing part of the ranges header.

libstdc++-v3/ChangeLog:

	* include/std/ranges (ranges::__detail::__stream_extractable,
	ranges::basic_istream_view, ranges::istream_view): Define.
	* testsuite/std/ranges/istream_view: New test.
2020-02-07 09:44:53 -05:00
Patrick Palka
55d4cbcba8 Fix ChangeLog for previous commit 2020-02-07 09:31:17 -05:00
Patrick Palka
cba9ef069e libstdc++: Implement C++20 range adaptors
This patch implements [range.adaptors].  It also includes the changes from P3280
and P3278 and P3323, without which many standard examples won't work.

The implementation is mostly dictated by the spec and there was not much room
for implementation discretion.  The most interesting part that was not specified
by the spec is the design of the range adaptors and range adaptor closures,
which I tried to design in a way that minimizes boilerplate and statefulness (so
that e.g. the composition of two stateless closures is stateless).

What is left unimplemented is caching of calls to begin() in filter_view,
drop_view and reverse_view, which is required to guarantee that begin() has
amortized constant time complexity.  I can implement this in a subsequent patch.

"Interesting" parts of the patch are marked with XXX comments.

libstdc++-v3/ChangeLog:

	Implement C++20 range adaptors
	* include/std/ranges: Include <bits/refwrap.h> and <tuple>.
	(subrange::_S_store_size): Mark as const instead of constexpr to
	avoid what seems to be a bug in GCC.
	(__detail::__box): Give it defaulted copy and move constructors.
	(views::_Single::operator()): Mark constexpr.
	(views::_Iota::operator()): Mark constexpr.
	(__detail::Empty): Define.
	(views::_RangeAdaptor, views::_RangeAdaptorClosure, ref_view, all_view,
	views::all, filter_view, views::filter, transform_view,
	views::transform, take_view, views::take, take_while_view,
	views::take_while, drop_view, views::drop, join_view, views::join,
	__detail::require_constant, __detail::tiny_range, split_view,
	views::split, views::_Counted, views::counted, common_view,
	views::common, reverse_view, views::reverse,
	views::__detail::__is_reversible_subrange,
	views::__detail::__is_reverse_view, reverse_view, views::reverse,
	__detail::__has_tuple_element, elements_view, views::elements,
	views::keys, views::values): Define.
	* testsuite/std/ranges/adaptors/all.cc: New test.
	* testsuite/std/ranges/adaptors/common.cc: Likewise.
	* testsuite/std/ranges/adaptors/counted.cc: Likewise.
	* testsuite/std/ranges/adaptors/drop.cc: Likewise.
	* testsuite/std/ranges/adaptors/drop_while.cc: Likewise.
	* testsuite/std/ranges/adaptors/elements.cc: Likewise.
	* testsuite/std/ranges/adaptors/filter.cc: Likewise.
	* testsuite/std/ranges/adaptors/join.cc: Likewise.
	* testsuite/std/ranges/adaptors/reverse.cc: Likewise.
	* testsuite/std/ranges/adaptors/split.cc: Likewise.
	* testsuite/std/ranges/adaptors/take.cc: Likewise.
	* testsuite/std/ranges/adaptors/take_while.cc: Likewise.
	* testsuite/std/ranges/adaptors/transform.cc: Likewise.
2020-02-07 09:24:43 -05:00
Jonathan Wakely
0d57370c9c libstdc++: Optimize C++20 comparison category types
This reduces the size and alignment of all three comparison category
types to a single byte. The partial_ordering::_M_is_ordered flag is
replaced by the value 0x02 in the _M_value member.

This also optimizes conversion and comparison operators to avoid
conditional branches where possible, by comparing _M_value to constants
or using bitwise operations to correctly handle the unordered state.

	* libsupc++/compare (__cmp_cat::type): Define typedef for underlying
	type of enumerations and comparison category types.
	(__cmp_cat::_Ord, __cmp_cat::_Ncmp): Add underlying type.
	(__cmp_cat::_Ncmp::unordered): Change value to 2.
	(partial_ordering::_M_value, weak_ordering::_M_value)
	(strong_ordering::_M_value): Change type to __cmp_cat::type.
	(partial_ordering::_M_is_ordered): Remove data member.
	(partial_ordering): Use second bit of _M_value for unordered. Adjust
	comparison operators.
	(weak_ordering::operator partial_ordering): Simplify to remove
	branches.
	(operator<=>(unspecified, weak_ordering)): Likewise.
	(strong_ordering::operator partial_ordering): Likewise.
	(strong_ordering::operator weak_ordering): Likewise.
	(operator<=>(unspecified, strong_ordering)): Likewise.
	* testsuite/18_support/comparisons/categories/partialord.cc: New test.
	* testsuite/18_support/comparisons/categories/strongord.cc: New test.
	* testsuite/18_support/comparisons/categories/weakord.cc: New test.
2020-02-07 14:09:03 +00:00
Jason Merrill
82aee6dd61 c++: Fix ICE on nonsense requires-clause.
Here we were swallowing all the syntax errors by parsing tentatively, and
returning error_mark_node without ever actually giving an error.  Fixed by
using save_tokens/rollback_tokens instead.

	PR c++/92517
	* parser.c (cp_parser_constraint_primary_expression): Do the main
	parse non-tentatively.
2020-02-07 08:51:12 -05:00
Richard Biener
3c7a03bc36 middle-end/93519 - avoid folding stmts in obviously unreachable code
The inliner folds stmts delayed, the following arranges things so
to not fold stmts that are obviously not reachable to avoid warnings
from those code regions.

2020-02-07  Richard Biener  <rguenther@suse.de>

	PR middle-end/93519
	* tree-inline.c (fold_marked_statements): Do a PRE walk,
	skipping unreachable regions.
	(optimize_inline_calls): Skip folding stmts when we didn't
	inline.

	* gcc.dg/Wrestrict-21.c: New testcase.
2020-02-07 12:57:18 +01:00
Jonathan Wakely
5713834e4b libstdc++: Enable three-way comparison for iota_view iterators
The declaration of operator<=> was disabled due to a typo in the macro.
The declaration was also ill-formed when three_way_comparable<_Winc> is
not satisfied, which is a defect in the C++20 draft.

	* include/std/ranges (iota_view::_Iterator): Fix typo in name of
	__cpp_lib_three_way_comparison macro and use deduced return type for
	operator<=>.
	* testsuite/std/ranges/iota/iterator.cc: New test.
2020-02-07 11:39:12 +00:00
H.J. Lu
ea5ca698dc x86-64: Pass aggregates with only float/double in GPRs for MS_ABI
MS_ABI requires passing aggregates with only float/double in integer
registers as shown in the output from MSVC v19.10 at:

https://godbolt.org/z/2NPygd

This patch fixed:

FAIL: libffi.bhaible/test-callback.c -W -Wall -Wno-psabi -DDGTEST=54 -Wno-unused-variable -Wno-unused-parameter -Wno-unused-but-set-variable -Wno-uninitialized -O0 -DABI_NUM=FFI_GNUW64 -DABI_ATTR=MSABI execution test
FAIL: libffi.bhaible/test-callback.c -W -Wall -Wno-psabi -DDGTEST=54 -Wno-unused-variable -Wno-unused-parameter -Wno-unused-but-set-variable -Wno-uninitialized -O2 -DABI_NUM=FFI_GNUW64 -DABI_ATTR=MSABI execution test
FAIL: libffi.bhaible/test-callback.c -W -Wall -Wno-psabi -DDGTEST=55 -Wno-unused-variable -Wno-unused-parameter -Wno-unused-but-set-variable -Wno-uninitialized -O0 -DABI_NUM=FFI_GNUW64 -DABI_ATTR=MSABI execution test
FAIL: libffi.bhaible/test-callback.c -W -Wall -Wno-psabi -DDGTEST=55 -Wno-unused-variable -Wno-unused-parameter -Wno-unused-but-set-variable -Wno-uninitialized -O2 -DABI_NUM=FFI_GNUW64 -DABI_ATTR=MSABI execution test
FAIL: libffi.bhaible/test-callback.c -W -Wall -Wno-psabi -DDGTEST=56 -Wno-unused-variable -Wno-unused-parameter -Wno-unused-but-set-variable -Wno-uninitialized -O0 -DABI_NUM=FFI_GNUW64 -DABI_ATTR=MSABI execution test
FAIL: libffi.bhaible/test-callback.c -W -Wall -Wno-psabi -DDGTEST=56 -Wno-unused-variable -Wno-unused-parameter -Wno-unused-but-set-variable -Wno-uninitialized -O2 -DABI_NUM=FFI_GNUW64 -DABI_ATTR=MSABI execution test

in libffi testsuite.

gcc/

	PR target/85667
	* config/i386/i386.c (function_arg_ms_64): Add a type argument.
	Don't return aggregates with only SFmode and DFmode in SSE
	register.
	(ix86_function_arg): Pass arg.type to function_arg_ms_64.

gcc/testsuite/

	PR target/85667
	* gcc.target/i386/pr85667-10.c: New test.
	* gcc.target/i386/pr85667-7.c: Likewise.
	* gcc.target/i386/pr85667-8.c: Likewise.
	* gcc.target/i386/pr85667-9.c: Likewise.
2020-02-07 03:37:58 -08:00
Jakub Jelinek
c006911de9 powerpc: Fix -fstack-clash-protection -mprefixed-addr ICE [PR93122]
As mentioned in the PR, the following testcase ICEs because rs, while valid
add_operand is not valid add_cint_operand and so gen_add3_insn fails,
because it doesn't meet the expander predicates.

Here is what I meant as the alternative, i.e. don't check any predicates,
just gen_add3_insn, if that fails, force rs into register and retry.
And, add REG_FRAME_RELATED_EXPR note always when we haven't emitted a single
insn that has rtl exactly matching what we'd add the REG_FRAME_RELATED_EXPR
with (in that case, dwarf2cfi.c is able to figure it out by itself, no need
to waste compile time memory).

2020-02-07  Jakub Jelinek  <jakub@redhat.com>

	PR target/93122
	* config/rs6000/rs6000-logue.c
	(rs6000_emit_probe_stack_range_stack_clash): Always use gen_add3_insn,
	if it fails, move rs into end_addr and retry.  Add
	REG_FRAME_RELATED_EXPR note whenever it returns more than one insn or
	the insn pattern doesn't describe well what exactly happens to
	dwarf2cfi.c.

	* gcc.target/powerpc/pr93122.c: New test.
2020-02-07 11:12:22 +01:00
Paolo Carlini
c58e6cc32c Add testcase of PR c++/89404, already fixed in trunk.
PR c++/89404
	* g++.dg/ext/vla21.C: New.
2020-02-07 11:05:30 +01:00
Jakub Jelinek
811a475ea3 arm: Fix up arm installed unwind.h for use in pedantic modes [PR93615]
As the following testcase shows, unwind.h on ARM can't be (starting with GCC
10) compiled with -std=c* modes, only -std=gnu* modes.
The problem is it uses asm keyword, which isn't a keyword in those modes
(system headers vs. non-system ones don't make a difference here).
glibc and other installed headers use __asm or __asm__ keywords instead that
work fine in both standard and gnu modes.

While there, as it is an installed header, I think it is also wrong to
completely ignore any identifier namespace rules.
The generic unwind.h defines just _Unwind* namespace identifiers plus
_sleb128_t/_uleb128_t (but e.g. unlike libstdc++/glibc headers doesn't
uglify operand names), the ARM unwind.h is much worse here.  I've just
changed the gnu_Unwind_Find_got function at least not be in user identifier
namespace, but perhaps it would be good to go further and rename e.g.
or e.g.
  typedef _Unwind_Reason_Code (*personality_routine) (_Unwind_State,
      _Unwind_Control_Block *, _Unwind_Context *);
in unwind-arm-common.h.

2020-02-07  Jakub Jelinek  <jakub@redhat.com>

	PR target/93615
	* config/arm/unwind-arm.h (gnu_Unwind_Find_got): Rename to ...
	(_Unwind_gnu_Find_got): ... this.  Use __asm instead of asm.  Remove
	trailing :s in asm.  Formatting fixes.
	(_Unwind_decode_typeinfo_ptr): Adjust caller.

	* gcc.dg/pr93615.c: New test.
2020-02-07 11:01:14 +01:00
Jakub Jelinek
f82617f229 i386: Better patch to improve avx* vector concatenation [PR93594]
After thinking some more on this, we can do better; rather than having to
add a new prereload splitter pattern to catch all other cases where it might
be beneficial to fold first part of an UNSPEC_CAST back to the unspec
operand, this patch reverts the *.md changes I've made yesterday and instead
tweaks the patterns, so that simplify-rtx.c can optimize those on its own.
Instead of the whole SET_SRC being an UNSPEC through which simplify-rtx.c
obviously can't optimize anything, this represents those patterns through a
VEC_CONCAT (or two nested ones for the 128-bit -> 512-bit casts) with the
operand as the low part of it and UNSPEC representing just the high part of
it (the undefined, to be ignored, bits).  While richi suggested using
already in GIMPLE for those using a SSA_NAME default definition (i.e.
clearly uninitialized use), I'd say that uninit pass would warn about those,
but more importantly, in RTL it would probably force zero initialization of
that or use or an uninitialized pseudo, all of which is hard to match in an
pattern, so I think an UNSPEC is better for that.

2020-02-07  Jakub Jelinek  <jakub@redhat.com>

	PR target/93594
	* config/i386/predicates.md (avx_identity_operand): Remove.
	* config/i386/sse.md (*avx_vec_concat<mode>_1): Remove.
	(avx_<castmode><avxsizesuffix>_<castmode>,
	avx512f_<castmode><avxsizesuffix>_256<castmode>): Change patterns to
	a VEC_CONCAT of the operand and UNSPEC_CAST.
	(avx512f_<castmode><avxsizesuffix>_<castmode>): Change pattern to
	a VEC_CONCAT of VEC_CONCAT of the operand and UNSPEC_CAST with
	UNSPEC_CAST.
2020-02-07 09:28:39 +01:00
Jakub Jelinek
e7bec5d5ed i386: Fix splitters that call extract_insn_cached [PR93611]
The following testcase ICEs.  The generated split_insns starts
with recog_data.insn = NULL and then tries to put various operands into
recog_data.operand array and checks various splitter conditions.
The problem is that some atom related tuning splitters indirectly call
extract_insn_cached on the insn they are used in.  This can change
recog_data.operand, but most likely it will just keep it as is, but
sets recog_data.insn to the current instruction.  If that splitter doesn't
match, we continue trying some other split conditions and modify
recog_data.operand array again.  If even that doesn't find any usable
splitter, we punt, but at that point recog_data.insn says that recog_data
is valid for that particular instruction, even when recog_data.operand array
can be anything.
The safest thing would be to copy whole recog_data to a temporary object
before doing the calls that can call extract_insn_cached and restore it
afterwards, but it would be also very costly, recog_data has 1280 bytes.
So, this patch just makes sure to clear recog_data.insn if it has changed
during the extract_insn_cached call, which means if we extract_insn_cached
later, we'll extract it properly, while if we call it say from some other
context than splitter conditions, the insn is already cached, we don't reset
the cache.

2020-02-07  Jakub Jelinek  <jakub@redhat.com>

	PR target/93611
	* config/i386/i386.c (ix86_lea_outperforms): Make sure to clear
	recog_data.insn if distance_non_agu_define changed it.

	* gcc.target/i386/pr93611.c: New test.
2020-02-07 09:26:54 +01:00
Patrick Palka
bc4646410a libstdc++: Implement C++20 constrained algorithms
This patch implements the C++20 ranges overloads for the algorithms in
[algorithms].  Most of the algorithms were reimplemented, with each of their
implementations very closely following the existing implementation in
bits/stl_algo.h and bits/stl_algobase.h.  The reason for reimplementing most of
the algorithms instead of forwarding to their STL-style overload is because
forwarding cannot be conformantly and efficiently performed for algorithms that
operate on non-random-access iterators.  But algorithms that operate on random
access iterators can safely and efficiently be forwarded to the STL-style
implementation, and this patch does so for push_heap, pop_heap, make_heap,
sort_heap, sort, stable_sort, nth_element, inplace_merge and stable_partition.

What's missing from this patch is debug-iterator and container specializations
that are present for some of the STL-style algorithms that need to be ported
over to the ranges algos.  I marked them missing at TODO comments.  There are
also some other minor outstanding TODOs.

The code that could use the most thorough review is ranges::__copy_or_move,
ranges::__copy_or_move_backward, ranges::__equal and
ranges::__lexicographical_compare.  In the tests, I tried to test the interface
of each new overload, as well as the correctness of the new implementation.

libstdc++-v3/ChangeLog:

	Implement C++20 constrained algorithms
	* include/Makefile.am: Add new header.
	* include/Makefile.in: Regenerate.
	* include/std/algorithm: Include <bits/ranges_algo.h>.
	* include/bits/ranges_algo.h: New file.
	* testsuite/25_algorithms/adjacent_find/constrained.cc: New test.
	* testsuite/25_algorithms/all_of/constrained.cc: New test.
	* testsuite/25_algorithms/any_of/constrained.cc: New test.
	* testsuite/25_algorithms/binary_search/constrained.cc: New test.
	* testsuite/25_algorithms/copy/constrained.cc: New test.
	* testsuite/25_algorithms/copy_backward/constrained.cc: New test.
	* testsuite/25_algorithms/copy_if/constrained.cc: New test.
	* testsuite/25_algorithms/copy_n/constrained.cc: New test.
	* testsuite/25_algorithms/count/constrained.cc: New test.
	* testsuite/25_algorithms/count_if/constrained.cc: New test.
	* testsuite/25_algorithms/equal/constrained.cc: New test.
	* testsuite/25_algorithms/equal_range/constrained.cc: New test.
	* testsuite/25_algorithms/fill/constrained.cc: New test.
	* testsuite/25_algorithms/fill_n/constrained.cc: New test.
	* testsuite/25_algorithms/find/constrained.cc: New test.
	* testsuite/25_algorithms/find_end/constrained.cc: New test.
	* testsuite/25_algorithms/find_first_of/constrained.cc: New test.
	* testsuite/25_algorithms/find_if/constrained.cc: New test.
	* testsuite/25_algorithms/find_if_not/constrained.cc: New test.
	* testsuite/25_algorithms/for_each/constrained.cc: New test.
	* testsuite/25_algorithms/generate/constrained.cc: New test.
	* testsuite/25_algorithms/generate_n/constrained.cc: New test.
	* testsuite/25_algorithms/heap/constrained.cc: New test.
	* testsuite/25_algorithms/includes/constrained.cc: New test.
	* testsuite/25_algorithms/inplace_merge/constrained.cc: New test.
	* testsuite/25_algorithms/is_partitioned/constrained.cc: New test.
	* testsuite/25_algorithms/is_permutation/constrained.cc: New test.
	* testsuite/25_algorithms/is_sorted/constrained.cc: New test.
	* testsuite/25_algorithms/is_sorted_until/constrained.cc: New test.
	* testsuite/25_algorithms/lexicographical_compare/constrained.cc: New
	test.
	* testsuite/25_algorithms/lower_bound/constrained.cc: New test.
	* testsuite/25_algorithms/max/constrained.cc: New test.
	* testsuite/25_algorithms/max_element/constrained.cc: New test.
	* testsuite/25_algorithms/merge/constrained.cc: New test.
	* testsuite/25_algorithms/min/constrained.cc: New test.
	* testsuite/25_algorithms/min_element/constrained.cc: New test.
	* testsuite/25_algorithms/minmax/constrained.cc: New test.
	* testsuite/25_algorithms/minmax_element/constrained.cc: New test.
	* testsuite/25_algorithms/mismatch/constrained.cc: New test.
	* testsuite/25_algorithms/move/constrained.cc: New test.
	* testsuite/25_algorithms/move_backward/constrained.cc: New test.
	* testsuite/25_algorithms/next_permutation/constrained.cc: New test.
	* testsuite/25_algorithms/none_of/constrained.cc: New test.
	* testsuite/25_algorithms/nth_element/constrained.cc: New test.
	* testsuite/25_algorithms/partial_sort/constrained.cc: New test.
	* testsuite/25_algorithms/partial_sort_copy/constrained.cc: New test.
	* testsuite/25_algorithms/partition/constrained.cc: New test.
	* testsuite/25_algorithms/partition_copy/constrained.cc: New test.
	* testsuite/25_algorithms/partition_point/constrained.cc: New test.
	* testsuite/25_algorithms/prev_permutation/constrained.cc: New test.
	* testsuite/25_algorithms/remove/constrained.cc: New test.
	* testsuite/25_algorithms/remove_copy/constrained.cc: New test.
	* testsuite/25_algorithms/remove_copy_if/constrained.cc: New test.
	* testsuite/25_algorithms/remove_if/constrained.cc: New test.
	* testsuite/25_algorithms/replace/constrained.cc: New test.
	* testsuite/25_algorithms/replace_copy/constrained.cc: New test.
	* testsuite/25_algorithms/replace_copy_if/constrained.cc: New test.
	* testsuite/25_algorithms/replace_if/constrained.cc: New test.
	* testsuite/25_algorithms/reverse/constrained.cc: New test.
	* testsuite/25_algorithms/reverse_copy/constrained.cc: New test.
	* testsuite/25_algorithms/rotate/constrained.cc: New test.
	* testsuite/25_algorithms/rotate_copy/constrained.cc: New test.
	* testsuite/25_algorithms/search/constrained.cc: New test.
	* testsuite/25_algorithms/search_n/constrained.cc: New test.
	* testsuite/25_algorithms/set_difference/constrained.cc: New test.
	* testsuite/25_algorithms/set_intersection/constrained.cc: New test.
	* testsuite/25_algorithms/set_symmetric_difference/constrained.cc: New
	test.
	* testsuite/25_algorithms/set_union/constrained.cc: New test.
	* testsuite/25_algorithms/shuffle/constrained.cc: New test.
	* testsuite/25_algorithms/sort/constrained.cc: New test.
	* testsuite/25_algorithms/stable_partition/constrained.cc: New test.
	* testsuite/25_algorithms/stable_sort/constrained.cc: New test.
	* testsuite/25_algorithms/swap_ranges/constrained.cc: New test.
	* testsuite/25_algorithms/transform/constrained.cc: New test.
	* testsuite/25_algorithms/unique/constrained.cc: New test.
	* testsuite/25_algorithms/unique_copy/constrained.cc: New test.
	* testsuite/25_algorithms/upper_bound/constrained.cc: New test.
2020-02-06 20:08:34 -05:00
David Malcolm
13f5b93e64 analyzer: fix reproducer for PR 93375
Reproducing the ICE in PR analyzer/93375 required some kind of
analyzer diagnostic occurring after a call with fewer arguments
than required by the callee.

The testcase used __builtin_memcpy with a NULL argument for this.

On x86_64-pc-linux-gnu this happened to be already optimized into:
  _4 = MEM <unsigned int> [(char * {ref-all})0B];
  MEM <unsigned int> [(char * {ref-all})rl_1] = _4;
by the time of the analyzer pass, leading to the diagnostic in question
being:
  warning: dereference of NULL ‘rl’ [CWE-690] [-Wanalyzer-null-dereference]

On other targets e.g. arm-unknown-linux-gnueabi, the builtin isn't
optimized at the time of the analyzer pass, leading to this diagnostic
instead:
  warning: use of NULL ‘rl’ where non-null expected [CWE-690] [-Wanalyzer-null-argument]
  <built-in>: note: argument 1 of ‘__builtin_memcpy’ must be non-null

This patch fixes the test case by using a custom function marked as
nonnull.  I manually verified that it still reproduces the ICE if the
patch for the PR is reverted.

gcc/testsuite/ChangeLog:
	PR analyzer/93375
	* gcc.dg/analyzer/pr93375.c: Rework test case to avoid per-target
	differences in how __builtin_memcpy has been optimized at the time
	the analyzer runs.
2020-02-06 19:37:34 -05:00
GCC Administrator
e032e7a9ab Daily bump. 2020-02-07 00:16:34 +00:00
Michael Meissner
a66219dce7 Fix PR 93569.
2020-02-06  Michael Meissner  <meissner@linux.ibm.com>

	PR target/93569
	* config/rs6000/rs6000.c (reg_to_non_prefixed): Before ISA 3.0
	we only had X-FORM (reg+reg) addressing for vectors.  Also before
	ISA 3.0, we only had X-FORM addressing for scalars in the
	traditional Altivec registers.
2020-02-06 18:39:48 -05:00
Vladimir N. Makarov
d26f37a16e PR93561 -- [bounds checking] memory overflow for spill_for
2020-02-06  <zhongyunde@huawei.com>
  	    Vladimir Makarov  <vmakarov@redhat.com>

	PR rtl-optimization/93561
	* lra-assigns.c (spill_for): Check that tested hard regno is not out of
	hard register range.
2020-02-06 17:10:58 -05:00
David Malcolm
cb273d81a4 analyzer: round-trip pointer-equality through intptr_t
When investigating how the analyzer handles malloc/free of Cray pointers
in gfortran I noticed that that analyzer was losing information on
pointers that were cast to an integer type, and then back to a pointer
type again.

The root cause is that region_model::maybe_cast_1 was only preserving
the region_svalue-ness of the result if both types were pointers,
instead returning an unknown_svalue for a pointer-to-int cast.

This patch updates the above code so that it attempts to use a
region_svalue if *either* type is a pointer

Doing so allows the analyzer to recognize that the same underlying
region is in use through various casts through integer types.

gcc/analyzer/ChangeLog:
	* region-model.cc (region_model::maybe_cast_1): Attempt to provide
	a region_svalue if either type is a pointer, rather than if both
	types are pointers.

gcc/testsuite/ChangeLog:
	* gcc.dg/analyzer/torture/intptr_t.c: New test.
2020-02-06 14:44:23 -05:00
Richard Sandiford
1ccdd460d1 aarch64: Add a type attribute to aarch64_movk<mode>
Kyrill pointed out off-list that this new pattern was missing
a type attribute.

2020-02-06  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* config/aarch64/aarch64.md (aarch64_movk<mode>): Add a type
	attribute.
2020-02-06 18:09:44 +00:00
Segher Boessenkool
72b2f3317b rs6000: Use rldimi for 64-bit constants with high=low (PR93012)
We currently use an (up to) five instruction sequence to generate such
constants.  After this change we just generate a 32-bit constant and do
a rotate-and-mask-insert instruction, making the sequence only up to
three instructions.

	* config/rs6000/rs6000.c (rs6000_emit_set_long_const): Handle the case
	where the low and the high 32 bits are equal to each other specially,
	with an rldimi instruction.

gcc/testsuite/
	* gcc.target/powerpc/pr93012.c: New.
2020-02-06 18:03:47 +00:00
Mihail Ionescu
201c2f785f [GCC][PATCH][ARM] Set profile to M for Armv8.1-M
2020-02-06  Mihail Ionescu  <mihail.ionescu@arm.com>

	* config/arm/arm-cpus.in: Set profile M for armv8.1-m.main.
2020-02-06 17:34:16 +00:00
Mihail Ionescu
52b25ffca1 [GCC][PATCH][ARM] Regenerate arm-tables.opt for Armv8.1-M patch
2020-02-06  Mihail Ionescu  <mihail.ionescu@arm.com>

	* config/arm/arm-tables.opt: Regenerate.
2020-02-06 17:34:16 +00:00
Uros Bizjak
a59658eaef Remove parenthesis from return statements in i386.md. 2020-02-06 18:32:42 +01:00
Uros Bizjak
b7c840121d Add missing ChangeLog entry. 2020-02-06 18:31:23 +01:00
Richard Sandiford
bba0c624c8 aarch64: Add an and/ior-based movk pattern [PR87763]
This patch adds a second movk pattern that models the instruction
as a "normal" and/ior operation rather than an insertion.  It fixes
the third insv_1.c failure in PR87763, which was a regression from
GCC 8.

2020-02-06  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	PR target/87763
	* config/aarch64/aarch64-protos.h (aarch64_movk_shift): Declare.
	* config/aarch64/aarch64.c (aarch64_movk_shift): New function.
	* config/aarch64/aarch64.md (aarch64_movk<mode>): New pattern.

gcc/testsuite/
	PR target/87763
	* gcc.target/aarch64/movk_2.c: New test.
2020-02-06 17:27:00 +00:00
Richard Sandiford
b65a1eb3fa aarch64: Add an extra sbfiz pattern [PR87763]
This patch matches another form of sbfiz, in which the input
has DImode and the output has SImode.  It fixes a regression
in gcc.target/aarch64/lsl_asr_sbfiz.c from GCC 8.

2020-02-06  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	PR rtl-optimization/87763
	* config/aarch64/aarch64.md (*ashiftsi_extvdi_bfiz): New pattern.
2020-02-06 17:26:59 +00:00
Uros Bizjak
88ec0e8dbe Unify gcc.target/i386/memcpy scan strings.
After -fno-common became the default, we can unify various
scan strings between 64bit and 32bit targets.

	* gcc.target/i386/memcpy-strategy-1.c (dg-final):
	Unify scan-assembler strings for all targets.
	* gcc.target/i386/memcpy-strategy-2.c (dg-final): Ditto.
	* gcc.target/i386/memcpy-strategy-3.c (dg-final): Ditto.
	* gcc.target/i386/memcpy-vector_loop-1.c (dg-final): Ditto.
2020-02-06 18:21:50 +01:00
Marek Polacek
4a136a214e c++: Fix ICE with lambda in operator function [PR93597]
If we are going to use get_first_fn let's make sure we operate on
is_overloaded_fn, as the rest of the codebase does, and if lookup finds
any class-scope declaration, return early too.

	PR c++/93597 - ICE with lambda in operator function.
	* name-lookup.c (maybe_save_operator_binding): Check is_overloaded_fn.

	* g++.dg/cpp0x/lambda/lambda-93597.C: New test.
2020-02-06 11:58:23 -05:00
Delia Burduv
f78335df69 aarch64: ACLE intrinsics bfmmla and bfmlal<b/t>
This patch adds the ARMv8.6 ACLE intrinsics for bfmmla, bfmlalb and
bfmlalt as part of the BFloat16 extension.
(https://developer.arm.com/architectures/instruction-sets/simd-isas/neon/intrinsics)
The intrinsics are declared in arm_neon.h and the RTL patterns are
defined in aarch64-simd.md.  Two new tests are added to check assembler
output.

2020-02-06  Delia Burduv  <delia.burduv@arm.com>

gcc/
	* config/aarch64/aarch64-simd-builtins.def
	(bfmlaq): New built-in function.
	(bfmlalb): New built-in function.
	(bfmlalt): New built-in function.
	(bfmlalb_lane): New built-in function.
	(bfmlalt_lane): New built-in function.
	* config/aarch64/aarch64-simd.md
	(aarch64_bfmmlaqv4sf): New pattern.
	(aarch64_bfmlal<bt>v4sf): New pattern.
	(aarch64_bfmlal<bt>_lane<q>v4sf): New pattern.
	* config/aarch64/arm_neon.h (vbfmmlaq_f32): New intrinsic.
	(vbfmlalbq_f32): New intrinsic.
	(vbfmlaltq_f32): New intrinsic.
	(vbfmlalbq_lane_f32): New intrinsic.
	(vbfmlaltq_lane_f32): New intrinsic.
	(vbfmlalbq_laneq_f32): New intrinsic.
	(vbfmlaltq_laneq_f32): New intrinsic.
	* config/aarch64/iterators.md (BF_MLA): New int iterator.
	(bt): New int attribute.
2020-02-06 16:40:12 +00:00
Uros Bizjak
ad84548336 Emit "#" instead of calling gcc_unreachable for invalid insns.
Implement standard approach by emitting "#" for insns that have to be split.

	* config/i386/i386.md (*pushtf): Emit "#" instead of
	calling gcc_unreachable in insn output.
	(*pushxf): Ditto.
	(*pushdf): Ditto.
	(*pushsf_rex64): Ditto for alternatives other than 1.
	(*pushsf): Ditto for alternatives other than 1.
2020-02-06 17:34:21 +01:00
Bill Schmidt
c940105cc1 Fix PowerPC prototype documentation of __builtin_mtfsf (PR93570)
PR93570 reports that the documentation shows __builtin_mtfsf to
return a double, but this is incorrect.  The return signature
should be void.

2020-02-06  Bill Schmidt  <wschmidt@linux.ibm.com>

        PR target/93570
        * doc/extend.texi (Basic PowerPC Built-in Functions): Correct
	prototype for __builtin_mtfsf.
2020-02-06 08:04:09 -06:00
Martin Liska
554ced43eb
Revert mangling of names with -fprofile-generate=<dir>.
PR gcov-profile/91971
	PR gcov-profile/93466
	* coverage.c (coverage_init): Revert mangling of
	path into filename.  It can lead to huge filename length.
	Creation of subfolders seem more natural.
2020-02-06 14:53:28 +01:00
Jonathan Wakely
bd630df033 libstdc++: Fix comment to refer to correct PR
* include/bits/stl_iterator.h (__detail::__common_iter_ptr): Fix PR
	number in comment. Fix indentation.
2020-02-06 13:32:48 +00:00
Jonathan Wakely
26eae9ac2b libstdc++: decay in viewable_range should be remove_cvref (LWG 3375)
* include/bits/stl_algobase.h (__iter_swap, __iter_swap<true>): Remove
	redundant _GLIBCXX20_CONSTEXPR.
2020-02-06 13:32:48 +00:00
Tobias Burnus
101baaee42 [Testsuite] – More fixes for remote execution: check_gc_sections_available, …
* gcc.target/arm/multilib.exp (multilib_config): Pass flags to
	…_target_compile as (additional_flags=) option and not as source
	filename to make it work with remote execution.
	* lib/target-supports.exp (check_runtime, check_gc_sections_available,
	check_effective_target_gas, check_effective_target_gld): Likewise.
2020-02-06 13:27:45 +01:00
Jonathan Wakely
d1aa7705d5 libstdc++: Remove redundant macro that is always empty
The __iter_swap class template and explicit specialization are only
declared (and used) for C++03 so _GLIBCXX20_CONSTEXPR does nothing here.

	* include/bits/stl_algobase.h (__iter_swap, __iter_swap<true>): Remove
	redundant _GLIBCXX20_CONSTEXPR.
2020-02-06 10:48:17 +00:00
Stam Markianos-Wright
ff861d6595 [GCC][BUG][ARM] Fix ICE due to BFmode libfunc call (PR93300)
This was sent and approved on gcc-patches as "[GCC][BUG][Aarch64][ARM]
    (PR93300) Fix ICE due to BFmode placement in GET_MODES_WIDER chain".

    The observed error came about because BFmode was placed between HFmode
    and SFmode in the GET_MODES_WIDER chain, resulting in convert_mode_scalar
    attempting to gen a libfunc for a HFmode -> BFmode conversion.

    This patch registers NULL for all libfuncs in BFmode, which stops the
    middle-end from attempting to generate them.

    gcc/ChangeLog:
    2020-02-06  Stam Markianos-Wright  <stam.markianos-wright@arm.com>

	PR target/93300
	* config/arm/arm.c (arm_block_arith_comp_libfuncs_for_mode): New.
	(arm_init_libfuncs): Add BFmode support to block spurious BF libfuncs.
	Use arm_block_arith_comp_libfuncs_for_mode for HFmode.
2020-02-06 10:20:08 +00:00
Jakub Jelinek
3f740c67db i386: Improve avx* vector concatenation [PR93594]
The following testcase shows that for _mm256_set*_m128i and similar
intrinsics, we sometimes generate bad code.  All 4 routines are expressing
the same thing, a 128-bit vector zero padded to 256-bit vector, but only the
3rd one actually emits the desired vmovdqa      %xmm0, %xmm0 insn, the
others vpxor    %xmm1, %xmm1, %xmm1; vinserti128        $0x1, %xmm1, %ymm0, %ymm0
The problem is that the cast builtins use UNSPEC_CAST which is after reload
simplified using a splitter, but during combine it prevents optimizations.
We do have avx_vec_concat* patterns that generate efficient code, both for
this low part + zero concatenation special case and for other cases too, so
the following define_insn_and_split just recognizes avx_vec_concat made of a
low half of a cast and some other reg.

2020-02-06  Jakub Jelinek  <jakub@redhat.com>

	PR target/93594
	* config/i386/predicates.md (avx_identity_operand): New predicate.
	* config/i386/sse.md (*avx_vec_concat<mode>_1): New
	define_insn_and_split.

	* gcc.target/i386/avx2-pr93594.c: New test.
2020-02-06 11:08:59 +01:00
Jakub Jelinek
cb3f06480a openmp: Fix handling of non-addressable shared scalars in parallel nested inside of target [PR93515]
As the following testcase shows, we need to consider even target to be a construct
that forces not to use copy in/out for shared on parallel inside of the target.
E.g. for parallel nested inside another parallel or host teams, we already avoid
copy in/out and we need to treat target the same.

2020-02-06  Jakub Jelinek  <jakub@redhat.com>

	PR libgomp/93515
	* omp-low.c (use_pointer_for_field): For nested constructs, also
	look for map clauses on target construct.
	(scan_omp_1_stmt) <case GIMPLE_OMP_TARGET>: Bump temporarily
	taskreg_nesting_level.

	* testsuite/libgomp.c-c++-common/pr93515.c: New test.
2020-02-06 09:19:08 +01:00
Jakub Jelinek
cf785618ec openmp: Notice reduction decl in outer contexts after adding it to shared [PR93515]
If we call omp_add_variable, following omp_notice_variable will already find it
on that construct and not go through outer constructs, the following patch fixes that.
Note, this still doesn't follow OpenMP 5.0 semantics on target combined with other
constructs with reduction/lastprivate/linear clauses, will handle that for GCC11.

2020-02-06  Jakub Jelinek  <jakub@redhat.com>

	PR libgomp/93515
	* gimplify.c (gimplify_scan_omp_clauses) <do_notice>: If adding
	shared clause, call omp_notice_variable on outer context if any.
2020-02-06 09:15:13 +01:00
Alexandre Oliva
006eeaa819 Initialize barrier_cache for ARM EH ABI compliance
The ARM Exception Handling ABI requires personality functions in
phase1 to initialize barrier_cache before returning
_URC_HANDLER_FOUND, and we don't.

Although our own ARM personality function does not use barrier_cache
at all, other languages' ARM personality functions, during phase2, are
allowed and expected to test barrier_cache.sp to check whether the
handler frame was reached, which implies that personality functions is
in charge of the frame, and the remaining fields of barrier_cache hold
whatever values it put there in phase1.

Since we did not set barrier_cache.sp, an earlier exception, already
handled by a non-Ada handler and then released, may have its storage
reused for a new exception, that phase1 matches to an Ada frame, but
if that leaves barrier_cache.sp alone, the phase2 personality function
that handled the earlier exception, upon reaching the frame that
handled the earlier exception, may believe the information in
barrier_cache applies to the current exception.  The C++ personality
function, for example, would take the information in the barrier_cache
and end up activating the handler that handled the earlier exception:

  try {
    throw 1;
  } catch (int i) {
    std::cout << "caught " << i << " by c++" << std::endl;
  }
  raise_ada_exception (); // might loop back to the handler above

for  gcc/ada/ChangeLog

	* raise-gcc.c (personality_body) [__ARM_EABI_UNWINDER__]:
	Initialize barrier_cache.sp when ending phase1.
2020-02-06 03:59:45 -03:00
Jason Merrill
3774c0b934 cgraph: A COMDAT decl always has non-zero address.
We should be able to assume that a template instantiation or other COMDAT
has non-zero address even if MAKE_DECL_ONE_ONLY for the target sets
DECL_WEAK and we haven't yet decided to emit a definition in this
translation unit.

	PR c++/92003
	* symtab.c (symtab_node::nonzero_address): A DECL_COMDAT decl has
	non-zero address even if weak and not yet defined.
2020-02-05 21:22:40 -05:00
GCC Administrator
b8e165be65 Daily bump. 2020-02-06 00:16:31 +00:00
Martin Sebor
297aa66829 Remove trailing comma to avoid pedantic warning in C++ 98 mode: comma at end of enumerator list 2020-02-05 17:14:42 -07:00
Martin Sebor
e7868dc6a7 PR tree-optimization/92765 - wrong code for strcmp of a union member
gcc/ChangeLog:

	PR tree-optimization/92765
	* gimple-fold.c (get_range_strlen_tree): Handle MEM_REF and PARM_DECL.
	* tree-ssa-strlen.c (compute_string_length): Remove.
	(determine_min_objsize): Remove.
	(get_len_or_size): Add an argument.  Call get_range_strlen_dynamic.
	Avoid using type size as the upper bound on string length.
	(handle_builtin_string_cmp): Add an argument.  Adjust.
	(strlen_check_and_optimize_call): Pass additional argument to
	handle_builtin_string_cmp.

gcc/testsuite/ChangeLog:

	PR tree-optimization/92765
	* g++.dg/tree-ssa/strlenopt-1.C: New test.
	* g++.dg/tree-ssa/strlenopt-2.C: New test.
	* gcc.dg/Warray-bounds-58.c: New test.
	* gcc.dg/Wrestrict-20.c: Avoid a valid -Wformat-overflow.
	* gcc.dg/Wstring-compare.c: Xfail a test.
	* gcc.dg/strcmpopt_2.c: Disable tests.
	* gcc.dg/strcmpopt_4.c: Adjust tests.
	* gcc.dg/strcmpopt_10.c: New test.
	* gcc.dg/strcmpopt_11.c: New test.
	* gcc.dg/strlenopt-69.c: Disable tests.
	* gcc.dg/strlenopt-92.c: New test.
	* gcc.dg/strlenopt-93.c: New test.
	* gcc.dg/strlenopt.h: Declare calloc.
	* gcc.dg/tree-ssa/pr92056.c: Xfail tests until pr93518 is resolved.
	* gcc.dg/tree-ssa/builtin-sprintf-warn-23.c: Correct test (pr93517).
2020-02-05 16:55:26 -07:00
Jason Merrill
f6bef09771 c++: Fix decltype of empty pack expansion of parm.
In unevaluated context, we only substitute a single PARM_DECL, not the
entire chain, but the handling of an empty pack expansion was missing that
check.

	PR c++/93140
	* pt.c (tsubst_decl) [PARM_DECL]: Check cp_unevaluated_operand in
	handling of TREE_CHAIN for empty pack.
2020-02-05 18:38:23 -05:00
Uros Bizjak
ba67231631 Simplify post epilogue_completed splitters.
Now that we have post epilogue_completed split point for all
optimization levels, we can simplify post epilogue_completed splitters
considerably. If corresponding define_peephole2 pattern fails to
allocate a temporary register (or if peephole2 pass isn't run at all),
we can now always split invalid RTX after epilogue_completed is set.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

	* config/i386/i386.md (*pushdi2_rex64 peephole2): Remove.
	(*pushdi2_rex64 peephole2): Unconditionally split after
	epilogue_completed.
	(*ashl<mode>3_doubleword): Ditto.
	(*<shift_insn><mode>3_doubleword): Ditto.
2020-02-06 00:13:00 +01:00
Marek Polacek
f214002ba1 Move CL to the correct file. 2020-02-05 17:45:16 -05:00