Commit Graph

214554 Commits

Ian Lance Taylor
f8687bceaa libbacktrace: don't get confused by overlapping address ranges
Fixes https://github.com/ianlancetaylor/libbacktrace/issues/137.

	* dwarf.c (resolve_unit_addrs_overlap_walk): New static function.
	(resolve_unit_addrs_overlap): New static function.
	(build_dwarf_data): Call resolve_unit_addrs_overlap.
2024-10-18 13:04:11 -07:00
John David Anglin
aaa855fac0 hppa: Fix up pa.opt.urls
2024-10-18  John David Anglin  <danglin@gcc.gnu.org>

gcc/ChangeLog:

	* config/pa/pa.opt.urls: Fix for -mlra.
2024-10-18 12:43:15 -04:00
Thomas Koenig
1f07dea91c Handle GFC_STD_UNSIGNED like a standard in error messages.
gcc/fortran/ChangeLog:

	* error.cc (notify_std_msg): Handle GFC_STD_UNSIGNED.

gcc/testsuite/ChangeLog:

	* gfortran.dg/unsigned_37.f90: New test.
2024-10-18 17:58:56 +02:00
John David Anglin
44a81aaf73 hppa: Add LRA support
LRA is not enabled by default since there are some new test failures
remaining to resolve.

2024-10-18  John David Anglin  <danglin@gcc.gnu.org>

gcc/ChangeLog:

	PR target/113933
	* config/pa/pa.cc (pa_use_lra_p): Declare.
	(TARGET_LRA_P): Change define to pa_use_lra_p.
	(pa_use_lra_p): New function.
	(legitimize_pic_address): Also check lra_in_progress.
	(pa_emit_move_sequence): Likewise.
	(pa_legitimate_constant_p): Likewise.
	(pa_legitimate_address_p): Likewise.
	(pa_secondary_reload): For floating-point loads and stores,
	return NO_REGS for REG and SUBREG operands.  Return
	GENERAL_REGS for some shift register spills.
	* config/pa/pa.opt: Add mlra option.
	* config/pa/predicates.md (integer_store_memory_operand):
	Also check lra_in_progress.
	(floating_point_store_memory_operand): Likewise.
	(reg_before_reload_operand): Likewise.
2024-10-18 11:28:23 -04:00
Craig Blackmore
b039d06c9a [PATCH 3/7] RISC-V: Fix vector memcpy smaller LMUL generation
If riscv_vector::expand_block_move is generating a straight-line memcpy
using a predicated store, it tries to use a smaller LMUL to reduce
register pressure, provided the smaller LMUL still covers the entire
transfer.

This happens in the inner loop of riscv_vector::expand_block_move;
however, the vmode chosen by this loop gets overwritten later in the
function, so I have added the missing break from the outer loop.

I have also addressed a couple of issues with the conditions of the if
statement within the inner loop.

The first condition did not make sense to me:
```
  TARGET_MIN_VLEN * lmul <= nunits * BITS_PER_UNIT
```
I think this was supposed to be checking that the length fits within the
given LMUL, so I have changed it to do that.

The second condition:
```
  /* Avoid loosing the option of using vsetivli .  */
  && (nunits <= 31 * lmul || nunits > 31 * 8)
```
seems to imply that lmul affects the range of the AVL immediate that
vsetivli can take, but I don't think that is correct.  In any case, I
don't think this condition is necessary, because if we find a suitable
mode we should stick with it, regardless of whether it allowed vsetivli,
rather than continuing to try a larger lmul (which would increase
register pressure) or a smaller potential_ew (which would increase
AVL).  I have removed this condition.
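
For illustration, here is a minimal sketch (hypothetical names, not the
actual GCC code) of what the corrected check is meant to express, namely
that the whole transfer fits in one register group at the given LMUL:

```
// Hypothetical sketch of "the length fits within the given LMUL":
// one register group holds (min_vlen_bits / 8) * lmul bytes at the
// minimum vector length, so a straight-line copy with this LMUL is
// possible when the whole transfer fits in that group.
static bool
fits_in_one_group (unsigned length_bytes, unsigned lmul,
		   unsigned min_vlen_bits)
{
  unsigned group_bytes = (min_vlen_bits / 8) * lmul;
  return length_bytes <= group_bytes;
}
```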

gcc/ChangeLog:

	* config/riscv/riscv-string.cc (expand_block_move): Fix
	condition for using smaller LMUL.  Break outer loop if a
	suitable vmode has been found.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/vsetvl/pr112929-1.c: Expect smaller lmul.
	* gcc.target/riscv/rvv/vsetvl/pr112988-1.c: Likewise.
	* gcc.target/riscv/rvv/base/cpymem-3.c: New test.
2024-10-18 09:17:21 -06:00
Craig Blackmore
212d8685e4 [PATCH 2/7] RISC-V: Fix uninitialized reg in memcpy
gcc/ChangeLog:

	* config/riscv/riscv-string.cc (expand_block_move): Replace
	`end` with `length_rtx` in gen_rtx_NE.
2024-10-18 09:06:58 -06:00
Craig Blackmore
f244492ec2 [PATCH 1/7] RISC-V: Fix indentation in riscv_vector::expand_block_move [NFC]
gcc/ChangeLog:

	* config/riscv/riscv-string.cc (expand_block_move): Fix
	indentation.
2024-10-18 09:01:35 -06:00
Uros Bizjak
3a12ac4032 i386: Fix the order of operands in andn<MMXMODEI:mode>3 [PR117192]
Fix the order of operands in andn<MMXMODEI:mode>3 expander to comply
with the specification, where bitwise-complement applies to operand 2.
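
For reference, the intended semantics, with the complement applied to the
second input operand, look like this in scalar code (an illustration, not
the expander itself):

```
// andn<mode>3 semantics after the fix: the bitwise complement
// applies to operand 2, not operand 1.
unsigned
andn (unsigned op1, unsigned op2)
{
  return op1 & ~op2;
}
```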

	PR target/117192

gcc/ChangeLog:

	* config/i386/mmx.md (andn<MMXMODEI:mode>3): Swap operand
	indexes 1 and 2 to comply with andn specification.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/pr117192.c: New test.
2024-10-18 16:04:57 +02:00
Jonathan Wakely
d0a9ae1321
libstdc++: Reuse std::__assign_one in <bits/ranges_algobase.h>
Use std::__assign_one instead of ranges::__assign_one. Adjust the uses,
because std::__assign_one has the arguments in the opposite order (the
same order as an assignment expression).

libstdc++-v3/ChangeLog:

	* include/bits/ranges_algobase.h (ranges::__assign_one): Remove.
	(__copy_or_move, __copy_or_move_backward): Use std::__assign_one
	instead of ranges::__assign_one.

Reviewed-by: Patrick Palka <ppalka@redhat.com>
2024-10-18 14:49:35 +01:00
Jonathan Wakely
6ecf2b380d
libstdc++: Add always_inline to some one-liners in <bits/stl_algobase.h>
We implement std::copy, std::fill etc. as a series of calls to other
overloads which incrementally peel off layers of iterator wrappers. This
adds a high abstraction penalty for -O0 and potentially even -O1. Add
the always_inline attribute to several functions that are just a single
return statement (and maybe a static_assert, or some concept-checking
assertions which are disabled by default).
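
As an illustration (not the actual libstdc++ source), this is the kind of
one-line forwarding function the attribute targets; always_inline removes
the call overhead even when the optimizer is off:

```
#include <algorithm>

// Illustrative forwarding wrapper: a single return statement that
// only peels off one layer of abstraction.
template<typename It, typename Out>
[[gnu::always_inline]]
inline Out
copy_wrapper (It first, It last, Out result)
{ return std::copy (first, last, result); }
```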

libstdc++-v3/ChangeLog:

	* include/bits/stl_algobase.h (__copy_move_a1, __copy_move_a)
	(__copy_move_backward_a1, __copy_move_backward_a, move_backward)
	(__fill_a1, __fill_a, fill, __fill_n_a, fill_n, __equal_aux):
	Add always_inline attribute to one-line forwarding functions.

Reviewed-by: Patrick Palka <ppalka@redhat.com>
2024-10-18 14:49:35 +01:00
Jonathan Wakely
5546be4c24
libstdc++: Add nodiscard to std::find
I missed this one out in r14-9478-gdf483ebd24689a but I don't think that
was intentional. I see no reason std::find shouldn't be [[nodiscard]].
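
With a libstdc++ that has this change, discarding the result is diagnosed:

```
#include <algorithm>
#include <vector>

int
main ()
{
  std::vector<int> v{1, 2, 3};
  // warning: ignoring return value (declared with attribute nodiscard)
  std::find (v.begin (), v.end (), 2);
}
```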

libstdc++-v3/ChangeLog:

	* include/bits/stl_algo.h (find): Add nodiscard.

Reviewed-by: Patrick Palka <ppalka@redhat.com>
2024-10-18 14:49:34 +01:00
Jonathan Wakely
7ed561f63e
libstdc++: Inline memmove optimizations for std::copy etc. [PR115444]
This removes all the __copy_move class template specializations that
decide how to optimize std::copy and std::copy_n. We can inline those
optimizations into the algorithms, using if-constexpr (and macros for
C++98 compatibility) and remove the code dispatching to the various
class template specializations.

Doing this means we implement the optimization directly for std::copy_n
instead of deferring to std::copy.  That avoids the unwanted consequence
of advancing the iterator in copy_n only to take the difference later to
get back to the length that we already had in copy_n originally (as
described in PR 115444).

With the new flattened implementations, we can also lower contiguous
iterators to pointers in std::copy/std::copy_n/std::copy_backward, so
that they benefit from the same memmove optimizations as pointers.
There's a subtlety though: contiguous iterators can potentially throw
exceptions to exit the algorithm early.  So we can only transform the
loop to memmove if dereferencing the iterator is noexcept. We don't
check that incrementing the iterator is noexcept because we advance the
contiguous iterators before using memmove, so that if incrementing would
throw, that happens first. I am writing a proposal (P3349R0) which would
make this unnecessary, so I hope we can drop the nothrow requirements
later.

This change also solves PR 114817 by checking is_trivially_assignable
before optimizing copy/copy_n etc. to memmove. It's not enough to check
that the types are trivially copyable (a precondition for using memmove
at all), we also need to check that the specific assignment that would
be performed by the algorithm is also trivial. Replacing a non-trivial
assignment with memmove would be observable, so not allowed.
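
A self-contained illustration of that distinction, using hypothetical
types A and B:

```
#include <type_traits>

struct A { int i; };
struct B
{
  int i;
  // User-provided heterogeneous assignment: not trivial, observable.
  B& operator= (const A& a) { i = a.i + 1; return *this; }
};

// Both types are trivially copyable (the precondition for memmove) ...
static_assert (std::is_trivially_copyable_v<A>);
static_assert (std::is_trivially_copyable_v<B>);
// ... but the assignment the algorithm would perform is not trivial,
// so replacing the loop with memmove would skip operator= above.
static_assert (!std::is_trivially_assignable_v<B&, const A&>);
```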

libstdc++-v3/ChangeLog:

	PR libstdc++/115444
	PR libstdc++/114817
	* include/bits/stl_algo.h (__copy_n): Remove generic overload
	and overload for random access iterators.
	(copy_n): Inline generic version of __copy_n here. Do not defer
	to std::copy for random access iterators.
	* include/bits/stl_algobase.h (__copy_move): Remove.
	(__nothrow_contiguous_iterator, __memcpyable_iterators): New
	concepts.
	(__assign_one, _GLIBCXX_TO_ADDR, _GLIBCXX_ADVANCE): New helpers.
	(__copy_move_a2): Inline __copy_move logic and conditional
	memmove optimization into the most generic overload.
	(__copy_n_a): Likewise.
	(__copy_move_backward): Remove.
	(__copy_move_backward_a2): Inline __copy_move_backward logic and
	memmove optimization into the most generic overload.
	* testsuite/20_util/specialized_algorithms/uninitialized_copy/114817.cc:
	New test.
	* testsuite/20_util/specialized_algorithms/uninitialized_copy_n/114817.cc:
	New test.
	* testsuite/25_algorithms/copy/114817.cc: New test.
	* testsuite/25_algorithms/copy/115444.cc: New test.
	* testsuite/25_algorithms/copy_n/114817.cc: New test.

Reviewed-by: Patrick Palka <ppalka@redhat.com>
2024-10-18 14:49:34 +01:00
Jonathan Wakely
4020ee7718
libstdc++: Make __normal_iterator constexpr, always_inline, nodiscard
The __gnu_cxx::__normal_iterator type we use for std::vector::iterator
is not specified by the standard; it's an implementation detail. This
means it's not constrained by the rule that forbids strengthening
constexpr. We can make it meet the constexpr iterator requirements for
older standards, not only for C++20 where it's required to be.

The non-const member functions can't be constexpr in C++11, so use
_GLIBCXX14_CONSTEXPR for those. For all constructors, const members
and non-member operator overloads, use _GLIBCXX_CONSTEXPR or just
constexpr.

We can also liberally add [[nodiscard]] and [[gnu::always_inline]]
attributes to those functions.

Also change some internal helpers for std::move_iterator which can be
unconditionally constexpr and marked nodiscard.

libstdc++-v3/ChangeLog:

	* include/bits/stl_iterator.h (__normal_iterator): Make all
	members and overloaded operators constexpr before C++20, and add
	always_inline attribute.
	(__to_address): Add nodiscard and always_inline attributes.
	(__make_move_if_noexcept_iterator): Add nodiscard
	and make unconditionally constexpr.
	(__niter_base(__normal_iterator), __niter_base(Iter)):
	Add nodiscard and always_inline attributes.
	(__niter_base(reverse_iterator), __niter_base(move_iterator))
	(__miter_base): Add inline.
	(__niter_wrap(From, To)): Add nodiscard attribute.
	(__niter_wrap(const Iter&, Iter)): Add nodiscard and
	always_inline attributes.

Reviewed-by: Patrick Palka <ppalka@redhat.com>
2024-10-18 14:49:34 +01:00
Jonathan Wakely
3abe751ea8
libstdc++: Refactor std::uninitialized_{copy,fill,fill_n} algos [PR68350]
This refactors the std::uninitialized_copy, std::uninitialized_fill and
std::uninitialized_fill_n algorithms to directly perform memcpy/memset
optimizations instead of dispatching to std::copy/std::fill/std::fill_n.

The reasons for this are:

- Use 'if constexpr' to simplify and optimize compilation throughput, so
  dispatching to specialized class templates is only needed for C++98
  mode.
- Use memcpy instead of memmove, because the conditions on
  non-overlapping ranges are stronger for std::uninitialized_copy than
  for std::copy. Using memcpy might be a minor optimization.
- No special case for creating a range of one element, which std::copy
  needs to deal with (see PR libstdc++/108846). The uninitialized algos
  create new objects, which reuses storage and is allowed to clobber
  tail padding.
- Relax the conditions for using memcpy/memset, because the C++20 rules
  on implicit-lifetime types mean that we can rely on memcpy to begin
  lifetimes of trivially copyable types.  We don't need to require
  trivially default constructible, so don't need to limit the
  optimization to trivial types. See PR 68350 for more details.
- Remove the dependency on std::copy and std::fill. This should mean
  that stl_uninitialized.h no longer needs to include all of
  stl_algobase.h.  This isn't quite true yet, because we still use
  std::fill in __uninitialized_default and still use std::fill_n in
  __uninitialized_default_n. That will be fixed later.

Several tests need changes to the diagnostics matched by dg-error
because we no longer use the __constructible() function that had a
static assert in. Now we just get straightforward errors for attempting
to use a deleted constructor.

Two tests needed more significant changes to the actual expected results
of executing the tests, because they were checking for old behaviour
which was incorrect according to the standard.
20_util/specialized_algorithms/uninitialized_copy/64476.cc was expecting
std::copy to be used for a call to std::uninitialized_copy involving two
trivially copyable types. That was incorrect behaviour, because a
non-trivial constructor should have been used, but using std::copy used
trivial default initialization followed by assignment.
20_util/specialized_algorithms/uninitialized_fill_n/sizes.cc was testing
the behaviour with a non-integral Size passed to uninitialized_fill_n,
but I wrote the test looking at the requirements of uninitialized_copy_n
which are not the same as uninitialized_fill_n. The former uses --n and
tests n > 0, but the latter just tests n-- (which will never be false
for a floating-point value with a fractional part).
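
A short illustration of the difference for a fractional count:

```
#include <cstdio>

int
main ()
{
  // uninitialized_copy_n style: test n > 0 before decrementing.
  double n = 2.5;
  int iterations = 0;
  while (n > 0)
    {
      ++iterations;
      --n;
    }
  std::printf ("%d\n", iterations);  // prints 3; the loop terminates

  // A fill_n-style "while (n--)" loop would never terminate here:
  // n steps through 2.5, 1.5, 0.5, -0.5, ... and never equals zero.
}
```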

libstdc++-v3/ChangeLog:

	PR libstdc++/68350
	PR libstdc++/93059
	* include/bits/stl_uninitialized.h (__check_constructible)
	(_GLIBCXX_USE_ASSIGN_FOR_INIT): Remove.
	[C++98] (__unwrappable_niter): New trait.
	(__uninitialized_copy<true>): Replace use of std::copy.
	(uninitialized_copy): Fix Doxygen comments. Open-code memcpy
	optimization for C++11 and later.
	(__uninitialized_fill<true>): Replace use of std::fill.
	(uninitialized_fill): Fix Doxygen comments. Open-code memset
	optimization for C++11 and later.
	(__uninitialized_fill_n<true>): Replace use of std::fill_n.
	(uninitialized_fill_n): Fix Doxygen comments. Open-code memset
	optimization for C++11 and later.
	* testsuite/20_util/specialized_algorithms/uninitialized_copy/64476.cc:
	Adjust expected behaviour to match what the standard specifies.
	* testsuite/20_util/specialized_algorithms/uninitialized_fill_n/sizes.cc:
	Likewise.
	* testsuite/20_util/specialized_algorithms/uninitialized_copy/1.cc:
	Adjust dg-error directives.
	* testsuite/20_util/specialized_algorithms/uninitialized_copy/89164.cc:
	Likewise.
	* testsuite/20_util/specialized_algorithms/uninitialized_copy_n/89164.cc:
	Likewise.
	* testsuite/20_util/specialized_algorithms/uninitialized_fill/89164.cc:
	Likewise.
	* testsuite/20_util/specialized_algorithms/uninitialized_fill_n/89164.cc:
	Likewise.
	* testsuite/23_containers/vector/cons/89164.cc: Likewise.
	* testsuite/23_containers/vector/cons/89164_c++17.cc: Likewise.

Reviewed-by: Patrick Palka <ppalka@redhat.com>
2024-10-18 14:49:34 +01:00
Jonathan Wakely
2608fcfe5f
libstdc++: Move std::__niter_base and std::__niter_wrap to stl_iterator.h
Move the functions for unwrapping and rewrapping __normal_iterator
objects to the same file as the definition of __normal_iterator itself.

This will allow a later commit to make use of std::__niter_base in other
headers without having to include all of <bits/stl_algobase.h>.

libstdc++-v3/ChangeLog:

	* include/bits/stl_algobase.h (__niter_base, __niter_wrap): Move
	to ...
	* include/bits/stl_iterator.h: ... here.
	(__niter_base, __miter_base): Move all overloads to the end of
	the header.
	* testsuite/24_iterators/normal_iterator/wrapping.cc: New test.

Reviewed-by: Patrick Palka <ppalka@redhat.com>
2024-10-18 14:49:34 +01:00
Jennifer Schmitz
e69c2e2120 SVE intrinsics: Add fold_active_lanes_to method to refactor svmul and svdiv.
As suggested in
https://gcc.gnu.org/pipermail/gcc-patches/2024-September/663275.html,
this patch adds the method gimple_folder::fold_active_lanes_to (tree X).
This method folds active lanes to X and sets inactive lanes according to
the predication, returning a new gimple statement. That makes folding of
SVE intrinsics easier and reduces code duplication in the
svxxx_impl::fold implementations.
Using this new method, svdiv_impl::fold and svmul_impl::fold were refactored.
Additionally, the method was used for two optimizations:
1) fold svdiv to the dividend, if the divisor is all ones, and
2) for svmul, if one of the operands is all ones, fold to the other operand.
Both optimizations were previously applied to _x and _m predication on
the RTL level, but not for _z, where svdiv/svmul were still being used.
For both optimizations, codegen was improved by this patch, for example
by skipping sel instructions with all-same operands and replacing sel
instructions by mov instructions.

The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
OK for mainline?
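
For illustration, this is the kind of source the new folds improve,
sketched with ACLE intrinsics (the function names here are illustrative;
the resulting codegen is as described above):

```
#include <arm_sve.h>

// All-ones divisor/operand with _z predication: previously a div/mul
// was still emitted; now the result folds to x in the active lanes
// (zero in the inactive ones).
svint32_t
div_by_one (svbool_t pg, svint32_t x)
{
  return svdiv_s32_z (pg, x, svdup_n_s32 (1));
}

svint32_t
mul_by_one (svbool_t pg, svint32_t x)
{
  return svmul_s32_z (pg, x, svdup_n_s32 (1));
}
```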

Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com>

gcc/
	* config/aarch64/aarch64-sve-builtins-base.cc (svdiv_impl::fold):
	Refactor using fold_active_lanes_to and fold to the dividend, if
	the divisor is all ones.
	(svmul_impl::fold): Refactor using fold_active_lanes_to and fold
	to the other operand, if one of the operands is all ones.
	* config/aarch64/aarch64-sve-builtins.h: Declare
	gimple_folder::fold_active_lanes_to (tree).
	* config/aarch64/aarch64-sve-builtins.cc
	(gimple_folder::fold_active_lanes_to): Add new method to fold
	active lanes to the given argument, setting inactive lanes
	according to the predication.

gcc/testsuite/
	* gcc.target/aarch64/sve/acle/asm/div_s32.c: Adjust expected outcome.
	* gcc.target/aarch64/sve/acle/asm/div_s64.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/div_u32.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/div_u64.c: Likewise.
	* gcc.target/aarch64/sve/fold_div_zero.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/mul_s16.c: New test.
	* gcc.target/aarch64/sve/acle/asm/mul_s32.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/mul_s64.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/mul_s8.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/mul_u16.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/mul_u32.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/mul_u64.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/mul_u8.c: Likewise.
	* gcc.target/aarch64/sve/mul_const_run.c: Likewise.
2024-10-18 15:12:47 +02:00
Richard Biener
94b95f7a3f [5/n] remove trapv-*.c special-casing of gcc.dg/vect/ files
The following makes -ftrapv explicit.
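
Concretely, each affected test now carries the option itself rather than
relying on vect.exp matching the trapv-* file name:

```
/* { dg-additional-options "-ftrapv" } */
```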

	* gcc.dg/vect/vect.exp: Remove special-casing of tests
	named trapv-*
	* gcc.dg/vect/trapv-vect-reduc-4.c: Add dg-additional-options -ftrapv.
2024-10-18 14:44:54 +02:00
Richard Biener
902f4ee7f1 [4/n] remove wrapv-*.c special-casing of gcc.dg/vect/ files
The following makes -fwrapv explicit.

	* gcc.dg/vect/vect.exp: Remove special-casing of tests
	named wrapv-*
	* gcc.dg/vect/wrapv-vect-7.c: Add dg-additional-options -fwrapv.
	* gcc.dg/vect/wrapv-vect-reduc-2char.c: Likewise.
	* gcc.dg/vect/wrapv-vect-reduc-2short.c: Likewise.
	* gcc.dg/vect/wrapv-vect-reduc-dot-s8b.c: Likewise.
	* gcc.dg/vect/wrapv-vect-reduc-pattern-2c.c: Likewise.
2024-10-18 14:44:54 +02:00
Richard Biener
a1381b69b9 [3/n] remove fast-math-*.c special-casing of gcc.dg/vect/ files
The following makes -ffast-math explicit.

	* gcc.dg/vect/vect.exp: Remove special-casing of tests
	named fast-math-*
	* gcc.dg/vect/fast-math-bb-slp-call-1.c: Add dg-additional-options
	-ffast-math.
	* gcc.dg/vect/fast-math-bb-slp-call-2.c: Likewise.
	* gcc.dg/vect/fast-math-bb-slp-call-3.c: Likewise.
	* gcc.dg/vect/fast-math-ifcvt-1.c: Likewise.
	* gcc.dg/vect/fast-math-pr35982.c: Likewise.
	* gcc.dg/vect/fast-math-pr43074.c: Likewise.
	* gcc.dg/vect/fast-math-pr44152.c: Likewise.
	* gcc.dg/vect/fast-math-pr55281.c: Likewise.
	* gcc.dg/vect/fast-math-slp-27.c: Likewise.
	* gcc.dg/vect/fast-math-slp-38.c: Likewise.
	* gcc.dg/vect/fast-math-vect-call-1.c: Likewise.
	* gcc.dg/vect/fast-math-vect-call-2.c: Likewise.
	* gcc.dg/vect/fast-math-vect-complex-3.c: Likewise.
	* gcc.dg/vect/fast-math-vect-outer-7.c: Likewise.
	* gcc.dg/vect/fast-math-vect-pow-1.c: Likewise.
	* gcc.dg/vect/fast-math-vect-pow-2.c: Likewise.
	* gcc.dg/vect/fast-math-vect-pr25911.c: Likewise.
	* gcc.dg/vect/fast-math-vect-pr29925.c: Likewise.
	* gcc.dg/vect/fast-math-vect-reduc-5.c: Likewise.
	* gcc.dg/vect/fast-math-vect-reduc-7.c: Likewise.
	* gcc.dg/vect/fast-math-vect-reduc-8.c: Likewise.
	* gcc.dg/vect/fast-math-vect-reduc-9.c: Likewise.
	* gcc.dg/vect/complex/fast-math-bb-slp-complex-add-double.c: Likewise.
	* gcc.dg/vect/complex/fast-math-bb-slp-complex-add-float.c: Likewise.
	* gcc.dg/vect/complex/fast-math-bb-slp-complex-add-half-float.c:
	Likewise.
	* gcc.dg/vect/complex/fast-math-bb-slp-complex-add-pattern-double.c:
	Likewise.
	* gcc.dg/vect/complex/fast-math-bb-slp-complex-add-pattern-float.c:
	Likewise.
	* gcc.dg/vect/complex/fast-math-bb-slp-complex-add-pattern-half-float.c:
	Likewise.
	* gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-double.c: Likewise.
	* gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-float.c: Likewise.
	* gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-half-float.c:
	Likewise.
	* gcc.dg/vect/complex/fast-math-bb-slp-complex-mls-double.c: Likewise.
	* gcc.dg/vect/complex/fast-math-bb-slp-complex-mls-float.c: Likewise.
	* gcc.dg/vect/complex/fast-math-bb-slp-complex-mls-half-float.c:
	Likewise.
	* gcc.dg/vect/complex/fast-math-bb-slp-complex-mul-double.c: Likewise.
	* gcc.dg/vect/complex/fast-math-bb-slp-complex-mul-float.c: Likewise.
	* gcc.dg/vect/complex/fast-math-bb-slp-complex-mul-half-float.c:
	Likewise.
	* gcc.dg/vect/complex/fast-math-complex-add-double.c: Likewise.
	* gcc.dg/vect/complex/fast-math-complex-add-float.c: Likewise.
	* gcc.dg/vect/complex/fast-math-complex-add-half-float.c: Likewise.
	* gcc.dg/vect/complex/fast-math-complex-add-pattern-double.c:
	Likewise.
	* gcc.dg/vect/complex/fast-math-complex-add-pattern-float.c: Likewise.
	* gcc.dg/vect/complex/fast-math-complex-add-pattern-half-float.c:
	Likewise.
	* gcc.dg/vect/complex/fast-math-complex-mla-double.c: Likewise.
	* gcc.dg/vect/complex/fast-math-complex-mla-float.c: Likewise.
	* gcc.dg/vect/complex/fast-math-complex-mla-half-float.c: Likewise.
	* gcc.dg/vect/complex/fast-math-complex-mls-double.c: Likewise.
	* gcc.dg/vect/complex/fast-math-complex-mls-float.c: Likewise.
	* gcc.dg/vect/complex/fast-math-complex-mls-half-float.c: Likewise.
	* gcc.dg/vect/complex/fast-math-complex-mul-double.c: Likewise.
	* gcc.dg/vect/complex/fast-math-complex-mul-float.c: Likewise.
	* gcc.dg/vect/complex/fast-math-complex-mul-half-float.c: Likewise.
2024-10-18 14:44:54 +02:00
Richard Biener
d3d41ec609 [2/n] remove no-vfa-*.c special-casing of gcc.dg/vect/ files
The following makes --param vect-max-version-for-alias-checks=0
explicit.

	* gcc.dg/vect/vect.exp: Remove special-casing of tests
	named no-vfa-*
	* gcc.dg/vect/no-vfa-pr29145.c: Add dg-additional-options
	--param vect-max-version-for-alias-checks=0.
	* gcc.dg/vect/no-vfa-vect-101.c: Likewise.
	* gcc.dg/vect/no-vfa-vect-102.c: Likewise.
	* gcc.dg/vect/no-vfa-vect-102a.c: Likewise.
	* gcc.dg/vect/no-vfa-vect-37.c: Likewise.
	* gcc.dg/vect/no-vfa-vect-43.c: Likewise.
	* gcc.dg/vect/no-vfa-vect-45.c: Likewise.
	* gcc.dg/vect/no-vfa-vect-49.c: Likewise.
	* gcc.dg/vect/no-vfa-vect-51.c: Likewise.
	* gcc.dg/vect/no-vfa-vect-53.c: Likewise.
	* gcc.dg/vect/no-vfa-vect-57.c: Likewise.
	* gcc.dg/vect/no-vfa-vect-61.c: Likewise.
	* gcc.dg/vect/no-vfa-vect-79.c: Likewise.
	* gcc.dg/vect/no-vfa-vect-depend-1.c: Likewise.
	* gcc.dg/vect/no-vfa-vect-depend-2.c: Likewise.
	* gcc.dg/vect/no-vfa-vect-depend-3.c: Likewise.
	* gcc.dg/vect/no-vfa-vect-dv-2.c: Likewise.
2024-10-18 14:44:54 +02:00
Richard Biener
ee70e5c729 Adjust assert in vect_build_slp_tree_2
The assert in SLP discovery when we handle masked operations is
confusingly wide: all gather variants should be caught by
the earlier STMT_VINFO_GATHER_SCATTER_P check.

	* tree-vect-slp.cc (vect_build_slp_tree_2): Only expect
	IFN_MASK_LOAD for masked loads that are not
	STMT_VINFO_GATHER_SCATTER_P.
2024-10-18 14:44:54 +02:00
Alex Coplan
261d803c40 MAINTAINERS: Add myself as pair fusion and aarch64 ldp/stp maintainer
ChangeLog:

	* MAINTAINERS (CPU Port Maintainers): Add myself as aarch64 ldp/stp
	maintainer.
	(Various Maintainers): Add myself as pair fusion maintainer.
2024-10-18 11:09:53 +01:00
Martin Jambor
1a458bdeb2
testsuite: Add necessary dejagnu directives to pr115815_0.c
I have received an email from the Linaro infrastructure that the test
gcc.dg/lto/pr115815_0.c which I added is failing on arm-eabi, and I
realized that not only is it missing dg-require-effective-target
global_constructor, it is actually missing any dejagnu directives at
all, which means it is unnecessarily running both at -O0 and -O2 and
there is an unnecessary run test too.  All fixed by this patch.

I have not actually verified that the failure goes away on arm-eabi
but have very high hopes it will.  I have verified that the test still
checks for the bug and also that it passes by running:

  make -k check-gcc RUNTESTFLAGS="lto.exp=*pr115815*"
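
The directive named above uses the usual dejagnu comment syntax, e.g.:

```
/* { dg-require-effective-target global_constructor } */
```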

gcc/testsuite/ChangeLog:

2024-10-14  Martin Jambor  <mjambor@suse.cz>

	* gcc.dg/lto/pr115815_0.c: Add dejagnu directives.
2024-10-18 12:08:45 +02:00
Tamar Christina
51291ad0f1 middle-end: Fix GSI for gcond root [PR117140]
When finding the gsi to use for the code of the root statements, we
should use the one of the original statement rather than the gcond,
which may be inside a pattern.

Without this the emitted instructions may be discarded later.

gcc/ChangeLog:

	PR tree-optimization/117140
	* tree-vect-slp.cc (vectorize_slp_instance_root_stmt): Use gsi from
	original statement.

gcc/testsuite/ChangeLog:

	PR tree-optimization/117140
	* gcc.dg/vect/vect-early-break_129-pr117140.c: New test.
2024-10-18 10:37:28 +01:00
Tamar Christina
55f898008e middle-end: Fix VEC_PERM_EXPR lowering since relaxation of vector sizes
In GCC 14 VEC_PERM_EXPR was relaxed to be able to permute to a 2x larger vector
than the size of the input vectors.  However various passes and transformations
were not updated to account for this.

I have patches in this area that I will be upstreaming with individual
patches that expose them.

This one is that the vector lowering pass tries to lower based on the
size of the input vectors rather than the size of the output.  As a
consequence it creates an invalid vector of half the size.

Luckily we ICE because the resulting nunits doesn't match the vector size.
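
For illustration, a source-level permute whose VEC_PERM_EXPR result is
twice the width of its inputs (using GCC's __builtin_shufflevector):

```
typedef int v4si __attribute__ ((vector_size (16)));
typedef int v8si __attribute__ ((vector_size (32)));

// 8-element result from two 4-element inputs; the lowering pass must
// size the result by the output vector, not the inputs.
v8si
widen_perm (v4si a, v4si b)
{
  return __builtin_shufflevector (a, b, 0, 4, 1, 5, 2, 6, 3, 7);
}
```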

gcc/ChangeLog:

	* tree-vect-generic.cc (lower_vec_perm): Use output vector size instead
	of input vector when determining output nunits.

gcc/testsuite/ChangeLog:

	* gcc.dg/vec-perm-lower.c: New test.
2024-10-18 10:36:19 +01:00
Tamar Christina
453d3d90c3 AArch64: use movi d0, #0 to clear SVE registers instead of mov z0.d, #0
This patch changes SVE to use the Adv. SIMD movi 0 to clear SVE registers
when not in SVE streaming mode, as the Neoverse Software Optimization
Guides indicate that SVE mov #0 is not a zero-cost move.

When in streaming mode we continue to use SVE's mov to clear the registers.

Tests have already been updated.

gcc/ChangeLog:

	* config/aarch64/aarch64.cc (aarch64_output_sve_mov_immediate): Use
	fmov for SVE zeros.
2024-10-18 09:44:15 +01:00
Tamar Christina
87dc6b1992 AArch64: support encoding integer immediates using floating point moves
This patch extends our immediate SIMD generation cases to support
generating integer immediates using a floating-point move if the integer
immediate maps to an exact FP value.
As an example:

uint32x4_t f1() {
    return vdupq_n_u32(0x3f800000);
}

currently generates:

f1:
        adrp    x0, .LC0
        ldr     q0, [x0, #:lo12:.LC0]
        ret

i.e. a load, but with this change:

f1:
        fmov    v0.4s, 1.0e+0
        ret

Such immediates are common in e.g. our math routines in glibc because
they are created as masks to extract or mark part of an FP value.
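
For instance, the constant in the example above is just the bit pattern
of 1.0f:

```
#include <cstdint>
#include <cstring>

// 0x3f800000 is the IEEE-754 single-precision encoding of 1.0f,
// a typical mask/extraction constant.
std::uint32_t
bits_of_one ()
{
  float f = 1.0f;
  std::uint32_t u;
  std::memcpy (&u, &f, sizeof u);
  return u;  // 0x3f800000
}
```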

gcc/ChangeLog:

	* config/aarch64/aarch64.cc (aarch64_sve_valid_immediate,
	aarch64_simd_valid_immediate): Refactor accepting modes and values.
	(aarch64_float_const_representable_p): Refactor and extract FP checks
	into ...
	(aarch64_real_float_const_representable_p): ...This and fix fail
	fallback from real_to_integer.
	(aarch64_advsimd_valid_immediate): Use it.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/const_create_using_fmov.c: New test.
2024-10-18 09:43:45 +01:00
Tamar Christina
fc35079277 AArch64: update testsuite to account for new zero moves
The patch series will adjust how zeros are created.  In principle the
exact lane size a zero gets created on doesn't matter, but it makes the
tests a bit fragile.

This preparation patch updates the testsuite to accept multiple variants
of ways to create vector zeros, covering both the current syntax and the
one being transitioned to in the series.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/ldp_stp_18.c: Update zero regexp.
	* gcc.target/aarch64/memset-corner-cases.c: Likewise.
	* gcc.target/aarch64/sme/acle-asm/revd_bf16.c: Likewise.
	* gcc.target/aarch64/sme/acle-asm/revd_f16.c: Likewise.
	* gcc.target/aarch64/sme/acle-asm/revd_f32.c: Likewise.
	* gcc.target/aarch64/sme/acle-asm/revd_f64.c: Likewise.
	* gcc.target/aarch64/sme/acle-asm/revd_s16.c: Likewise.
	* gcc.target/aarch64/sme/acle-asm/revd_s32.c: Likewise.
	* gcc.target/aarch64/sme/acle-asm/revd_s64.c: Likewise.
	* gcc.target/aarch64/sme/acle-asm/revd_s8.c: Likewise.
	* gcc.target/aarch64/sme/acle-asm/revd_u16.c: Likewise.
	* gcc.target/aarch64/sme/acle-asm/revd_u32.c: Likewise.
	* gcc.target/aarch64/sme/acle-asm/revd_u64.c: Likewise.
	* gcc.target/aarch64/sme/acle-asm/revd_u8.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/acge_f16.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/acge_f32.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/acge_f64.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/acgt_f16.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/acgt_f32.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/acgt_f64.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/acle_f16.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/acle_f32.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/acle_f64.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/aclt_f16.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/aclt_f32.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/aclt_f64.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/bic_s8.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/bic_u8.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/cmpuo_f16.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/cmpuo_f32.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/cmpuo_f64.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/dup_f16.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/dup_f32.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/dup_f64.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/dup_s16.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/dup_s32.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/dup_s64.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/dup_s8.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/dup_u16.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/dup_u32.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/dup_u64.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/dup_u8.c: Likewise.
	* gcc.target/aarch64/sve/const_fold_div_1.c: Likewise.
	* gcc.target/aarch64/sve/const_fold_mul_1.c: Likewise.
	* gcc.target/aarch64/sve/dup_imm_1.c: Likewise.
	* gcc.target/aarch64/sve/fdup_1.c: Likewise.
	* gcc.target/aarch64/sve/fold_div_zero.c: Likewise.
	* gcc.target/aarch64/sve/fold_mul_zero.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_2.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_3.c: Likewise.
	* gcc.target/aarch64/sve/pcs/args_4.c: Likewise.
	* gcc.target/aarch64/vect-fmovd-zero.c: Likewise.
2024-10-18 09:42:46 +01:00
Christophe Lyon
8e74cbc3a8 arm: [MVE intrinsics] use long_type_suffix / half_type_suffix helpers
In several places we are looking for a type twice or half as large as
the type suffix: this patch introduces helper functions to avoid code
duplication. long_type_suffix is similar to the SVE counterpart, but
adds an 'expected_tclass' parameter.  half_type_suffix is similar to
it, but does not exist in SVE.

2024-08-28  Christophe Lyon  <christophe.lyon@linaro.org>

	gcc/

	* config/arm/arm-mve-builtins-shapes.cc (long_type_suffix): New.
	(half_type_suffix): New.
	(struct binary_move_narrow_def): Use new helper.
	(struct binary_move_narrow_unsigned_def): Likewise.
	(struct binary_rshift_narrow_def): Likewise.
	(struct binary_rshift_narrow_unsigned_def): Likewise.
	(struct binary_widen_def): Likewise.
	(struct binary_widen_n_def): Likewise.
	(struct binary_widen_opt_n_def): Likewise.
	(struct unary_widen_def): Likewise.
2024-10-18 07:41:16 +00:00
Christophe Lyon
a5efcfcc93 arm: [MVE intrinsics] rework vsbcq vsbciq
Implement vsbcq vsbciq using the new MVE builtins framework.

We re-use most of the code introduced by the previous patches.

2024-08-28  Christophe Lyon  <christophe.lyon@linaro.org>

	gcc/

	* config/arm/arm-mve-builtins-base.cc (class vadc_vsbc_impl): Add
	support for vsbciq and vsbcq.
	(vadciq, vadcq): Add new parameter.
	(vsbciq): New.
	(vsbcq): New.
	* config/arm/arm-mve-builtins-base.def (vsbciq): New.
	(vsbcq): New.
	* config/arm/arm-mve-builtins-base.h (vsbciq): New.
	(vsbcq): New.
	* config/arm/arm_mve.h (vsbciq): Delete.
	(vsbciq_m): Delete.
	(vsbcq): Delete.
	(vsbcq_m): Delete.
	(vsbciq_s32): Delete.
	(vsbciq_u32): Delete.
	(vsbciq_m_s32): Delete.
	(vsbciq_m_u32): Delete.
	(vsbcq_s32): Delete.
	(vsbcq_u32): Delete.
	(vsbcq_m_s32): Delete.
	(vsbcq_m_u32): Delete.
	(__arm_vsbciq_s32): Delete.
	(__arm_vsbciq_u32): Delete.
	(__arm_vsbciq_m_s32): Delete.
	(__arm_vsbciq_m_u32): Delete.
	(__arm_vsbcq_s32): Delete.
	(__arm_vsbcq_u32): Delete.
	(__arm_vsbcq_m_s32): Delete.
	(__arm_vsbcq_m_u32): Delete.
	(__arm_vsbciq): Delete.
	(__arm_vsbciq_m): Delete.
	(__arm_vsbcq): Delete.
	(__arm_vsbcq_m): Delete.
2024-10-18 07:41:16 +00:00
Christophe Lyon
6e2b3125c2 arm: [MVE intrinsics] rework vadcq
Implement vadcq using the new MVE builtins framework.

We re-use most of the code introduced by the previous patch to support
vadciq: we just need to initialize carry from the input parameter.

2024-08-28  Christophe Lyon  <christophe.lyon@linaro.org>

	gcc/

	* config/arm/arm-mve-builtins-base.cc (vadcq_vsbc): Add support
	for vadcq.
	* config/arm/arm-mve-builtins-base.def (vadcq): New.
	* config/arm/arm-mve-builtins-base.h (vadcq): New.
	* config/arm/arm_mve.h (vadcq): Delete.
	(vadcq_m): Delete.
	(vadcq_s32): Delete.
	(vadcq_u32): Delete.
	(vadcq_m_s32): Delete.
	(vadcq_m_u32): Delete.
	(__arm_vadcq_s32): Delete.
	(__arm_vadcq_u32): Delete.
	(__arm_vadcq_m_s32): Delete.
	(__arm_vadcq_m_u32): Delete.
	(__arm_vadcq): Delete.
	(__arm_vadcq_m): Delete.
2024-10-18 07:41:16 +00:00
Christophe Lyon
cb21ceae31 arm: [MVE intrinsics] rework vadciq
Implement vadciq using the new MVE builtins framework.

2024-08-28  Christophe Lyon  <christophe.lyon@linaro.org>

	gcc/

	* config/arm/arm-mve-builtins-base.cc (class vadc_vsbc_impl): New.
	(vadciq): New.
	* config/arm/arm-mve-builtins-base.def (vadciq): New.
	* config/arm/arm-mve-builtins-base.h (vadciq): New.
	* config/arm/arm_mve.h (vadciq): Delete.
	(vadciq_m): Delete.
	(vadciq_s32): Delete.
	(vadciq_u32): Delete.
	(vadciq_m_s32): Delete.
	(vadciq_m_u32): Delete.
	(__arm_vadciq_s32): Delete.
	(__arm_vadciq_u32): Delete.
	(__arm_vadciq_m_s32): Delete.
	(__arm_vadciq_m_u32): Delete.
	(__arm_vadciq): Delete.
	(__arm_vadciq_m): Delete.
2024-10-18 07:41:15 +00:00
Christophe Lyon
8c21fc6610 arm: [MVE intrinsics] factorize vadc vadci vsbc vsbci
Factorize vadc/vsbc and vadci/vsbci so that they use the same
parameterized names.

2024-08-28  Christophe Lyon  <christophe.lyon@linaro.org>

	gcc/
	* config/arm/iterators.md (mve_insn): Add VADCIQ_M_S, VADCIQ_M_U,
	VADCIQ_U, VADCIQ_S, VADCQ_M_S, VADCQ_M_U, VADCQ_S, VADCQ_U,
	VSBCIQ_M_S, VSBCIQ_M_U, VSBCIQ_S, VSBCIQ_U, VSBCQ_M_S, VSBCQ_M_U,
	VSBCQ_S, VSBCQ_U.
	(VADCIQ, VSBCIQ): Merge into ...
	(VxCIQ): ... this.
	(VADCIQ_M, VSBCIQ_M): Merge into ...
	(VxCIQ_M): ... this.
	(VSBCQ, VADCQ): Merge into ...
	(VxCQ): ... this.
	(VSBCQ_M, VADCQ_M): Merge into ...
	(VxCQ_M): ... this.
	* config/arm/mve.md
	(mve_vadciq_<supf>v4si, mve_vsbciq_<supf>v4si): Merge into ...
	(@mve_<mve_insn>q_<supf>v4si): ... this.
	(mve_vadciq_m_<supf>v4si, mve_vsbciq_m_<supf>v4si): Merge into ...
	(@mve_<mve_insn>q_m_<supf>v4si): ... this.
	(mve_vadcq_<supf>v4si, mve_vsbcq_<supf>v4si): Merge into ...
	(@mve_<mve_insn>q_<supf>v4si): ... this.
	(mve_vadcq_m_<supf>v4si, mve_vsbcq_m_<supf>v4si): Merge into ...
	(@mve_<mve_insn>q_m_<supf>v4si): ... this.
2024-10-18 07:41:15 +00:00
Christophe Lyon
ba7b97e0bc arm: [MVE intrinsics] add vadc_vsbc shape
This patch adds the vadc_vsbc shape description.

2024-08-28  Christophe Lyon  <christophe.lyon@linaro.org>

	gcc/
	* config/arm/arm-mve-builtins-shapes.cc (vadc_vsbc): New.
	* config/arm/arm-mve-builtins-shapes.h (vadc_vsbc): New.
2024-10-18 07:41:15 +00:00
Christophe Lyon
8d73d2780f arm: [MVE intrinsics] remove vshlcq useless expanders
Since we rewrote the implementation of vshlcq intrinsics, we no longer
need these expanders.

2024-08-28  Christophe Lyon  <christophe.lyon@linaro.org>

	gcc/
	* config/arm/arm-builtins.cc
	(arm_ternop_unone_none_unone_imm_qualifiers)
	(arm_ternop_none_none_unone_imm_qualifiers): Delete.
	* config/arm/arm_mve_builtins.def (vshlcq_m_vec_s)
	(vshlcq_m_carry_s, vshlcq_m_vec_u, vshlcq_m_carry_u): Delete.
	* config/arm/mve.md (mve_vshlcq_vec_<supf><mode>): Delete.
	(mve_vshlcq_carry_<supf><mode>): Delete.
	(mve_vshlcq_m_vec_<supf><mode>): Delete.
	(mve_vshlcq_m_carry_<supf><mode>): Delete.
2024-10-18 07:41:15 +00:00
Christophe Lyon
4d2b6a7dd5 arm: [MVE intrinsics] rework vshlcq
Implement vshlc using the new MVE builtins framework.

2024-08-28  Christophe Lyon  <christophe.lyon@linaro.org>

	gcc/
	* config/arm/arm-mve-builtins-base.cc (class vshlc_impl): New.
	(vshlc): New.
	* config/arm/arm-mve-builtins-base.def (vshlcq): New.
	* config/arm/arm-mve-builtins-base.h (vshlcq): New.
	* config/arm/arm-mve-builtins.cc
	(function_instance::has_inactive_argument): Handle vshlc.
	* config/arm/arm_mve.h (vshlcq): Delete.
	(vshlcq_m): Delete.
	(vshlcq_s8): Delete.
	(vshlcq_u8): Delete.
	(vshlcq_s16): Delete.
	(vshlcq_u16): Delete.
	(vshlcq_s32): Delete.
	(vshlcq_u32): Delete.
	(vshlcq_m_s8): Delete.
	(vshlcq_m_u8): Delete.
	(vshlcq_m_s16): Delete.
	(vshlcq_m_u16): Delete.
	(vshlcq_m_s32): Delete.
	(vshlcq_m_u32): Delete.
	(__arm_vshlcq_s8): Delete.
	(__arm_vshlcq_u8): Delete.
	(__arm_vshlcq_s16): Delete.
	(__arm_vshlcq_u16): Delete.
	(__arm_vshlcq_s32): Delete.
	(__arm_vshlcq_u32): Delete.
	(__arm_vshlcq_m_s8): Delete.
	(__arm_vshlcq_m_u8): Delete.
	(__arm_vshlcq_m_s16): Delete.
	(__arm_vshlcq_m_u16): Delete.
	(__arm_vshlcq_m_s32): Delete.
	(__arm_vshlcq_m_u32): Delete.
	(__arm_vshlcq): Delete.
	(__arm_vshlcq_m): Delete.
	* config/arm/mve.md (mve_vshlcq_<supf><mode>): Add '@' prefix.
	(mve_vshlcq_m_<supf><mode>): Likewise.
2024-10-18 07:41:14 +00:00
Christophe Lyon
2ddabb28db arm: [MVE intrinsics] add vshlc shape
This patch adds the vshlc shape description.

2024-08-28  Christophe Lyon  <christophe.lyon@linaro.org>

	gcc/
	* config/arm/arm-mve-builtins-shapes.cc (vshlc): New.
	* config/arm/arm-mve-builtins-shapes.h (vshlc): New.
2024-10-18 07:41:14 +00:00
Christophe Lyon
c7f95f2b53 arm: [MVE intrinsics] remove useless v[id]wdup expanders
Like with vddup/vidup, we use code_for_mve_q_wb_u_insn, so we can drop
the expanders and their declarations as builtins, now useless.

2024-08-28  Christophe Lyon  <christophe.lyon@linaro.org>

	gcc/
	* config/arm/arm-builtins.cc
	(arm_quinop_unone_unone_unone_unone_imm_pred_qualifiers): Delete.
	* config/arm/arm_mve_builtins.def (viwdupq_wb_u, vdwdupq_wb_u)
	(viwdupq_m_wb_u, vdwdupq_m_wb_u, viwdupq_m_n_u, vdwdupq_m_n_u)
	(vdwdupq_n_u, viwdupq_n_u): Delete.
	* config/arm/mve.md (mve_vdwdupq_n_u<mode>): Delete.
	(mve_vdwdupq_wb_u<mode>): Delete.
	(mve_vdwdupq_m_n_u<mode>): Delete.
	(mve_vdwdupq_m_wb_u<mode>): Delete.
2024-10-18 07:41:14 +00:00
Christophe Lyon
e65ab03fac arm: [MVE intrinsics] update v[id]wdup tests
Testing v[id]wdup overloads with '1' as argument for uint32_t* does
not make sense: this patch adds a new 'uint32_t *a' parameter to foo2
in such tests.

The difference with v[id]dup tests (where we removed 'foo2') is that
in 'foo1' we test the overload with a variable 'wrap' parameter (b)
and we need foo2 to test the overload with an immediate (1).

2024-08-28  Christophe Lyon  <christophe.lyon@linaro.org>

	gcc/testsuite/

	* gcc.target/arm/mve/intrinsics/vdwdupq_m_wb_u16.c: Use pointer
	parameter in foo2.
	* gcc.target/arm/mve/intrinsics/vdwdupq_m_wb_u32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vdwdupq_m_wb_u8.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vdwdupq_wb_u16.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vdwdupq_wb_u32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vdwdupq_wb_u8.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vdwdupq_x_wb_u16.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vdwdupq_x_wb_u32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vdwdupq_x_wb_u8.c: Likewise.
	* gcc.target/arm/mve/intrinsics/viwdupq_m_wb_u16.c: Likewise.
	* gcc.target/arm/mve/intrinsics/viwdupq_m_wb_u32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/viwdupq_m_wb_u8.c: Likewise.
	* gcc.target/arm/mve/intrinsics/viwdupq_wb_u16.c: Likewise.
	* gcc.target/arm/mve/intrinsics/viwdupq_wb_u32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/viwdupq_wb_u8.c: Likewise.
	* gcc.target/arm/mve/intrinsics/viwdupq_x_wb_u16.c: Likewise.
	* gcc.target/arm/mve/intrinsics/viwdupq_x_wb_u32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/viwdupq_x_wb_u8.c: Likewise.
2024-10-18 07:41:14 +00:00
Christophe Lyon
47ed70f758 arm: [MVE intrinsics] rework vdwdup viwdup
Implement vdwdup and viwdup using the new MVE builtins framework.

In order to share more code with viddup_impl, the patch swaps operands
1 and 2 in @mve_v[id]wdupq_m_wb_u<mode>_insn, so that the parameter
order is similar to what @mve_v[id]dupq_m_wb_u<mode>_insn uses.

2024-08-28  Christophe Lyon  <christophe.lyon@linaro.org>

	gcc/
	* config/arm/arm-mve-builtins-base.cc (viddup_impl): Add support
	for wrapping versions.
	(vdwdupq): New.
	(viwdupq): New.
	* config/arm/arm-mve-builtins-base.def (vdwdupq): New.
	(viwdupq): New.
	* config/arm/arm-mve-builtins-base.h (vdwdupq): New.
	(viwdupq): New.
	* config/arm/arm_mve.h (vdwdupq_m): Delete.
	(vdwdupq_u8): Delete.
	(vdwdupq_u32): Delete.
	(vdwdupq_u16): Delete.
	(viwdupq_m): Delete.
	(viwdupq_u8): Delete.
	(viwdupq_u32): Delete.
	(viwdupq_u16): Delete.
	(vdwdupq_x_u8): Delete.
	(vdwdupq_x_u16): Delete.
	(vdwdupq_x_u32): Delete.
	(viwdupq_x_u8): Delete.
	(viwdupq_x_u16): Delete.
	(viwdupq_x_u32): Delete.
	(vdwdupq_m_n_u8): Delete.
	(vdwdupq_m_n_u32): Delete.
	(vdwdupq_m_n_u16): Delete.
	(vdwdupq_m_wb_u8): Delete.
	(vdwdupq_m_wb_u32): Delete.
	(vdwdupq_m_wb_u16): Delete.
	(vdwdupq_n_u8): Delete.
	(vdwdupq_n_u32): Delete.
	(vdwdupq_n_u16): Delete.
	(vdwdupq_wb_u8): Delete.
	(vdwdupq_wb_u32): Delete.
	(vdwdupq_wb_u16): Delete.
	(viwdupq_m_n_u8): Delete.
	(viwdupq_m_n_u32): Delete.
	(viwdupq_m_n_u16): Delete.
	(viwdupq_m_wb_u8): Delete.
	(viwdupq_m_wb_u32): Delete.
	(viwdupq_m_wb_u16): Delete.
	(viwdupq_n_u8): Delete.
	(viwdupq_n_u32): Delete.
	(viwdupq_n_u16): Delete.
	(viwdupq_wb_u8): Delete.
	(viwdupq_wb_u32): Delete.
	(viwdupq_wb_u16): Delete.
	(vdwdupq_x_n_u8): Delete.
	(vdwdupq_x_n_u16): Delete.
	(vdwdupq_x_n_u32): Delete.
	(vdwdupq_x_wb_u8): Delete.
	(vdwdupq_x_wb_u16): Delete.
	(vdwdupq_x_wb_u32): Delete.
	(viwdupq_x_n_u8): Delete.
	(viwdupq_x_n_u16): Delete.
	(viwdupq_x_n_u32): Delete.
	(viwdupq_x_wb_u8): Delete.
	(viwdupq_x_wb_u16): Delete.
	(viwdupq_x_wb_u32): Delete.
	(__arm_vdwdupq_m_n_u8): Delete.
	(__arm_vdwdupq_m_n_u32): Delete.
	(__arm_vdwdupq_m_n_u16): Delete.
	(__arm_vdwdupq_m_wb_u8): Delete.
	(__arm_vdwdupq_m_wb_u32): Delete.
	(__arm_vdwdupq_m_wb_u16): Delete.
	(__arm_vdwdupq_n_u8): Delete.
	(__arm_vdwdupq_n_u32): Delete.
	(__arm_vdwdupq_n_u16): Delete.
	(__arm_vdwdupq_wb_u8): Delete.
	(__arm_vdwdupq_wb_u32): Delete.
	(__arm_vdwdupq_wb_u16): Delete.
	(__arm_viwdupq_m_n_u8): Delete.
	(__arm_viwdupq_m_n_u32): Delete.
	(__arm_viwdupq_m_n_u16): Delete.
	(__arm_viwdupq_m_wb_u8): Delete.
	(__arm_viwdupq_m_wb_u32): Delete.
	(__arm_viwdupq_m_wb_u16): Delete.
	(__arm_viwdupq_n_u8): Delete.
	(__arm_viwdupq_n_u32): Delete.
	(__arm_viwdupq_n_u16): Delete.
	(__arm_viwdupq_wb_u8): Delete.
	(__arm_viwdupq_wb_u32): Delete.
	(__arm_viwdupq_wb_u16): Delete.
	(__arm_vdwdupq_x_n_u8): Delete.
	(__arm_vdwdupq_x_n_u16): Delete.
	(__arm_vdwdupq_x_n_u32): Delete.
	(__arm_vdwdupq_x_wb_u8): Delete.
	(__arm_vdwdupq_x_wb_u16): Delete.
	(__arm_vdwdupq_x_wb_u32): Delete.
	(__arm_viwdupq_x_n_u8): Delete.
	(__arm_viwdupq_x_n_u16): Delete.
	(__arm_viwdupq_x_n_u32): Delete.
	(__arm_viwdupq_x_wb_u8): Delete.
	(__arm_viwdupq_x_wb_u16): Delete.
	(__arm_viwdupq_x_wb_u32): Delete.
	(__arm_vdwdupq_m): Delete.
	(__arm_vdwdupq_u8): Delete.
	(__arm_vdwdupq_u32): Delete.
	(__arm_vdwdupq_u16): Delete.
	(__arm_viwdupq_m): Delete.
	(__arm_viwdupq_u8): Delete.
	(__arm_viwdupq_u32): Delete.
	(__arm_viwdupq_u16): Delete.
	(__arm_vdwdupq_x_u8): Delete.
	(__arm_vdwdupq_x_u16): Delete.
	(__arm_vdwdupq_x_u32): Delete.
	(__arm_viwdupq_x_u8): Delete.
	(__arm_viwdupq_x_u16): Delete.
	(__arm_viwdupq_x_u32): Delete.
	* config/arm/mve.md (@mve_<mve_insn>q_m_wb_u<mode>_insn): Swap
	operands 1 and 2.
2024-10-18 07:41:14 +00:00
Christophe Lyon
ec11666805 arm: [MVE intrinsics] add vidwdup shape
This patch adds the vidwdup shape description for vdwdup and viwdup.

It is very similar to viddup, but accounts for the additional 'wrap'
scalar parameter.

2024-08-21  Christophe Lyon  <christophe.lyon@linaro.org>

	gcc/
	* config/arm/arm-mve-builtins-shapes.cc (vidwdup): New.
	* config/arm/arm-mve-builtins-shapes.h (vidwdup): New.
2024-10-18 07:41:13 +00:00
Christophe Lyon
42be837c36 arm: [MVE intrinsics] factorize vdwdup viwdup
Factorize vdwdup and viwdup so that they use the same parameterized
names.

Like with vddup and vidup, we do not bother with the corresponding
expanders, as we stop using them in a subsequent patch.

The patch also adds the missing attributes to vdwdupq_wb_u_insn and
viwdupq_wb_u_insn patterns.

2024-08-21  Christophe Lyon  <christophe.lyon@linaro.org>

	gcc/
	* config/arm/iterators.md (mve_insn): Add VIWDUPQ, VDWDUPQ,
	VIWDUPQ_M, VDWDUPQ_M.
	(VIDWDUPQ): New iterator.
	(VIDWDUPQ_M): New iterator.
	* config/arm/mve.md (mve_vdwdupq_wb_u<mode>_insn)
	(mve_viwdupq_wb_u<mode>_insn): Merge into ...
	(@mve_<mve_insn>q_wb_u<mode>_insn): ... this. Add missing
	mve_unpredicated_insn and mve_move attributes.
	(mve_vdwdupq_m_wb_u<mode>_insn, mve_viwdupq_m_wb_u<mode>_insn):
	Merge into ...
	(@mve_<mve_insn>q_m_wb_u<mode>_insn): ... this.
2024-10-18 07:41:13 +00:00
Christophe Lyon
2fd08f37d5 arm: [MVE intrinsics] fix checks of immediate arguments
As discussed in [1], it is better to use "su64" for immediates in
intrinsics signatures in order to provide better diagnostics
(erroneous constants are not truncated for instance).  This patch thus
uses su64 instead of ss32 in binary_lshift_unsigned,
binary_rshift_narrow, binary_rshift_narrow_unsigned, ternary_lshift,
ternary_rshift.

In addition, we fix cases where we called require_integer_immediate
whereas we just want to check that the argument is a scalar, and thus
use require_scalar_type in binary_acca_int32, binary_acca_int64,
unary_int32_acc.

Finally, in binary_lshift_unsigned we just want to check that 'imm' is
an immediate, not the optional predicates.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2024-August/660262.html

2024-08-21  Christophe Lyon  <christophe.lyon@linaro.org>

	gcc/
	* config/arm/arm-mve-builtins-shapes.cc (binary_acca_int32): Fix
	check of scalar argument.
	(binary_acca_int64): Likewise.
	(binary_lshift_unsigned): Likewise.
	(binary_rshift_narrow): Likewise.
	(binary_rshift_narrow_unsigned): Likewise.
	(ternary_lshift): Likewise.
	(ternary_rshift): Likewise.
	(unary_int32_acc): Likewise.
2024-10-18 07:41:13 +00:00
Christophe Lyon
f936ddb753 arm: [MVE intrinsics] remove v[id]dup expanders
We use code_for_mve_q_u_insn, rather than the expanders used by the
previous implementation, so we can remove the expanders and their
declaration as builtins.

2024-08-21  Christophe Lyon  <christophe.lyon@linaro.org>

	gcc/
	* config/arm/arm_mve_builtins.def (vddupq_n_u, vidupq_n_u)
	(vddupq_m_n_u, vidupq_m_n_u): Delete.
	* config/arm/mve.md (mve_vidupq_n_u<mode>, mve_vidupq_m_n_u<mode>)
	(mve_vddupq_n_u<mode>, mve_vddupq_m_n_u<mode>): Delete.
2024-10-18 07:41:13 +00:00
Christophe Lyon
faaf83b9bc arm: [MVE intrinsics] update v[id]dup tests
Testing v[id]dup overloads with '1' as argument for uint32_t* does not
make sense: instead of choosing the '_wb' overload, we choose the
'_n', but we already do that in the '_n' tests.

This patch removes all such bogus foo2 functions.

2024-08-28  Christophe Lyon  <christophe.lyon@linaro.org>

	gcc/testsuite/
	* gcc.target/arm/mve/intrinsics/vddupq_m_wb_u16.c: Remove foo2.
	* gcc.target/arm/mve/intrinsics/vddupq_m_wb_u32.c: Remove foo2.
	* gcc.target/arm/mve/intrinsics/vddupq_m_wb_u8.c: Remove foo2.
	* gcc.target/arm/mve/intrinsics/vddupq_wb_u16.c: Remove foo2.
	* gcc.target/arm/mve/intrinsics/vddupq_wb_u32.c: Remove foo2.
	* gcc.target/arm/mve/intrinsics/vddupq_wb_u8.c: Remove foo2.
	* gcc.target/arm/mve/intrinsics/vddupq_x_wb_u16.c: Remove foo2.
	* gcc.target/arm/mve/intrinsics/vddupq_x_wb_u32.c: Remove foo2.
	* gcc.target/arm/mve/intrinsics/vddupq_x_wb_u8.c: Remove foo2.
	* gcc.target/arm/mve/intrinsics/vidupq_m_wb_u16.c: Remove foo2.
	* gcc.target/arm/mve/intrinsics/vidupq_m_wb_u32.c: Remove foo2.
	* gcc.target/arm/mve/intrinsics/vidupq_m_wb_u8.c: Remove foo2.
	* gcc.target/arm/mve/intrinsics/vidupq_wb_u16.c: Remove foo2.
	* gcc.target/arm/mve/intrinsics/vidupq_wb_u32.c: Remove foo2.
	* gcc.target/arm/mve/intrinsics/vidupq_wb_u8.c: Remove foo2.
	* gcc.target/arm/mve/intrinsics/vidupq_x_wb_u16.c: Remove foo2.
	* gcc.target/arm/mve/intrinsics/vidupq_x_wb_u32.c: Remove foo2.
	* gcc.target/arm/mve/intrinsics/vidupq_x_wb_u8.c: Remove foo2.
2024-10-18 07:41:12 +00:00
Christophe Lyon
d7250b623f arm: [MVE intrinsics] rework vddup vidup
Implement vddup and vidup using the new MVE builtins framework.

We generate better code because we take advantage of the two outputs
produced by the v[id]dup instructions.

For instance, before:
	ldr	r3, [r0]
	sub	r2, r3, #8
	str	r2, [r0]
	mov	r2, r3
	vddup.u16	q3, r2, #1

now:
	ldr	r2, [r0]
	vddup.u16	q3, r2, #1
	str	r2, [r0]

2024-08-21  Christophe Lyon  <christophe.lyon@linaro.org>

	gcc/
	* config/arm/arm-mve-builtins-base.cc (class viddup_impl): New.
	(vddup): New.
	(vidup): New.
	* config/arm/arm-mve-builtins-base.def (vddupq): New.
	(vidupq): New.
	* config/arm/arm-mve-builtins-base.h (vddupq): New.
	(vidupq): New.
	* config/arm/arm_mve.h (vddupq_m): Delete.
	(vddupq_u8): Delete.
	(vddupq_u32): Delete.
	(vddupq_u16): Delete.
	(vidupq_m): Delete.
	(vidupq_u8): Delete.
	(vidupq_u32): Delete.
	(vidupq_u16): Delete.
	(vddupq_x_u8): Delete.
	(vddupq_x_u16): Delete.
	(vddupq_x_u32): Delete.
	(vidupq_x_u8): Delete.
	(vidupq_x_u16): Delete.
	(vidupq_x_u32): Delete.
	(vddupq_m_n_u8): Delete.
	(vddupq_m_n_u32): Delete.
	(vddupq_m_n_u16): Delete.
	(vddupq_m_wb_u8): Delete.
	(vddupq_m_wb_u16): Delete.
	(vddupq_m_wb_u32): Delete.
	(vddupq_n_u8): Delete.
	(vddupq_n_u32): Delete.
	(vddupq_n_u16): Delete.
	(vddupq_wb_u8): Delete.
	(vddupq_wb_u16): Delete.
	(vddupq_wb_u32): Delete.
	(vidupq_m_n_u8): Delete.
	(vidupq_m_n_u32): Delete.
	(vidupq_m_n_u16): Delete.
	(vidupq_m_wb_u8): Delete.
	(vidupq_m_wb_u16): Delete.
	(vidupq_m_wb_u32): Delete.
	(vidupq_n_u8): Delete.
	(vidupq_n_u32): Delete.
	(vidupq_n_u16): Delete.
	(vidupq_wb_u8): Delete.
	(vidupq_wb_u16): Delete.
	(vidupq_wb_u32): Delete.
	(vddupq_x_n_u8): Delete.
	(vddupq_x_n_u16): Delete.
	(vddupq_x_n_u32): Delete.
	(vddupq_x_wb_u8): Delete.
	(vddupq_x_wb_u16): Delete.
	(vddupq_x_wb_u32): Delete.
	(vidupq_x_n_u8): Delete.
	(vidupq_x_n_u16): Delete.
	(vidupq_x_n_u32): Delete.
	(vidupq_x_wb_u8): Delete.
	(vidupq_x_wb_u16): Delete.
	(vidupq_x_wb_u32): Delete.
	(__arm_vddupq_m_n_u8): Delete.
	(__arm_vddupq_m_n_u32): Delete.
	(__arm_vddupq_m_n_u16): Delete.
	(__arm_vddupq_m_wb_u8): Delete.
	(__arm_vddupq_m_wb_u16): Delete.
	(__arm_vddupq_m_wb_u32): Delete.
	(__arm_vddupq_n_u8): Delete.
	(__arm_vddupq_n_u32): Delete.
	(__arm_vddupq_n_u16): Delete.
	(__arm_vidupq_m_n_u8): Delete.
	(__arm_vidupq_m_n_u32): Delete.
	(__arm_vidupq_m_n_u16): Delete.
	(__arm_vidupq_n_u8): Delete.
	(__arm_vidupq_m_wb_u8): Delete.
	(__arm_vidupq_m_wb_u16): Delete.
	(__arm_vidupq_m_wb_u32): Delete.
	(__arm_vidupq_n_u32): Delete.
	(__arm_vidupq_n_u16): Delete.
	(__arm_vidupq_wb_u8): Delete.
	(__arm_vidupq_wb_u16): Delete.
	(__arm_vidupq_wb_u32): Delete.
	(__arm_vddupq_wb_u8): Delete.
	(__arm_vddupq_wb_u16): Delete.
	(__arm_vddupq_wb_u32): Delete.
	(__arm_vddupq_x_n_u8): Delete.
	(__arm_vddupq_x_n_u16): Delete.
	(__arm_vddupq_x_n_u32): Delete.
	(__arm_vddupq_x_wb_u8): Delete.
	(__arm_vddupq_x_wb_u16): Delete.
	(__arm_vddupq_x_wb_u32): Delete.
	(__arm_vidupq_x_n_u8): Delete.
	(__arm_vidupq_x_n_u16): Delete.
	(__arm_vidupq_x_n_u32): Delete.
	(__arm_vidupq_x_wb_u8): Delete.
	(__arm_vidupq_x_wb_u16): Delete.
	(__arm_vidupq_x_wb_u32): Delete.
	(__arm_vddupq_m): Delete.
	(__arm_vddupq_u8): Delete.
	(__arm_vddupq_u32): Delete.
	(__arm_vddupq_u16): Delete.
	(__arm_vidupq_m): Delete.
	(__arm_vidupq_u8): Delete.
	(__arm_vidupq_u32): Delete.
	(__arm_vidupq_u16): Delete.
	(__arm_vddupq_x_u8): Delete.
	(__arm_vddupq_x_u16): Delete.
	(__arm_vddupq_x_u32): Delete.
	(__arm_vidupq_x_u8): Delete.
	(__arm_vidupq_x_u16): Delete.
	(__arm_vidupq_x_u32): Delete.
2024-10-18 07:41:12 +00:00
Christophe Lyon
e38566afb4 arm: [MVE intrinsics] add viddup shape
This patch adds the viddup shape description for vidup and vddup.

This requires the addition of report_not_one_of and
function_checker::require_immediate_one_of to
gcc/config/arm/arm-mve-builtins.cc (they are copies of their aarch64
SVE counterparts).

This patch also introduces MODE_wb.

2024-08-21  Christophe Lyon  <christophe.lyon@linaro.org>

	gcc/

	* config/arm/arm-mve-builtins-shapes.cc (viddup): New.
	* config/arm/arm-mve-builtins-shapes.h (viddup): New.
	* config/arm/arm-mve-builtins.cc (report_not_one_of): New.
	(function_checker::require_immediate_one_of): New.
	* config/arm/arm-mve-builtins.def (wb): New mode.
	* config/arm/arm-mve-builtins.h (function_checker): Add
	require_immediate_one_of.
2024-10-18 07:41:12 +00:00
Christophe Lyon
387b121467 arm: [MVE intrinsics] factorize vddup vidup
Factorize vddup and vidup so that they use the same parameterized
names.

This patch updates only the (define_insn
"@mve_<mve_insn>q_u<mode>_insn") patterns and does not bother with the
(define_expand "mve_vidupq_n_u<mode>") ones, because a subsequent
patch avoids using them.

2024-08-21  Christophe Lyon  <christophe.lyon@linaro.org>

	gcc/
	* config/arm/iterators.md (mve_insn): Add VIDUPQ, VDDUPQ,
	VIDUPQ_M, VDDUPQ_M.
	(viddupq_op): New.
	(viddupq_m_op): New.
	(VIDDUPQ): New.
	(VIDDUPQ_M): New.
	* config/arm/mve.md (mve_vddupq_u<mode>_insn)
	(mve_vidupq_u<mode>_insn): Merge into ...
	(mve_<mve_insn>q_u<mode>_insn): ... this.
	(mve_vddupq_m_wb_u<mode>_insn, mve_vidupq_m_wb_u<mode>_insn):
	Merge into ...
	(mve_<mve_insn>q_m_wb_u<mode>_insn): ... this.
2024-10-18 07:41:11 +00:00
Christophe Lyon
e4366770dc arm: [MVE intrinsics] rework vctp
Implement vctp using the new MVE builtins framework.

2024-08-21  Christophe Lyon  <christophe.lyon@linaro.org>

gcc/ChangeLog:

	* config/arm/arm-mve-builtins-base.cc (class vctpq_impl): New.
	(vctp16q): New.
	(vctp32q): New.
	(vctp64q): New.
	(vctp8q): New.
	* config/arm/arm-mve-builtins-base.def (vctp16q): New.
	(vctp32q): New.
	(vctp64q): New.
	(vctp8q): New.
	* config/arm/arm-mve-builtins-base.h (vctp16q): New.
	(vctp32q): New.
	(vctp64q): New.
	(vctp8q): New.
	* config/arm/arm-mve-builtins-shapes.cc (vctp): New.
	* config/arm/arm-mve-builtins-shapes.h (vctp): New.
	* config/arm/arm-mve-builtins.cc
	(function_instance::has_inactive_argument): Add support for vctp.
	* config/arm/arm_mve.h (vctp16q): Delete.
	(vctp32q): Delete.
	(vctp64q): Delete.
	(vctp8q): Delete.
	(vctp8q_m): Delete.
	(vctp64q_m): Delete.
	(vctp32q_m): Delete.
	(vctp16q_m): Delete.
	(__arm_vctp16q): Delete.
	(__arm_vctp32q): Delete.
	(__arm_vctp64q): Delete.
	(__arm_vctp8q): Delete.
	(__arm_vctp8q_m): Delete.
	(__arm_vctp64q_m): Delete.
	(__arm_vctp32q_m): Delete.
	(__arm_vctp16q_m): Delete.
	* config/arm/mve.md (mve_vctp<MVE_vctp>q<MVE_vpred>): Add '@'
	prefix.
	(mve_vctp<MVE_vctp>q_m<MVE_vpred>): Likewise.
2024-10-18 07:41:11 +00:00
Christophe Lyon
da92e77ed4 arm: [MVE intrinsics] rework vorn
Implement vorn using the new MVE builtins framework.

2024-07-11  Christophe Lyon  <christophe.lyon@linaro.org>

	gcc/
	* config/arm/arm-mve-builtins-base.cc (vornq): New.
	* config/arm/arm-mve-builtins-base.def (vornq): New.
	* config/arm/arm-mve-builtins-base.h (vornq): New.
	* config/arm/arm-mve-builtins-functions.h (class
	unspec_based_mve_function_exact_insn_vorn): New.
	* config/arm/arm_mve.h (vornq): Delete.
	(vornq_m): Delete.
	(vornq_x): Delete.
	(vornq_u8): Delete.
	(vornq_s8): Delete.
	(vornq_u16): Delete.
	(vornq_s16): Delete.
	(vornq_u32): Delete.
	(vornq_s32): Delete.
	(vornq_f16): Delete.
	(vornq_f32): Delete.
	(vornq_m_s8): Delete.
	(vornq_m_s32): Delete.
	(vornq_m_s16): Delete.
	(vornq_m_u8): Delete.
	(vornq_m_u32): Delete.
	(vornq_m_u16): Delete.
	(vornq_m_f32): Delete.
	(vornq_m_f16): Delete.
	(vornq_x_s8): Delete.
	(vornq_x_s16): Delete.
	(vornq_x_s32): Delete.
	(vornq_x_u8): Delete.
	(vornq_x_u16): Delete.
	(vornq_x_u32): Delete.
	(vornq_x_f16): Delete.
	(vornq_x_f32): Delete.
	(__arm_vornq_u8): Delete.
	(__arm_vornq_s8): Delete.
	(__arm_vornq_u16): Delete.
	(__arm_vornq_s16): Delete.
	(__arm_vornq_u32): Delete.
	(__arm_vornq_s32): Delete.
	(__arm_vornq_m_s8): Delete.
	(__arm_vornq_m_s32): Delete.
	(__arm_vornq_m_s16): Delete.
	(__arm_vornq_m_u8): Delete.
	(__arm_vornq_m_u32): Delete.
	(__arm_vornq_m_u16): Delete.
	(__arm_vornq_x_s8): Delete.
	(__arm_vornq_x_s16): Delete.
	(__arm_vornq_x_s32): Delete.
	(__arm_vornq_x_u8): Delete.
	(__arm_vornq_x_u16): Delete.
	(__arm_vornq_x_u32): Delete.
	(__arm_vornq_f16): Delete.
	(__arm_vornq_f32): Delete.
	(__arm_vornq_m_f32): Delete.
	(__arm_vornq_m_f16): Delete.
	(__arm_vornq_x_f16): Delete.
	(__arm_vornq_x_f32): Delete.
	(__arm_vornq): Delete.
	(__arm_vornq_m): Delete.
	(__arm_vornq_x): Delete.
2024-10-18 07:41:11 +00:00