aarch64: Avoid false dependencies for SVE unary operations

For calls like:

        z0 = svabs_s8_x (p0, z1)

we previously generated:

        abs     z0.b, p0/m, z1.b

However, this creates a false dependency on z0 (the merge input).
This can lead to strange results in some cases, e.g. serialising
the operation behind arbitrary earlier operations, or preventing
two iterations of a loop from being executed in parallel.

This patch therefore ties the input to the output, using a MOVPRFX
if necessary and possible.  (The SVE2 unary long instructions do
not support MOVPRFX.)
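
For the example above, with the same register allocation that the ACLE
tests below use (result in z0, input in z1), the untied call now
becomes:

        movprfx z0, z1
        abs     z0.b, p0/m, z1.b

while a call whose input is already in the result register, such as:

        z0 = svabs_s8_x (p0, z0)

still generates the single instruction:

        abs     z0.b, p0/m, z0.b

In the insn patterns the new untied alternative is slightly disparaged
("?&w"), so the register allocator still ties the registers whenever
it can.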

When testing the patch, I hit a bug in the big-endian SVE move
optimisation in aarch64_maybe_expand_sve_subreg_move.  I don't
have an independent testcase for it, so I didn't split it out
into a separate patch.

gcc/
	* config/aarch64/aarch64.c (aarch64_maybe_expand_sve_subreg_move):
	Do not optimize LRA subregs.
	* config/aarch64/aarch64-sve.md
	(@aarch64_pred_<SVE_INT_UNARY:optab><mode>): Tie the input to the
	output.
	(@aarch64_sve_revbhw_<SVE_ALL:mode><PRED_HSD:mode>): Likewise.
	(*<ANY_EXTEND:optab><SVE_PARTIAL_I:mode><SVE_HSDI:mode>2): Likewise.
	(@aarch64_pred_sxt<SVE_FULL_HSDI:mode><SVE_PARTIAL_I:mode>): Likewise.
	(*cnot<mode>): Likewise.
	(@aarch64_pred_<SVE_COND_FP_UNARY:optab><mode>): Likewise.
	(@aarch64_sve_<optab>_nontrunc<SVE_FULL_F:mode><SVE_FULL_HSDI:mode>):
	Likewise.
	(@aarch64_sve_<optab>_trunc<VNx2DF_ONLY:mode><VNx4SI_ONLY:mode>):
	Likewise.
	(@aarch64_sve_<optab>_nonextend<SVE_FULL_HSDI:mode><SVE_FULL_F:mode>):
	Likewise.
	(@aarch64_sve_<optab>_extend<VNx4SI_ONLY:mode><VNx2DF_ONLY:mode>):
	Likewise.
	(@aarch64_sve_<optab>_trunc<SVE_FULL_SDF:mode><SVE_FULL_HSF:mode>):
	Likewise.
	(@aarch64_sve_<optab>_trunc<VNx4SF_ONLY:mode><VNx8BF_ONLY:mode>):
	Likewise.
	(@aarch64_sve_<optab>_nontrunc<SVE_FULL_HSF:mode><SVE_FULL_SDF:mode>):
	Likewise.
	* config/aarch64/aarch64-sve2.md
	(@aarch64_pred_<SVE2_COND_FP_UNARY_LONG:sve_fp_op><mode>): Likewise.
	(@aarch64_pred_<SVE2_COND_FP_UNARY_NARROWB:sve_fp_op><mode>): Likewise.
	(@aarch64_pred_<SVE2_U32_UNARY:sve_int_op><mode>): Likewise.
	(@aarch64_pred_<SVE2_COND_INT_UNARY_FP:sve_fp_op><mode>): Likewise.

gcc/testsuite/
	* gcc.target/aarch64/sve/acle/asm/abs_f16.c (abs_f16_x_untied): Expect
	a MOVPRFX instruction.
	* gcc.target/aarch64/sve/acle/asm/abs_f32.c (abs_f32_x_untied): Ditto.
	* gcc.target/aarch64/sve/acle/asm/abs_f64.c (abs_f64_x_untied): Ditto.
	* gcc.target/aarch64/sve/acle/asm/abs_s16.c (abs_s16_x_untied): Ditto.
	* gcc.target/aarch64/sve/acle/asm/abs_s32.c (abs_s32_x_untied): Ditto.
	* gcc.target/aarch64/sve/acle/asm/abs_s64.c (abs_s64_x_untied): Ditto.
	* gcc.target/aarch64/sve/acle/asm/abs_s8.c (abs_s8_x_untied): Ditto.
	* gcc.target/aarch64/sve/acle/asm/cls_s16.c (cls_s16_x): Ditto.
	* gcc.target/aarch64/sve/acle/asm/cls_s32.c (cls_s32_x): Ditto.
	* gcc.target/aarch64/sve/acle/asm/cls_s64.c (cls_s64_x): Ditto.
	* gcc.target/aarch64/sve/acle/asm/cls_s8.c (cls_s8_x): Ditto.
	* gcc.target/aarch64/sve/acle/asm/clz_s16.c (clz_s16_x): Ditto.
	* gcc.target/aarch64/sve/acle/asm/clz_s32.c (clz_s32_x): Ditto.
	* gcc.target/aarch64/sve/acle/asm/clz_s64.c (clz_s64_x): Ditto.
	* gcc.target/aarch64/sve/acle/asm/clz_s8.c (clz_s8_x): Ditto.
	* gcc.target/aarch64/sve/acle/asm/clz_u16.c (clz_u16_x_untied): Ditto.
	* gcc.target/aarch64/sve/acle/asm/clz_u32.c (clz_u32_x_untied): Ditto.
	* gcc.target/aarch64/sve/acle/asm/clz_u64.c (clz_u64_x_untied): Ditto.
	* gcc.target/aarch64/sve/acle/asm/clz_u8.c (clz_u8_x_untied): Ditto.
	* gcc.target/aarch64/sve/acle/asm/cnot_s16.c (cnot_s16_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/cnot_s32.c (cnot_s32_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/cnot_s64.c (cnot_s64_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/cnot_s8.c (cnot_s8_x_untied): Ditto.
	* gcc.target/aarch64/sve/acle/asm/cnot_u16.c (cnot_u16_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/cnot_u32.c (cnot_u32_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/cnot_u64.c (cnot_u64_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/cnot_u8.c (cnot_u8_x_untied): Ditto.
	* gcc.target/aarch64/sve/acle/asm/cnt_bf16.c (cnt_bf16_x): Ditto.
	* gcc.target/aarch64/sve/acle/asm/cnt_f16.c (cnt_f16_x): Ditto.
	* gcc.target/aarch64/sve/acle/asm/cnt_f32.c (cnt_f32_x): Ditto.
	* gcc.target/aarch64/sve/acle/asm/cnt_f64.c (cnt_f64_x): Ditto.
	* gcc.target/aarch64/sve/acle/asm/cnt_s16.c (cnt_s16_x): Ditto.
	* gcc.target/aarch64/sve/acle/asm/cnt_s32.c (cnt_s32_x): Ditto.
	* gcc.target/aarch64/sve/acle/asm/cnt_s64.c (cnt_s64_x): Ditto.
	* gcc.target/aarch64/sve/acle/asm/cnt_s8.c (cnt_s8_x): Ditto.
	* gcc.target/aarch64/sve/acle/asm/cnt_u16.c (cnt_u16_x_untied): Ditto.
	* gcc.target/aarch64/sve/acle/asm/cnt_u32.c (cnt_u32_x_untied): Ditto.
	* gcc.target/aarch64/sve/acle/asm/cnt_u64.c (cnt_u64_x_untied): Ditto.
	* gcc.target/aarch64/sve/acle/asm/cnt_u8.c (cnt_u8_x_untied): Ditto.
	* gcc.target/aarch64/sve/acle/asm/cvt_bf16.c (cvt_bf16_f32_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/cvt_f16.c (cvt_f16_f32_x_untied)
	(cvt_f16_f64_x_untied, cvt_f16_s16_x_untied, cvt_f16_s32_x_untied)
	(cvt_f16_s64_x_untied, cvt_f16_u16_x_untied, cvt_f16_u32_x_untied)
	(cvt_f16_u64_x_untied): Ditto.
	* gcc.target/aarch64/sve/acle/asm/cvt_f32.c (cvt_f32_f16_x_untied)
	(cvt_f32_f64_x_untied, cvt_f32_s16_x_untied, cvt_f32_s32_x_untied)
	(cvt_f32_s64_x_untied, cvt_f32_u16_x_untied, cvt_f32_u32_x_untied)
	(cvt_f32_u64_x_untied): Ditto.
	* gcc.target/aarch64/sve/acle/asm/cvt_f64.c (cvt_f64_f16_x_untied)
	(cvt_f64_f32_x_untied, cvt_f64_s16_x_untied, cvt_f64_s32_x_untied)
	(cvt_f64_s64_x_untied, cvt_f64_u16_x_untied, cvt_f64_u32_x_untied)
	(cvt_f64_u64_x_untied): Ditto.
	* gcc.target/aarch64/sve/acle/asm/cvt_s16.c (cvt_s16_f16_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/cvt_s32.c (cvt_s32_f16_x_untied)
	(cvt_s32_f32_x_untied, cvt_s32_f64_x_untied): Ditto.
	* gcc.target/aarch64/sve/acle/asm/cvt_s64.c (cvt_s64_f16_x_untied)
	(cvt_s64_f32_x_untied, cvt_s64_f64_x_untied): Ditto.
	* gcc.target/aarch64/sve/acle/asm/cvt_u16.c (cvt_u16_f16_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/cvt_u32.c (cvt_u32_f16_x_untied)
	(cvt_u32_f32_x_untied, cvt_u32_f64_x_untied): Ditto.
	* gcc.target/aarch64/sve/acle/asm/cvt_u64.c (cvt_u64_f16_x_untied)
	(cvt_u64_f32_x_untied, cvt_u64_f64_x_untied): Ditto.
	* gcc.target/aarch64/sve/acle/asm/extb_s16.c (extb_s16_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/extb_s32.c (extb_s32_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/extb_s64.c (extb_s64_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/exth_s32.c (exth_s32_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/exth_s64.c (exth_s64_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/extw_s64.c (extw_s64_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/neg_f16.c (neg_f16_x_untied): Ditto.
	* gcc.target/aarch64/sve/acle/asm/neg_f32.c (neg_f32_x_untied): Ditto.
	* gcc.target/aarch64/sve/acle/asm/neg_f64.c (neg_f64_x_untied): Ditto.
	* gcc.target/aarch64/sve/acle/asm/neg_s16.c (neg_s16_x_untied): Ditto.
	* gcc.target/aarch64/sve/acle/asm/neg_s32.c (neg_s32_x_untied): Ditto.
	* gcc.target/aarch64/sve/acle/asm/neg_s64.c (neg_s64_x_untied): Ditto.
	* gcc.target/aarch64/sve/acle/asm/neg_s8.c (neg_s8_x_untied): Ditto.
	* gcc.target/aarch64/sve/acle/asm/not_s16.c (not_s16_x_untied): Ditto.
	* gcc.target/aarch64/sve/acle/asm/not_s32.c (not_s32_x_untied): Ditto.
	* gcc.target/aarch64/sve/acle/asm/not_s64.c (not_s64_x_untied): Ditto.
	* gcc.target/aarch64/sve/acle/asm/not_s8.c (not_s8_x_untied): Ditto.
	* gcc.target/aarch64/sve/acle/asm/not_u16.c (not_u16_x_untied): Ditto.
	* gcc.target/aarch64/sve/acle/asm/not_u32.c (not_u32_x_untied): Ditto.
	* gcc.target/aarch64/sve/acle/asm/not_u64.c (not_u64_x_untied): Ditto.
	* gcc.target/aarch64/sve/acle/asm/not_u8.c (not_u8_x_untied): Ditto.
	* gcc.target/aarch64/sve/acle/asm/rbit_s16.c (rbit_s16_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/rbit_s32.c (rbit_s32_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/rbit_s64.c (rbit_s64_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/rbit_s8.c (rbit_s8_x_untied): Ditto.
	* gcc.target/aarch64/sve/acle/asm/rbit_u16.c (rbit_u16_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/rbit_u32.c (rbit_u32_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/rbit_u64.c (rbit_u64_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/rbit_u8.c (rbit_u8_x_untied): Ditto.
	* gcc.target/aarch64/sve/acle/asm/recpx_f16.c (recpx_f16_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/recpx_f32.c (recpx_f32_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/recpx_f64.c (recpx_f64_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/revb_s16.c (revb_s16_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/revb_s32.c (revb_s32_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/revb_s64.c (revb_s64_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/revb_u16.c (revb_u16_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/revb_u32.c (revb_u32_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/revb_u64.c (revb_u64_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/revh_s32.c (revh_s32_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/revh_s64.c (revh_s64_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/revh_u32.c (revh_u32_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/revh_u64.c (revh_u64_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/revw_s64.c (revw_s64_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/revw_u64.c (revw_u64_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/rinta_f16.c (rinta_f16_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/rinta_f32.c (rinta_f32_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/rinta_f64.c (rinta_f64_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/rinti_f16.c (rinti_f16_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/rinti_f32.c (rinti_f32_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/rinti_f64.c (rinti_f64_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/rintm_f16.c (rintm_f16_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/rintm_f32.c (rintm_f32_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/rintm_f64.c (rintm_f64_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/rintn_f16.c (rintn_f16_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/rintn_f32.c (rintn_f32_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/rintn_f64.c (rintn_f64_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/rintp_f16.c (rintp_f16_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/rintp_f32.c (rintp_f32_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/rintp_f64.c (rintp_f64_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/rintx_f16.c (rintx_f16_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/rintx_f32.c (rintx_f32_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/rintx_f64.c (rintx_f64_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/rintz_f16.c (rintz_f16_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/rintz_f32.c (rintz_f32_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/rintz_f64.c (rintz_f64_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/sqrt_f16.c (sqrt_f16_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/sqrt_f32.c (sqrt_f32_x_untied):
	Ditto.
	* gcc.target/aarch64/sve/acle/asm/sqrt_f64.c (sqrt_f64_x_untied):
	Ditto.
	* gcc.target/aarch64/sve2/acle/asm/cvtx_f32.c (cvtx_f32_f64_x_untied):
	Ditto.
	* gcc.target/aarch64/sve2/acle/asm/logb_f16.c (logb_f16_x_untied):
	Ditto.
	* gcc.target/aarch64/sve2/acle/asm/logb_f32.c (logb_f32_x_untied):
	Ditto.
	* gcc.target/aarch64/sve2/acle/asm/logb_f64.c (logb_f64_x_untied):
	Ditto.
	* gcc.target/aarch64/sve2/acle/asm/qabs_s16.c (qabs_s16_x_untied):
	Ditto.
	* gcc.target/aarch64/sve2/acle/asm/qabs_s32.c (qabs_s32_x_untied):
	Ditto.
	* gcc.target/aarch64/sve2/acle/asm/qabs_s64.c (qabs_s64_x_untied):
	Ditto.
	* gcc.target/aarch64/sve2/acle/asm/qabs_s8.c (qabs_s8_x_untied):
	Ditto.
	* gcc.target/aarch64/sve2/acle/asm/qneg_s16.c (qneg_s16_x_untied):
	Ditto.
	* gcc.target/aarch64/sve2/acle/asm/qneg_s32.c (qneg_s32_x_untied):
	Ditto.
	* gcc.target/aarch64/sve2/acle/asm/qneg_s64.c (qneg_s64_x_untied):
	Ditto.
	* gcc.target/aarch64/sve2/acle/asm/qneg_s8.c (qneg_s8_x_untied):
	Ditto.
	* gcc.target/aarch64/sve2/acle/asm/recpe_u32.c (recpe_u32_x_untied):
	Ditto.
	* gcc.target/aarch64/sve2/acle/asm/rsqrte_u32.c (rsqrte_u32_x_untied):
	Ditto.
	* gcc.target/aarch64/sve2/acle/asm/cvtlt_f32.c
	(cvtlt_f32_f16_x_untied): Expect a MOV instruction.
	* gcc.target/aarch64/sve2/acle/asm/cvtlt_f64.c
	(cvtlt_f64_f32_x_untied): Likewise.
Richard Sandiford, 2020-11-25 16:14:20 +00:00
commit a4d9837ee4 (parent 4aff491ffc)
136 changed files with 319 additions and 74 deletions

gcc/config/aarch64/aarch64-sve.md

@ -2925,14 +2925,17 @@
;; Integer unary arithmetic predicated with a PTRUE.
(define_insn "@aarch64_pred_<optab><mode>"
[(set (match_operand:SVE_I 0 "register_operand" "=w")
[(set (match_operand:SVE_I 0 "register_operand" "=w, ?&w")
(unspec:SVE_I
[(match_operand:<VPRED> 1 "register_operand" "Upl")
[(match_operand:<VPRED> 1 "register_operand" "Upl, Upl")
(SVE_INT_UNARY:SVE_I
(match_operand:SVE_I 2 "register_operand" "w"))]
(match_operand:SVE_I 2 "register_operand" "0, w"))]
UNSPEC_PRED_X))]
"TARGET_SVE"
"<sve_int_op>\t%0.<Vetype>, %1/m, %2.<Vetype>"
"@
<sve_int_op>\t%0.<Vetype>, %1/m, %2.<Vetype>
movprfx\t%0, %2\;<sve_int_op>\t%0.<Vetype>, %1/m, %2.<Vetype>"
[(set_attr "movprfx" "*,yes")]
)
;; Predicated integer unary arithmetic with merging.
@ -2998,15 +3001,18 @@
;; Predicated integer unary operations.
(define_insn "@aarch64_pred_<optab><mode>"
[(set (match_operand:SVE_FULL_I 0 "register_operand" "=w")
[(set (match_operand:SVE_FULL_I 0 "register_operand" "=w, ?&w")
(unspec:SVE_FULL_I
[(match_operand:<VPRED> 1 "register_operand" "Upl")
[(match_operand:<VPRED> 1 "register_operand" "Upl, Upl")
(unspec:SVE_FULL_I
[(match_operand:SVE_FULL_I 2 "register_operand" "w")]
[(match_operand:SVE_FULL_I 2 "register_operand" "0, w")]
SVE_INT_UNARY)]
UNSPEC_PRED_X))]
"TARGET_SVE && <elem_bits> >= <min_elem_bits>"
"<sve_int_op>\t%0.<Vetype>, %1/m, %2.<Vetype>"
"@
<sve_int_op>\t%0.<Vetype>, %1/m, %2.<Vetype>
movprfx\t%0, %2\;<sve_int_op>\t%0.<Vetype>, %1/m, %2.<Vetype>"
[(set_attr "movprfx" "*,yes")]
)
;; Another way of expressing the REVB, REVH and REVW patterns, with this
@ -3014,15 +3020,18 @@
;; of lanes and the data mode decides the granularity of the reversal within
;; each lane.
(define_insn "@aarch64_sve_revbhw_<SVE_ALL:mode><PRED_HSD:mode>"
[(set (match_operand:SVE_ALL 0 "register_operand" "=w")
[(set (match_operand:SVE_ALL 0 "register_operand" "=w, ?&w")
(unspec:SVE_ALL
[(match_operand:PRED_HSD 1 "register_operand" "Upl")
[(match_operand:PRED_HSD 1 "register_operand" "Upl, Upl")
(unspec:SVE_ALL
[(match_operand:SVE_ALL 2 "register_operand" "w")]
[(match_operand:SVE_ALL 2 "register_operand" "0, w")]
UNSPEC_REVBHW)]
UNSPEC_PRED_X))]
"TARGET_SVE && <PRED_HSD:elem_bits> > <SVE_ALL:container_bits>"
"rev<SVE_ALL:Vcwtype>\t%0.<PRED_HSD:Vetype>, %1/m, %2.<PRED_HSD:Vetype>"
"@
rev<SVE_ALL:Vcwtype>\t%0.<PRED_HSD:Vetype>, %1/m, %2.<PRED_HSD:Vetype>
movprfx\t%0, %2\;rev<SVE_ALL:Vcwtype>\t%0.<PRED_HSD:Vetype>, %1/m, %2.<PRED_HSD:Vetype>"
[(set_attr "movprfx" "*,yes")]
)
;; Predicated integer unary operations with merging.
@ -3071,28 +3080,34 @@
;; Predicated sign and zero extension from a narrower mode.
(define_insn "*<optab><SVE_PARTIAL_I:mode><SVE_HSDI:mode>2"
[(set (match_operand:SVE_HSDI 0 "register_operand" "=w")
[(set (match_operand:SVE_HSDI 0 "register_operand" "=w, ?&w")
(unspec:SVE_HSDI
[(match_operand:<SVE_HSDI:VPRED> 1 "register_operand" "Upl")
[(match_operand:<SVE_HSDI:VPRED> 1 "register_operand" "Upl, Upl")
(ANY_EXTEND:SVE_HSDI
(match_operand:SVE_PARTIAL_I 2 "register_operand" "w"))]
(match_operand:SVE_PARTIAL_I 2 "register_operand" "0, w"))]
UNSPEC_PRED_X))]
"TARGET_SVE && (~<SVE_HSDI:narrower_mask> & <SVE_PARTIAL_I:self_mask>) == 0"
"<su>xt<SVE_PARTIAL_I:Vesize>\t%0.<SVE_HSDI:Vetype>, %1/m, %2.<SVE_HSDI:Vetype>"
"@
<su>xt<SVE_PARTIAL_I:Vesize>\t%0.<SVE_HSDI:Vetype>, %1/m, %2.<SVE_HSDI:Vetype>
movprfx\t%0, %2\;<su>xt<SVE_PARTIAL_I:Vesize>\t%0.<SVE_HSDI:Vetype>, %1/m, %2.<SVE_HSDI:Vetype>"
[(set_attr "movprfx" "*,yes")]
)
;; Predicated truncate-and-sign-extend operations.
(define_insn "@aarch64_pred_sxt<SVE_FULL_HSDI:mode><SVE_PARTIAL_I:mode>"
[(set (match_operand:SVE_FULL_HSDI 0 "register_operand" "=w")
[(set (match_operand:SVE_FULL_HSDI 0 "register_operand" "=w, ?&w")
(unspec:SVE_FULL_HSDI
[(match_operand:<SVE_FULL_HSDI:VPRED> 1 "register_operand" "Upl")
[(match_operand:<SVE_FULL_HSDI:VPRED> 1 "register_operand" "Upl, Upl")
(sign_extend:SVE_FULL_HSDI
(truncate:SVE_PARTIAL_I
(match_operand:SVE_FULL_HSDI 2 "register_operand" "w")))]
(match_operand:SVE_FULL_HSDI 2 "register_operand" "0, w")))]
UNSPEC_PRED_X))]
"TARGET_SVE
&& (~<SVE_FULL_HSDI:narrower_mask> & <SVE_PARTIAL_I:self_mask>) == 0"
"sxt<SVE_PARTIAL_I:Vesize>\t%0.<SVE_FULL_HSDI:Vetype>, %1/m, %2.<SVE_FULL_HSDI:Vetype>"
"@
sxt<SVE_PARTIAL_I:Vesize>\t%0.<SVE_FULL_HSDI:Vetype>, %1/m, %2.<SVE_FULL_HSDI:Vetype>
movprfx\t%0, %2\;sxt<SVE_PARTIAL_I:Vesize>\t%0.<SVE_FULL_HSDI:Vetype>, %1/m, %2.<SVE_FULL_HSDI:Vetype>"
[(set_attr "movprfx" "*,yes")]
)
;; Predicated truncate-and-sign-extend operations with merging.
@ -3212,20 +3227,23 @@
)
(define_insn "*cnot<mode>"
[(set (match_operand:SVE_FULL_I 0 "register_operand" "=w")
[(set (match_operand:SVE_FULL_I 0 "register_operand" "=w, ?&w")
(unspec:SVE_FULL_I
[(unspec:<VPRED>
[(match_operand:<VPRED> 1 "register_operand" "Upl")
[(match_operand:<VPRED> 1 "register_operand" "Upl, Upl")
(match_operand:SI 5 "aarch64_sve_ptrue_flag")
(eq:<VPRED>
(match_operand:SVE_FULL_I 2 "register_operand" "w")
(match_operand:SVE_FULL_I 2 "register_operand" "0, w")
(match_operand:SVE_FULL_I 3 "aarch64_simd_imm_zero"))]
UNSPEC_PRED_Z)
(match_operand:SVE_FULL_I 4 "aarch64_simd_imm_one")
(match_dup 3)]
UNSPEC_SEL))]
"TARGET_SVE"
"cnot\t%0.<Vetype>, %1/m, %2.<Vetype>"
"@
cnot\t%0.<Vetype>, %1/m, %2.<Vetype>
movprfx\t%0, %2\;cnot\t%0.<Vetype>, %1/m, %2.<Vetype>"
[(set_attr "movprfx" "*,yes")]
)
;; Predicated logical inverse with merging.
@ -3383,14 +3401,17 @@
;; Predicated floating-point unary operations.
(define_insn "@aarch64_pred_<optab><mode>"
[(set (match_operand:SVE_FULL_F 0 "register_operand" "=w")
[(set (match_operand:SVE_FULL_F 0 "register_operand" "=w, ?&w")
(unspec:SVE_FULL_F
[(match_operand:<VPRED> 1 "register_operand" "Upl")
[(match_operand:<VPRED> 1 "register_operand" "Upl, Upl")
(match_operand:SI 3 "aarch64_sve_gp_strictness")
(match_operand:SVE_FULL_F 2 "register_operand" "w")]
(match_operand:SVE_FULL_F 2 "register_operand" "0, w")]
SVE_COND_FP_UNARY))]
"TARGET_SVE"
"<sve_fp_op>\t%0.<Vetype>, %1/m, %2.<Vetype>"
"@
<sve_fp_op>\t%0.<Vetype>, %1/m, %2.<Vetype>
movprfx\t%0, %2\;<sve_fp_op>\t%0.<Vetype>, %1/m, %2.<Vetype>"
[(set_attr "movprfx" "*,yes")]
)
;; Predicated floating-point unary arithmetic with merging.
@ -8575,26 +8596,32 @@
;; Predicated float-to-integer conversion, either to the same width or wider.
(define_insn "@aarch64_sve_<optab>_nontrunc<SVE_FULL_F:mode><SVE_FULL_HSDI:mode>"
[(set (match_operand:SVE_FULL_HSDI 0 "register_operand" "=w")
[(set (match_operand:SVE_FULL_HSDI 0 "register_operand" "=w, ?&w")
(unspec:SVE_FULL_HSDI
[(match_operand:<SVE_FULL_HSDI:VPRED> 1 "register_operand" "Upl")
[(match_operand:<SVE_FULL_HSDI:VPRED> 1 "register_operand" "Upl, Upl")
(match_operand:SI 3 "aarch64_sve_gp_strictness")
(match_operand:SVE_FULL_F 2 "register_operand" "w")]
(match_operand:SVE_FULL_F 2 "register_operand" "0, w")]
SVE_COND_FCVTI))]
"TARGET_SVE && <SVE_FULL_HSDI:elem_bits> >= <SVE_FULL_F:elem_bits>"
"fcvtz<su>\t%0.<SVE_FULL_HSDI:Vetype>, %1/m, %2.<SVE_FULL_F:Vetype>"
"@
fcvtz<su>\t%0.<SVE_FULL_HSDI:Vetype>, %1/m, %2.<SVE_FULL_F:Vetype>
movprfx\t%0, %2\;fcvtz<su>\t%0.<SVE_FULL_HSDI:Vetype>, %1/m, %2.<SVE_FULL_F:Vetype>"
[(set_attr "movprfx" "*,yes")]
)
;; Predicated narrowing float-to-integer conversion.
(define_insn "@aarch64_sve_<optab>_trunc<VNx2DF_ONLY:mode><VNx4SI_ONLY:mode>"
[(set (match_operand:VNx4SI_ONLY 0 "register_operand" "=w")
[(set (match_operand:VNx4SI_ONLY 0 "register_operand" "=w, ?&w")
(unspec:VNx4SI_ONLY
[(match_operand:VNx2BI 1 "register_operand" "Upl")
[(match_operand:VNx2BI 1 "register_operand" "Upl, Upl")
(match_operand:SI 3 "aarch64_sve_gp_strictness")
(match_operand:VNx2DF_ONLY 2 "register_operand" "w")]
(match_operand:VNx2DF_ONLY 2 "register_operand" "0, w")]
SVE_COND_FCVTI))]
"TARGET_SVE"
"fcvtz<su>\t%0.<VNx4SI_ONLY:Vetype>, %1/m, %2.<VNx2DF_ONLY:Vetype>"
"@
fcvtz<su>\t%0.<VNx4SI_ONLY:Vetype>, %1/m, %2.<VNx2DF_ONLY:Vetype>
movprfx\t%0, %2\;fcvtz<su>\t%0.<VNx4SI_ONLY:Vetype>, %1/m, %2.<VNx2DF_ONLY:Vetype>"
[(set_attr "movprfx" "*,yes")]
)
;; Predicated float-to-integer conversion with merging, either to the same
@ -8756,26 +8783,32 @@
;; Predicated integer-to-float conversion, either to the same width or
;; narrower.
(define_insn "@aarch64_sve_<optab>_nonextend<SVE_FULL_HSDI:mode><SVE_FULL_F:mode>"
[(set (match_operand:SVE_FULL_F 0 "register_operand" "=w")
[(set (match_operand:SVE_FULL_F 0 "register_operand" "=w, ?&w")
(unspec:SVE_FULL_F
[(match_operand:<SVE_FULL_HSDI:VPRED> 1 "register_operand" "Upl")
[(match_operand:<SVE_FULL_HSDI:VPRED> 1 "register_operand" "Upl, Upl")
(match_operand:SI 3 "aarch64_sve_gp_strictness")
(match_operand:SVE_FULL_HSDI 2 "register_operand" "w")]
(match_operand:SVE_FULL_HSDI 2 "register_operand" "0, w")]
SVE_COND_ICVTF))]
"TARGET_SVE && <SVE_FULL_HSDI:elem_bits> >= <SVE_FULL_F:elem_bits>"
"<su>cvtf\t%0.<SVE_FULL_F:Vetype>, %1/m, %2.<SVE_FULL_HSDI:Vetype>"
"@
<su>cvtf\t%0.<SVE_FULL_F:Vetype>, %1/m, %2.<SVE_FULL_HSDI:Vetype>
movprfx\t%0, %2\;<su>cvtf\t%0.<SVE_FULL_F:Vetype>, %1/m, %2.<SVE_FULL_HSDI:Vetype>"
[(set_attr "movprfx" "*,yes")]
)
;; Predicated widening integer-to-float conversion.
(define_insn "@aarch64_sve_<optab>_extend<VNx4SI_ONLY:mode><VNx2DF_ONLY:mode>"
[(set (match_operand:VNx2DF_ONLY 0 "register_operand" "=w")
[(set (match_operand:VNx2DF_ONLY 0 "register_operand" "=w, ?&w")
(unspec:VNx2DF_ONLY
[(match_operand:VNx2BI 1 "register_operand" "Upl")
[(match_operand:VNx2BI 1 "register_operand" "Upl, Upl")
(match_operand:SI 3 "aarch64_sve_gp_strictness")
(match_operand:VNx4SI_ONLY 2 "register_operand" "w")]
(match_operand:VNx4SI_ONLY 2 "register_operand" "0, w")]
SVE_COND_ICVTF))]
"TARGET_SVE"
"<su>cvtf\t%0.<VNx2DF_ONLY:Vetype>, %1/m, %2.<VNx4SI_ONLY:Vetype>"
"@
<su>cvtf\t%0.<VNx2DF_ONLY:Vetype>, %1/m, %2.<VNx4SI_ONLY:Vetype>
movprfx\t%0, %2\;<su>cvtf\t%0.<VNx2DF_ONLY:Vetype>, %1/m, %2.<VNx4SI_ONLY:Vetype>"
[(set_attr "movprfx" "*,yes")]
)
;; Predicated integer-to-float conversion with merging, either to the same
@ -8948,14 +8981,17 @@
;; Predicated float-to-float truncation.
(define_insn "@aarch64_sve_<optab>_trunc<SVE_FULL_SDF:mode><SVE_FULL_HSF:mode>"
[(set (match_operand:SVE_FULL_HSF 0 "register_operand" "=w")
[(set (match_operand:SVE_FULL_HSF 0 "register_operand" "=w, ?&w")
(unspec:SVE_FULL_HSF
[(match_operand:<SVE_FULL_SDF:VPRED> 1 "register_operand" "Upl")
[(match_operand:<SVE_FULL_SDF:VPRED> 1 "register_operand" "Upl, Upl")
(match_operand:SI 3 "aarch64_sve_gp_strictness")
(match_operand:SVE_FULL_SDF 2 "register_operand" "w")]
(match_operand:SVE_FULL_SDF 2 "register_operand" "0, w")]
SVE_COND_FCVT))]
"TARGET_SVE && <SVE_FULL_SDF:elem_bits> > <SVE_FULL_HSF:elem_bits>"
"fcvt\t%0.<SVE_FULL_HSF:Vetype>, %1/m, %2.<SVE_FULL_SDF:Vetype>"
"@
fcvt\t%0.<SVE_FULL_HSF:Vetype>, %1/m, %2.<SVE_FULL_SDF:Vetype>
movprfx\t%0, %2\;fcvt\t%0.<SVE_FULL_HSF:Vetype>, %1/m, %2.<SVE_FULL_SDF:Vetype>"
[(set_attr "movprfx" "*,yes")]
)
;; Predicated float-to-float truncation with merging.
@ -9002,14 +9038,17 @@
;; Predicated BFCVT.
(define_insn "@aarch64_sve_<optab>_trunc<VNx4SF_ONLY:mode><VNx8BF_ONLY:mode>"
[(set (match_operand:VNx8BF_ONLY 0 "register_operand" "=w")
[(set (match_operand:VNx8BF_ONLY 0 "register_operand" "=w, ?&w")
(unspec:VNx8BF_ONLY
[(match_operand:VNx4BI 1 "register_operand" "Upl")
[(match_operand:VNx4BI 1 "register_operand" "Upl, Upl")
(match_operand:SI 3 "aarch64_sve_gp_strictness")
(match_operand:VNx4SF_ONLY 2 "register_operand" "w")]
(match_operand:VNx4SF_ONLY 2 "register_operand" "0, w")]
SVE_COND_FCVT))]
"TARGET_SVE_BF16"
"bfcvt\t%0.h, %1/m, %2.s"
"@
bfcvt\t%0.h, %1/m, %2.s
movprfx\t%0, %2\;bfcvt\t%0.h, %1/m, %2.s"
[(set_attr "movprfx" "*,yes")]
)
;; Predicated BFCVT with merging.
@ -9099,14 +9138,17 @@
;; Predicated float-to-float extension.
(define_insn "@aarch64_sve_<optab>_nontrunc<SVE_FULL_HSF:mode><SVE_FULL_SDF:mode>"
[(set (match_operand:SVE_FULL_SDF 0 "register_operand" "=w")
[(set (match_operand:SVE_FULL_SDF 0 "register_operand" "=w, ?&w")
(unspec:SVE_FULL_SDF
[(match_operand:<SVE_FULL_SDF:VPRED> 1 "register_operand" "Upl")
[(match_operand:<SVE_FULL_SDF:VPRED> 1 "register_operand" "Upl, Upl")
(match_operand:SI 3 "aarch64_sve_gp_strictness")
(match_operand:SVE_FULL_HSF 2 "register_operand" "w")]
(match_operand:SVE_FULL_HSF 2 "register_operand" "0, w")]
SVE_COND_FCVT))]
"TARGET_SVE && <SVE_FULL_SDF:elem_bits> > <SVE_FULL_HSF:elem_bits>"
"fcvt\t%0.<SVE_FULL_SDF:Vetype>, %1/m, %2.<SVE_FULL_HSF:Vetype>"
"@
fcvt\t%0.<SVE_FULL_SDF:Vetype>, %1/m, %2.<SVE_FULL_HSF:Vetype>
movprfx\t%0, %2\;fcvt\t%0.<SVE_FULL_SDF:Vetype>, %1/m, %2.<SVE_FULL_HSF:Vetype>"
[(set_attr "movprfx" "*,yes")]
)
;; Predicated float-to-float extension with merging.

gcc/config/aarch64/aarch64-sve2.md

@ -1893,10 +1893,10 @@
(unspec:SVE_FULL_SDF
[(match_operand:<VPRED> 1 "register_operand" "Upl")
(match_operand:SI 3 "aarch64_sve_gp_strictness")
(match_operand:<VNARROW> 2 "register_operand" "w")]
(match_operand:<VNARROW> 2 "register_operand" "0")]
SVE2_COND_FP_UNARY_LONG))]
"TARGET_SVE2"
"<sve_fp_op>\t%0.<Vetype>, %1/m, %2.<Ventype>"
"<sve_fp_op>\t%0.<Vetype>, %1/m, %0.<Ventype>"
)
;; Predicated convert long top with merging.
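
(Note on the FCVTLT pattern above: the SVE2 unary long instructions do
not support MOVPRFX, so operand 2 is instead tied unconditionally to
operand 0 and the output template now reads the input from %0.  For an
untied call the input therefore has to be copied into the destination
with an ordinary vector MOV first; a plausible sequence for
cvtlt_f32_f16_x_untied, assuming the result in z0 and the input in z1,
would be:

        mov     z0.d, z1.d
        fcvtlt  z0.s, p0/m, z0.h

which is why the cvtlt tests in the ChangeLog expect a MOV rather than
a MOVPRFX.)
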
@ -1978,14 +1978,17 @@
;; Predicated FCVTX (equivalent to what would be FCVTXNB, except that
;; it supports MOVPRFX).
(define_insn "@aarch64_pred_<sve_fp_op><mode>"
[(set (match_operand:VNx4SF_ONLY 0 "register_operand" "=w")
[(set (match_operand:VNx4SF_ONLY 0 "register_operand" "=w, ?&w")
(unspec:VNx4SF_ONLY
[(match_operand:<VWIDE_PRED> 1 "register_operand" "Upl")
[(match_operand:<VWIDE_PRED> 1 "register_operand" "Upl, Upl")
(match_operand:SI 3 "aarch64_sve_gp_strictness")
(match_operand:<VWIDE> 2 "register_operand" "w")]
(match_operand:<VWIDE> 2 "register_operand" "0, w")]
SVE2_COND_FP_UNARY_NARROWB))]
"TARGET_SVE2"
"<sve_fp_op>\t%0.<Vetype>, %1/m, %2.<Vewtype>"
"@
<sve_fp_op>\t%0.<Vetype>, %1/m, %2.<Vewtype>
movprfx\t%0, %2\;<sve_fp_op>\t%0.<Vetype>, %1/m, %2.<Vewtype>"
[(set_attr "movprfx" "*,yes")]
)
;; Predicated FCVTX with merging.
@ -2076,15 +2079,18 @@
;; Predicated integer unary operations.
(define_insn "@aarch64_pred_<sve_int_op><mode>"
[(set (match_operand:VNx4SI_ONLY 0 "register_operand" "=w")
[(set (match_operand:VNx4SI_ONLY 0 "register_operand" "=w, ?&w")
(unspec:VNx4SI_ONLY
[(match_operand:<VPRED> 1 "register_operand" "Upl")
[(match_operand:<VPRED> 1 "register_operand" "Upl, Upl")
(unspec:VNx4SI_ONLY
[(match_operand:VNx4SI_ONLY 2 "register_operand" "w")]
[(match_operand:VNx4SI_ONLY 2 "register_operand" "0, w")]
SVE2_U32_UNARY)]
UNSPEC_PRED_X))]
"TARGET_SVE2"
"<sve_int_op>\t%0.<Vetype>, %1/m, %2.<Vetype>"
"@
<sve_int_op>\t%0.<Vetype>, %1/m, %2.<Vetype>
movprfx\t%0, %2\;<sve_int_op>\t%0.<Vetype>, %1/m, %2.<Vetype>"
[(set_attr "movprfx" "*,yes")]
)
;; Predicated integer unary operations with merging.
@ -2139,14 +2145,17 @@
;; Predicated FLOGB.
(define_insn "@aarch64_pred_<sve_fp_op><mode>"
[(set (match_operand:<V_INT_EQUIV> 0 "register_operand" "=w")
[(set (match_operand:<V_INT_EQUIV> 0 "register_operand" "=w, ?&w")
(unspec:<V_INT_EQUIV>
[(match_operand:<VPRED> 1 "register_operand" "Upl")
[(match_operand:<VPRED> 1 "register_operand" "Upl, Upl")
(match_operand:SI 3 "aarch64_sve_gp_strictness")
(match_operand:SVE_FULL_F 2 "register_operand" "w")]
(match_operand:SVE_FULL_F 2 "register_operand" "0, w")]
SVE2_COND_INT_UNARY_FP))]
"TARGET_SVE2"
"<sve_fp_op>\t%0.<Vetype>, %1/m, %2.<Vetype>"
"@
<sve_fp_op>\t%0.<Vetype>, %1/m, %2.<Vetype>
movprfx\t%0, %2\;<sve_fp_op>\t%0.<Vetype>, %1/m, %2.<Vetype>"
[(set_attr "movprfx" "*,yes")]
)
;; Predicated FLOGB with merging.

gcc/config/aarch64/aarch64.c

@ -5390,9 +5390,35 @@ bool
aarch64_maybe_expand_sve_subreg_move (rtx dest, rtx src)
{
gcc_assert (BYTES_BIG_ENDIAN);
if (SUBREG_P (dest))
/* Do not try to optimize subregs that LRA has created for matched
reloads. These subregs only exist as a temporary measure to make
the RTL well-formed, but they are exempt from the usual
TARGET_CAN_CHANGE_MODE_CLASS rules.
For example, if we have:
(set (reg:VNx8HI R1) (foo:VNx8HI (reg:VNx4SI R2)))
and the constraints require R1 and R2 to be in the same register,
LRA may need to create RTL such as:
(set (subreg:VNx4SI (reg:VNx8HI TMP) 0) (reg:VNx4SI R2))
(set (reg:VNx8HI TMP) (foo:VNx8HI (subreg:VNx4SI (reg:VNx8HI TMP) 0)))
(set (reg:VNx8HI R1) (reg:VNx8HI TMP))
which forces both the input and output of the original instruction
to use the same hard register. But for this to work, the normal
rules have to be suppressed on the subreg input, otherwise LRA
would need to reload that input too, meaning that the process
would never terminate. To compensate for this, the normal rules
are also suppressed for the subreg output of the first move.
Ignoring the special case and handling the first move normally
would therefore generate wrong code: we would reverse the elements
for the first subreg but not reverse them back for the second subreg. */
if (SUBREG_P (dest) && !LRA_SUBREG_P (dest))
dest = SUBREG_REG (dest);
if (SUBREG_P (src))
if (SUBREG_P (src) && !LRA_SUBREG_P (src))
src = SUBREG_REG (src);
/* The optimization handles two single SVE REGs with different element

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (abs_f16_x_tied1, svfloat16_t,
/*
** abs_f16_x_untied:
** movprfx z0, z1
** fabs z0\.h, p0/m, z1\.h
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (abs_f32_x_tied1, svfloat32_t,
/*
** abs_f32_x_untied:
** movprfx z0, z1
** fabs z0\.s, p0/m, z1\.s
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (abs_f64_x_tied1, svfloat64_t,
/*
** abs_f64_x_untied:
** movprfx z0, z1
** fabs z0\.d, p0/m, z1\.d
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (abs_s16_x_tied1, svint16_t,
/*
** abs_s16_x_untied:
** movprfx z0, z1
** abs z0\.h, p0/m, z1\.h
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (abs_s32_x_tied1, svint32_t,
/*
** abs_s32_x_untied:
** movprfx z0, z1
** abs z0\.s, p0/m, z1\.s
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (abs_s64_x_tied1, svint64_t,
/*
** abs_s64_x_untied:
** movprfx z0, z1
** abs z0\.d, p0/m, z1\.d
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (abs_s8_x_tied1, svint8_t,
/*
** abs_s8_x_untied:
** movprfx z0, z1
** abs z0\.b, p0/m, z1\.b
** ret
*/

@ -33,6 +33,7 @@ TEST_DUAL_Z (cls_s16_z, svuint16_t, svint16_t,
/*
** cls_s16_x:
** movprfx z0, z4
** cls z0\.h, p0/m, z4\.h
** ret
*/

@ -33,6 +33,7 @@ TEST_DUAL_Z (cls_s32_z, svuint32_t, svint32_t,
/*
** cls_s32_x:
** movprfx z0, z4
** cls z0\.s, p0/m, z4\.s
** ret
*/

@ -33,6 +33,7 @@ TEST_DUAL_Z (cls_s64_z, svuint64_t, svint64_t,
/*
** cls_s64_x:
** movprfx z0, z4
** cls z0\.d, p0/m, z4\.d
** ret
*/

@ -33,6 +33,7 @@ TEST_DUAL_Z (cls_s8_z, svuint8_t, svint8_t,
/*
** cls_s8_x:
** movprfx z0, z4
** cls z0\.b, p0/m, z4\.b
** ret
*/

@ -33,6 +33,7 @@ TEST_DUAL_Z (clz_s16_z, svuint16_t, svint16_t,
/*
** clz_s16_x:
** movprfx z0, z4
** clz z0\.h, p0/m, z4\.h
** ret
*/

@ -33,6 +33,7 @@ TEST_DUAL_Z (clz_s32_z, svuint32_t, svint32_t,
/*
** clz_s32_x:
** movprfx z0, z4
** clz z0\.s, p0/m, z4\.s
** ret
*/

@ -33,6 +33,7 @@ TEST_DUAL_Z (clz_s64_z, svuint64_t, svint64_t,
/*
** clz_s64_x:
** movprfx z0, z4
** clz z0\.d, p0/m, z4\.d
** ret
*/

@ -33,6 +33,7 @@ TEST_DUAL_Z (clz_s8_z, svuint8_t, svint8_t,
/*
** clz_s8_x:
** movprfx z0, z4
** clz z0\.b, p0/m, z4\.b
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (clz_u16_x_tied1, svuint16_t,
/*
** clz_u16_x_untied:
** movprfx z0, z1
** clz z0\.h, p0/m, z1\.h
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (clz_u32_x_tied1, svuint32_t,
/*
** clz_u32_x_untied:
** movprfx z0, z1
** clz z0\.s, p0/m, z1\.s
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (clz_u64_x_tied1, svuint64_t,
/*
** clz_u64_x_untied:
** movprfx z0, z1
** clz z0\.d, p0/m, z1\.d
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (clz_u8_x_tied1, svuint8_t,
/*
** clz_u8_x_untied:
** movprfx z0, z1
** clz z0\.b, p0/m, z1\.b
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (cnot_s16_x_tied1, svint16_t,
/*
** cnot_s16_x_untied:
** movprfx z0, z1
** cnot z0\.h, p0/m, z1\.h
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (cnot_s32_x_tied1, svint32_t,
/*
** cnot_s32_x_untied:
** movprfx z0, z1
** cnot z0\.s, p0/m, z1\.s
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (cnot_s64_x_tied1, svint64_t,
/*
** cnot_s64_x_untied:
** movprfx z0, z1
** cnot z0\.d, p0/m, z1\.d
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (cnot_s8_x_tied1, svint8_t,
/*
** cnot_s8_x_untied:
** movprfx z0, z1
** cnot z0\.b, p0/m, z1\.b
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (cnot_u16_x_tied1, svuint16_t,
/*
** cnot_u16_x_untied:
** movprfx z0, z1
** cnot z0\.h, p0/m, z1\.h
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (cnot_u32_x_tied1, svuint32_t,
/*
** cnot_u32_x_untied:
** movprfx z0, z1
** cnot z0\.s, p0/m, z1\.s
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (cnot_u64_x_tied1, svuint64_t,
/*
** cnot_u64_x_untied:
** movprfx z0, z1
** cnot z0\.d, p0/m, z1\.d
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (cnot_u8_x_tied1, svuint8_t,
/*
** cnot_u8_x_untied:
** movprfx z0, z1
** cnot z0\.b, p0/m, z1\.b
** ret
*/

@ -33,6 +33,7 @@ TEST_DUAL_Z (cnt_bf16_z, svuint16_t, svbfloat16_t,
/*
** cnt_bf16_x:
** movprfx z0, z4
** cnt z0\.h, p0/m, z4\.h
** ret
*/

@ -33,6 +33,7 @@ TEST_DUAL_Z (cnt_f16_z, svuint16_t, svfloat16_t,
/*
** cnt_f16_x:
** movprfx z0, z4
** cnt z0\.h, p0/m, z4\.h
** ret
*/

@ -33,6 +33,7 @@ TEST_DUAL_Z (cnt_f32_z, svuint32_t, svfloat32_t,
/*
** cnt_f32_x:
** movprfx z0, z4
** cnt z0\.s, p0/m, z4\.s
** ret
*/

@ -33,6 +33,7 @@ TEST_DUAL_Z (cnt_f64_z, svuint64_t, svfloat64_t,
/*
** cnt_f64_x:
** movprfx z0, z4
** cnt z0\.d, p0/m, z4\.d
** ret
*/

@ -33,6 +33,7 @@ TEST_DUAL_Z (cnt_s16_z, svuint16_t, svint16_t,
/*
** cnt_s16_x:
** movprfx z0, z4
** cnt z0\.h, p0/m, z4\.h
** ret
*/

@ -33,6 +33,7 @@ TEST_DUAL_Z (cnt_s32_z, svuint32_t, svint32_t,
/*
** cnt_s32_x:
** movprfx z0, z4
** cnt z0\.s, p0/m, z4\.s
** ret
*/

@ -33,6 +33,7 @@ TEST_DUAL_Z (cnt_s64_z, svuint64_t, svint64_t,
/*
** cnt_s64_x:
** movprfx z0, z4
** cnt z0\.d, p0/m, z4\.d
** ret
*/

@ -33,6 +33,7 @@ TEST_DUAL_Z (cnt_s8_z, svuint8_t, svint8_t,
/*
** cnt_s8_x:
** movprfx z0, z4
** cnt z0\.b, p0/m, z4\.b
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (cnt_u16_x_tied1, svuint16_t,
/*
** cnt_u16_x_untied:
** movprfx z0, z1
** cnt z0\.h, p0/m, z1\.h
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (cnt_u32_x_tied1, svuint32_t,
/*
** cnt_u32_x_untied:
** movprfx z0, z1
** cnt z0\.s, p0/m, z1\.s
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (cnt_u64_x_tied1, svuint64_t,
/*
** cnt_u64_x_untied:
** movprfx z0, z1
** cnt z0\.d, p0/m, z1\.d
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (cnt_u8_x_tied1, svuint8_t,
/*
** cnt_u8_x_untied:
** movprfx z0, z1
** cnt z0\.b, p0/m, z1\.b
** ret
*/

@ -66,6 +66,7 @@ TEST_DUAL_Z_REV (cvt_bf16_f32_x_tied1, svbfloat16_t, svfloat32_t,
/*
** cvt_bf16_f32_x_untied:
** movprfx z0, z4
** bfcvt z0\.h, p0/m, z4\.s
** ret
*/

@ -421,6 +421,7 @@ TEST_DUAL_Z_REV (cvt_f16_f32_x_tied1, svfloat16_t, svfloat32_t,
/*
** cvt_f16_f32_x_untied:
** movprfx z0, z4
** fcvt z0\.h, p0/m, z4\.s
** ret
*/
@ -439,6 +440,7 @@ TEST_DUAL_Z_REV (cvt_f16_f64_x_tied1, svfloat16_t, svfloat64_t,
/*
** cvt_f16_f64_x_untied:
** movprfx z0, z4
** fcvt z0\.h, p0/m, z4\.d
** ret
*/
@ -457,6 +459,7 @@ TEST_DUAL_Z_REV (cvt_f16_s16_x_tied1, svfloat16_t, svint16_t,
/*
** cvt_f16_s16_x_untied:
** movprfx z0, z4
** scvtf z0\.h, p0/m, z4\.h
** ret
*/
@ -475,6 +478,7 @@ TEST_DUAL_Z_REV (cvt_f16_s32_x_tied1, svfloat16_t, svint32_t,
/*
** cvt_f16_s32_x_untied:
** movprfx z0, z4
** scvtf z0\.h, p0/m, z4\.s
** ret
*/
@ -493,6 +497,7 @@ TEST_DUAL_Z_REV (cvt_f16_s64_x_tied1, svfloat16_t, svint64_t,
/*
** cvt_f16_s64_x_untied:
** movprfx z0, z4
** scvtf z0\.h, p0/m, z4\.d
** ret
*/
@ -511,6 +516,7 @@ TEST_DUAL_Z_REV (cvt_f16_u16_x_tied1, svfloat16_t, svuint16_t,
/*
** cvt_f16_u16_x_untied:
** movprfx z0, z4
** ucvtf z0\.h, p0/m, z4\.h
** ret
*/
@ -529,6 +535,7 @@ TEST_DUAL_Z_REV (cvt_f16_u32_x_tied1, svfloat16_t, svuint32_t,
/*
** cvt_f16_u32_x_untied:
** movprfx z0, z4
** ucvtf z0\.h, p0/m, z4\.s
** ret
*/
@ -547,6 +554,7 @@ TEST_DUAL_Z_REV (cvt_f16_u64_x_tied1, svfloat16_t, svuint64_t,
/*
** cvt_f16_u64_x_untied:
** movprfx z0, z4
** ucvtf z0\.h, p0/m, z4\.d
** ret
*/

@ -319,6 +319,7 @@ TEST_DUAL_Z_REV (cvt_f32_f16_x_tied1, svfloat32_t, svfloat16_t,
/*
** cvt_f32_f16_x_untied:
** movprfx z0, z4
** fcvt z0\.s, p0/m, z4\.h
** ret
*/
@ -337,6 +338,7 @@ TEST_DUAL_Z_REV (cvt_f32_f64_x_tied1, svfloat32_t, svfloat64_t,
/*
** cvt_f32_f64_x_untied:
** movprfx z0, z4
** fcvt z0\.s, p0/m, z4\.d
** ret
*/
@ -355,6 +357,7 @@ TEST_DUAL_Z_REV (cvt_f32_s32_x_tied1, svfloat32_t, svint32_t,
/*
** cvt_f32_s32_x_untied:
** movprfx z0, z4
** scvtf z0\.s, p0/m, z4\.s
** ret
*/
@ -373,6 +376,7 @@ TEST_DUAL_Z_REV (cvt_f32_s64_x_tied1, svfloat32_t, svint64_t,
/*
** cvt_f32_s64_x_untied:
** movprfx z0, z4
** scvtf z0\.s, p0/m, z4\.d
** ret
*/
@ -391,6 +395,7 @@ TEST_DUAL_Z_REV (cvt_f32_u32_x_tied1, svfloat32_t, svuint32_t,
/*
** cvt_f32_u32_x_untied:
** movprfx z0, z4
** ucvtf z0\.s, p0/m, z4\.s
** ret
*/
@ -409,6 +414,7 @@ TEST_DUAL_Z_REV (cvt_f32_u64_x_tied1, svfloat32_t, svuint64_t,
/*
** cvt_f32_u64_x_untied:
** movprfx z0, z4
** ucvtf z0\.s, p0/m, z4\.d
** ret
*/

@ -319,6 +319,7 @@ TEST_DUAL_Z_REV (cvt_f64_f16_x_tied1, svfloat64_t, svfloat16_t,
/*
** cvt_f64_f16_x_untied:
** movprfx z0, z4
** fcvt z0\.d, p0/m, z4\.h
** ret
*/
@ -337,6 +338,7 @@ TEST_DUAL_Z_REV (cvt_f64_f32_x_tied1, svfloat64_t, svfloat32_t,
/*
** cvt_f64_f32_x_untied:
** movprfx z0, z4
** fcvt z0\.d, p0/m, z4\.s
** ret
*/
@ -355,6 +357,7 @@ TEST_DUAL_Z_REV (cvt_f64_s32_x_tied1, svfloat64_t, svint32_t,
/*
** cvt_f64_s32_x_untied:
** movprfx z0, z4
** scvtf z0\.d, p0/m, z4\.s
** ret
*/
@ -373,6 +376,7 @@ TEST_DUAL_Z_REV (cvt_f64_s64_x_tied1, svfloat64_t, svint64_t,
/*
** cvt_f64_s64_x_untied:
** movprfx z0, z4
** scvtf z0\.d, p0/m, z4\.d
** ret
*/
@ -391,6 +395,7 @@ TEST_DUAL_Z_REV (cvt_f64_u32_x_tied1, svfloat64_t, svuint32_t,
/*
** cvt_f64_u32_x_untied:
** movprfx z0, z4
** ucvtf z0\.d, p0/m, z4\.s
** ret
*/
@ -409,6 +414,7 @@ TEST_DUAL_Z_REV (cvt_f64_u64_x_tied1, svfloat64_t, svuint64_t,
/*
** cvt_f64_u64_x_untied:
** movprfx z0, z4
** ucvtf z0\.d, p0/m, z4\.d
** ret
*/

@ -64,6 +64,7 @@ TEST_DUAL_Z_REV (cvt_s16_f16_x_tied1, svint16_t, svfloat16_t,
/*
** cvt_s16_f16_x_untied:
** movprfx z0, z4
** fcvtzs z0\.h, p0/m, z4\.h
** ret
*/

@ -166,6 +166,7 @@ TEST_DUAL_Z_REV (cvt_s32_f16_x_tied1, svint32_t, svfloat16_t,
/*
** cvt_s32_f16_x_untied:
** movprfx z0, z4
** fcvtzs z0\.s, p0/m, z4\.h
** ret
*/
@ -184,6 +185,7 @@ TEST_DUAL_Z_REV (cvt_s32_f32_x_tied1, svint32_t, svfloat32_t,
/*
** cvt_s32_f32_x_untied:
** movprfx z0, z4
** fcvtzs z0\.s, p0/m, z4\.s
** ret
*/
@ -202,6 +204,7 @@ TEST_DUAL_Z_REV (cvt_s32_f64_x_tied1, svint32_t, svfloat64_t,
/*
** cvt_s32_f64_x_untied:
** movprfx z0, z4
** fcvtzs z0\.s, p0/m, z4\.d
** ret
*/

@ -166,6 +166,7 @@ TEST_DUAL_Z_REV (cvt_s64_f16_x_tied1, svint64_t, svfloat16_t,
/*
** cvt_s64_f16_x_untied:
** movprfx z0, z4
** fcvtzs z0\.d, p0/m, z4\.h
** ret
*/
@ -184,6 +185,7 @@ TEST_DUAL_Z_REV (cvt_s64_f32_x_tied1, svint64_t, svfloat32_t,
/*
** cvt_s64_f32_x_untied:
** movprfx z0, z4
** fcvtzs z0\.d, p0/m, z4\.s
** ret
*/
@ -202,6 +204,7 @@ TEST_DUAL_Z_REV (cvt_s64_f64_x_tied1, svint64_t, svfloat64_t,
/*
** cvt_s64_f64_x_untied:
** movprfx z0, z4
** fcvtzs z0\.d, p0/m, z4\.d
** ret
*/

@ -64,6 +64,7 @@ TEST_DUAL_Z_REV (cvt_u16_f16_x_tied1, svuint16_t, svfloat16_t,
/*
** cvt_u16_f16_x_untied:
** movprfx z0, z4
** fcvtzu z0\.h, p0/m, z4\.h
** ret
*/

@ -166,6 +166,7 @@ TEST_DUAL_Z_REV (cvt_u32_f16_x_tied1, svuint32_t, svfloat16_t,
/*
** cvt_u32_f16_x_untied:
** movprfx z0, z4
** fcvtzu z0\.s, p0/m, z4\.h
** ret
*/
@ -184,6 +185,7 @@ TEST_DUAL_Z_REV (cvt_u32_f32_x_tied1, svuint32_t, svfloat32_t,
/*
** cvt_u32_f32_x_untied:
** movprfx z0, z4
** fcvtzu z0\.s, p0/m, z4\.s
** ret
*/
@ -202,6 +204,7 @@ TEST_DUAL_Z_REV (cvt_u32_f64_x_tied1, svuint32_t, svfloat64_t,
/*
** cvt_u32_f64_x_untied:
** movprfx z0, z4
** fcvtzu z0\.s, p0/m, z4\.d
** ret
*/

@ -166,6 +166,7 @@ TEST_DUAL_Z_REV (cvt_u64_f16_x_tied1, svuint64_t, svfloat16_t,
/*
** cvt_u64_f16_x_untied:
** movprfx z0, z4
** fcvtzu z0\.d, p0/m, z4\.h
** ret
*/
@ -184,6 +185,7 @@ TEST_DUAL_Z_REV (cvt_u64_f32_x_tied1, svuint64_t, svfloat32_t,
/*
** cvt_u64_f32_x_untied:
** movprfx z0, z4
** fcvtzu z0\.d, p0/m, z4\.s
** ret
*/
@ -202,6 +204,7 @@ TEST_DUAL_Z_REV (cvt_u64_f64_x_tied1, svuint64_t, svfloat64_t,
/*
** cvt_u64_f64_x_untied:
** movprfx z0, z4
** fcvtzu z0\.d, p0/m, z4\.d
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (extb_s16_x_tied1, svint16_t,
/*
** extb_s16_x_untied:
** movprfx z0, z1
** sxtb z0\.h, p0/m, z1\.h
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (extb_s32_x_tied1, svint32_t,
/*
** extb_s32_x_untied:
** movprfx z0, z1
** sxtb z0\.s, p0/m, z1\.s
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (extb_s64_x_tied1, svint64_t,
/*
** extb_s64_x_untied:
** movprfx z0, z1
** sxtb z0\.d, p0/m, z1\.d
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (exth_s32_x_tied1, svint32_t,
/*
** exth_s32_x_untied:
** movprfx z0, z1
** sxth z0\.s, p0/m, z1\.s
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (exth_s64_x_tied1, svint64_t,
/*
** exth_s64_x_untied:
** movprfx z0, z1
** sxth z0\.d, p0/m, z1\.d
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (extw_s64_x_tied1, svint64_t,
/*
** extw_s64_x_untied:
** movprfx z0, z1
** sxtw z0\.d, p0/m, z1\.d
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (neg_f16_x_tied1, svfloat16_t,
/*
** neg_f16_x_untied:
** movprfx z0, z1
** fneg z0\.h, p0/m, z1\.h
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (neg_f32_x_tied1, svfloat32_t,
/*
** neg_f32_x_untied:
** movprfx z0, z1
** fneg z0\.s, p0/m, z1\.s
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (neg_f64_x_tied1, svfloat64_t,
/*
** neg_f64_x_untied:
** movprfx z0, z1
** fneg z0\.d, p0/m, z1\.d
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (neg_s16_x_tied1, svint16_t,
/*
** neg_s16_x_untied:
** movprfx z0, z1
** neg z0\.h, p0/m, z1\.h
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (neg_s32_x_tied1, svint32_t,
/*
** neg_s32_x_untied:
** movprfx z0, z1
** neg z0\.s, p0/m, z1\.s
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (neg_s64_x_tied1, svint64_t,
/*
** neg_s64_x_untied:
** movprfx z0, z1
** neg z0\.d, p0/m, z1\.d
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (neg_s8_x_tied1, svint8_t,
/*
** neg_s8_x_untied:
** movprfx z0, z1
** neg z0\.b, p0/m, z1\.b
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (not_s16_x_tied1, svint16_t,
/*
** not_s16_x_untied:
** movprfx z0, z1
** not z0\.h, p0/m, z1\.h
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (not_s32_x_tied1, svint32_t,
/*
** not_s32_x_untied:
** movprfx z0, z1
** not z0\.s, p0/m, z1\.s
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (not_s64_x_tied1, svint64_t,
/*
** not_s64_x_untied:
** movprfx z0, z1
** not z0\.d, p0/m, z1\.d
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (not_s8_x_tied1, svint8_t,
/*
** not_s8_x_untied:
** movprfx z0, z1
** not z0\.b, p0/m, z1\.b
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (not_u16_x_tied1, svuint16_t,
/*
** not_u16_x_untied:
** movprfx z0, z1
** not z0\.h, p0/m, z1\.h
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (not_u32_x_tied1, svuint32_t,
/*
** not_u32_x_untied:
** movprfx z0, z1
** not z0\.s, p0/m, z1\.s
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (not_u64_x_tied1, svuint64_t,
/*
** not_u64_x_untied:
** movprfx z0, z1
** not z0\.d, p0/m, z1\.d
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (not_u8_x_tied1, svuint8_t,
/*
** not_u8_x_untied:
** movprfx z0, z1
** not z0\.b, p0/m, z1\.b
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (rbit_s16_x_tied1, svint16_t,
/*
** rbit_s16_x_untied:
** movprfx z0, z1
** rbit z0\.h, p0/m, z1\.h
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (rbit_s32_x_tied1, svint32_t,
/*
** rbit_s32_x_untied:
** movprfx z0, z1
** rbit z0\.s, p0/m, z1\.s
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (rbit_s64_x_tied1, svint64_t,
/*
** rbit_s64_x_untied:
** movprfx z0, z1
** rbit z0\.d, p0/m, z1\.d
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (rbit_s8_x_tied1, svint8_t,
/*
** rbit_s8_x_untied:
** movprfx z0, z1
** rbit z0\.b, p0/m, z1\.b
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (rbit_u16_x_tied1, svuint16_t,
/*
** rbit_u16_x_untied:
** movprfx z0, z1
** rbit z0\.h, p0/m, z1\.h
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (rbit_u32_x_tied1, svuint32_t,
/*
** rbit_u32_x_untied:
** movprfx z0, z1
** rbit z0\.s, p0/m, z1\.s
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (rbit_u64_x_tied1, svuint64_t,
/*
** rbit_u64_x_untied:
** movprfx z0, z1
** rbit z0\.d, p0/m, z1\.d
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (rbit_u8_x_tied1, svuint8_t,
/*
** rbit_u8_x_untied:
** movprfx z0, z1
** rbit z0\.b, p0/m, z1\.b
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (recpx_f16_x_tied1, svfloat16_t,
/*
** recpx_f16_x_untied:
** movprfx z0, z1
** frecpx z0\.h, p0/m, z1\.h
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (recpx_f32_x_tied1, svfloat32_t,
/*
** recpx_f32_x_untied:
** movprfx z0, z1
** frecpx z0\.s, p0/m, z1\.s
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (recpx_f64_x_tied1, svfloat64_t,
/*
** recpx_f64_x_untied:
** movprfx z0, z1
** frecpx z0\.d, p0/m, z1\.d
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (revb_s16_x_tied1, svint16_t,
/*
** revb_s16_x_untied:
** movprfx z0, z1
** revb z0\.h, p0/m, z1\.h
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (revb_s32_x_tied1, svint32_t,
/*
** revb_s32_x_untied:
** movprfx z0, z1
** revb z0\.s, p0/m, z1\.s
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (revb_s64_x_tied1, svint64_t,
/*
** revb_s64_x_untied:
** movprfx z0, z1
** revb z0\.d, p0/m, z1\.d
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (revb_u16_x_tied1, svuint16_t,
/*
** revb_u16_x_untied:
** movprfx z0, z1
** revb z0\.h, p0/m, z1\.h
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (revb_u32_x_tied1, svuint32_t,
/*
** revb_u32_x_untied:
** movprfx z0, z1
** revb z0\.s, p0/m, z1\.s
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (revb_u64_x_tied1, svuint64_t,
/*
** revb_u64_x_untied:
** movprfx z0, z1
** revb z0\.d, p0/m, z1\.d
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (revh_s32_x_tied1, svint32_t,
/*
** revh_s32_x_untied:
** movprfx z0, z1
** revh z0\.s, p0/m, z1\.s
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (revh_s64_x_tied1, svint64_t,
/*
** revh_s64_x_untied:
** movprfx z0, z1
** revh z0\.d, p0/m, z1\.d
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (revh_u32_x_tied1, svuint32_t,
/*
** revh_u32_x_untied:
** movprfx z0, z1
** revh z0\.s, p0/m, z1\.s
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (revh_u64_x_tied1, svuint64_t,
/*
** revh_u64_x_untied:
** movprfx z0, z1
** revh z0\.d, p0/m, z1\.d
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (revw_s64_x_tied1, svint64_t,
/*
** revw_s64_x_untied:
** movprfx z0, z1
** revw z0\.d, p0/m, z1\.d
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (revw_u64_x_tied1, svuint64_t,
/*
** revw_u64_x_untied:
** movprfx z0, z1
** revw z0\.d, p0/m, z1\.d
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (rinta_f16_x_tied1, svfloat16_t,
/*
** rinta_f16_x_untied:
** movprfx z0, z1
** frinta z0\.h, p0/m, z1\.h
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (rinta_f32_x_tied1, svfloat32_t,
/*
** rinta_f32_x_untied:
** movprfx z0, z1
** frinta z0\.s, p0/m, z1\.s
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (rinta_f64_x_tied1, svfloat64_t,
/*
** rinta_f64_x_untied:
** movprfx z0, z1
** frinta z0\.d, p0/m, z1\.d
** ret
*/

@ -73,6 +73,7 @@ TEST_UNIFORM_Z (rinti_f16_x_tied1, svfloat16_t,
/*
** rinti_f16_x_untied:
** movprfx z0, z1
** frinti z0\.h, p0/m, z1\.h
** ret
*/

Some files were not shown because too many files have changed in this diff.