[ARM][GCC][2/x]: MVE ACLE intrinsics framework patch.

This patch is part of the MVE ACLE intrinsics framework.
It adds support for reading and writing the APSR (Application Program
Status Register) and the FPSCR (Floating-point Status and Control Register)
for MVE, and it also enables the Thumb-2 move RTL patterns for MVE.

A new feature bit, vfp_base, is added.  It is enabled for all VFP extensions
as well as for MVE and MVE with floating point, and it drives the new macro
TARGET_VFP_BASE.  Previously, all VFP instructions, RTL patterns, and status
and control registers were guarded by TARGET_HARD_FLOAT.  This patch changes
that: the instructions, RTL patterns, and status and control registers common
to MVE and VFP are now guarded by the TARGET_VFP_BASE macro instead.
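
Concretely, the new guard, as defined in the config/arm/arm.h hunk below, is:

    #define TARGET_VFP_BASE (arm_float_abi != ARM_FLOAT_ABI_SOFT	\
			     && bitmap_bit_p (arm_active_target.isa,	\
					      isa_bit_vfp_base)		\
			     && !TARGET_GENERAL_REGS_ONLY)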

The RTL patterns set_fpscr and get_fpscr are updated to use VFPCC_REGNUM,
because a few MVE intrinsics set or get the carry bit of the FPSCR register.
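
As an illustration of why the carry bit must be modelled, an ACLE intrinsic
such as vadcq_u32 (implemented by later patches in this series; shown here
only as a sketch of the intended use) reads and writes FPSCR.C through these
patterns:

    #include "arm_mve.h"

    /* Vector add with carry: *carry supplies FPSCR.C on input and receives
       the carry out, so the compiler must model the FPSCR carry flag
       (VFPCC_REGNUM) as both read and written.  */
    uint32x4_t
    add_with_carry (uint32x4_t a, uint32x4_t b, unsigned *carry)
    {
      return vadcq_u32 (a, b, carry);
    }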

Please refer to Arm reference manual [1] for more details.
[1] https://developer.arm.com/docs/ddi0553/latest

2020-03-16  Andre Vieira  <andre.simoesdiasvieira@arm.com>
	    Mihail Ionescu  <mihail.ionescu@arm.com>
	    Srinath Parvathaneni  <srinath.parvathaneni@arm.com>

	* common/config/arm/arm-common.c (arm_asm_auto_mfpu): When the
	vfp_base feature bit is on and -mfpu=auto is passed as a compiler
	option, do not generate an error on failing to find a matching FPU,
	since no FPU is required in this case.
	* config/arm/arm-cpus.in (vfp_base): Define feature bit; it is
	enabled for MVE and for all VFP extensions.
	(VFPv2): Modify fgroup to enable the vfp_base feature bit whenever
	VFPv2 is enabled.
	(MVE): Define fgroup to enable the feature bits mve, vfp_base and
	armv7em.
	(MVE_FP): Define fgroup to enable the feature bits in fgroups MVE and
	FPv5 along with the mve_float feature bit.
	(mve): Modify the add options of the armv8.1-m.main arch for MVE.
	(mve.fp): Modify the add options of the armv8.1-m.main arch for MVE
	with floating point.
	* config/arm/arm.c (use_return_insn): Replace the "TARGET_HARD_FLOAT
	|| TARGET_HAVE_MVE" check with TARGET_VFP_BASE.
	(thumb2_legitimate_index_p): Replace TARGET_HARD_FLOAT with
	TARGET_VFP_BASE.
	(arm_rtx_costs_internal): Replace "TARGET_HARD_FLOAT || TARGET_HAVE_MVE"
	with TARGET_VFP_BASE, to allow cost calculations for copies in MVE as
	well.
	(arm_get_vfp_saved_size): Replace TARGET_HARD_FLOAT with
	TARGET_VFP_BASE, to allow space calculation for VFP registers in MVE
	as well.
	(arm_compute_frame_layout): Likewise.
	(arm_save_coproc_regs): Likewise.
	(arm_fixed_condition_code_regs): Modify to enable using VFPCC_REGNUM
	in MVE as well.
	(arm_hard_regno_mode_ok): Replace "TARGET_HARD_FLOAT || TARGET_HAVE_MVE"
	with equivalent macro TARGET_VFP_BASE.
	(arm_expand_epilogue_apcs_frame): Likewise.
	(arm_expand_epilogue): Likewise.
	(arm_conditional_register_usage): Likewise.
	(arm_declare_function_name): Add a check to skip printing the .fpu
	directive in the assembly file when TARGET_VFP_BASE is enabled and
	fpu_to_print is "softvfp".
	* config/arm/arm.h (TARGET_VFP_BASE): Define.
	* config/arm/arm.md (arch): Add "mve" to arch.
	(eq_attr "arch" "mve"): Enable on TARGET_HAVE_MVE is true.
	(vfp_pop_multiple_with_writeback): Replace "TARGET_HARD_FLOAT
	|| TARGET_HAVE_MVE" with equivalent macro TARGET_VFP_BASE.
	* config/arm/constraints.md (Uf): Define to allow modification to FPCCR
	in MVE.
	* config/arm/thumb2.md (thumb2_movsfcc_soft_insn): Modify the target
	guard so the pattern is not used for MVE.
	* config/arm/unspecs.md (UNSPEC_GET_FPSCR): Define in the non-volatile
	unspecs enum, replacing VUNSPEC_GET_FPSCR.
	(VUNSPEC_GET_FPSCR): Remove from the volatile unspecs enum.
	* config/arm/vfp.md (thumb2_movhi_vfp): Add support for the VMSR and
	VMRS instructions, which move between general-purpose registers and
	the floating-point special registers.
	(thumb2_movhi_fp16): Likewise.
	(thumb2_movsi_vfp): Add support for VMSR and VMRS instructions along
	with MCR and MRC instructions which set and get Floating-point Status
	and Control Register (FPSCR).
	(movdi_vfp): Modify pattern to enable DImode scalar moves in MVE.
	(thumb2_movsf_vfp): Modify pattern to enable single-precision scalar
	float moves in MVE.
	(thumb2_movdf_vfp): Modify pattern to enable double-precision scalar
	float moves in MVE.
	(thumb2_movsfcc_vfp): Modify pattern to enable single float conditional
	code move patterns of VFP also in MVE by adding TARGET_VFP_BASE check.
	(thumb2_movdfcc_vfp): Modify pattern to enable double float conditional
	code move patterns of VFP also in MVE by adding TARGET_VFP_BASE check.
	(push_multi_vfp): Add support to use VFP VPUSH pattern for MVE by adding
	TARGET_VFP_BASE check.
	(set_fpscr): Add support to set the FPSCR register for MVE.  Modify
	the pattern to use VFPCC_REGNUM, as a few MVE intrinsics use the
	carry bit of the FPSCR register.
	(get_fpscr): Add support to get the FPSCR register for MVE.  Modify
	the pattern to use VFPCC_REGNUM likewise.

2020-03-16  Srinath Parvathaneni  <srinath.parvathaneni@arm.com>

	* gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c: New test.
	* gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c: Likewise.
	* gcc.target/arm/mve/intrinsics/mve_fpu1.c: Likewise.
	* gcc.target/arm/mve/intrinsics/mve_fpu2.c: Likewise.
	* gcc.target/arm/mve/intrinsics/mve_fpu3.c: Likewise.

diff --git a/gcc/common/config/arm/arm-common.c b/gcc/common/config/arm/arm-common.c

@@ -1009,7 +1009,8 @@ arm_asm_auto_mfpu (int argc, const char **argv)
}
}
- gcc_assert (i != TARGET_FPU_auto);
+ gcc_assert (i != TARGET_FPU_auto
+ || bitmap_bit_p (target_isa, isa_bit_vfp_base));
}
auto_fpu = (char *) xmalloc (strlen (fpuname) + sizeof ("-mfpu="));

diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in

@@ -135,6 +135,10 @@ define feature armv8_1m_main
# Floating point and Neon extensions.
# VFPv1 is not supported in GCC.
+ # This feature bit is enabled for all VFP, MVE and
+ # MVE with floating point extensions.
+ define feature vfp_base
# Vector floating point v2.
define feature vfpv2
@@ -234,7 +238,7 @@ define fgroup ALL_SIMD ALL_SIMD_INTERNAL ALL_SIMD_EXTERNAL
# List of all FPU bits to strip out if -mfpu is used to override the
# default. fp16 is deliberately missing from this list.
- define fgroup ALL_FPU_INTERNAL vfpv2 vfpv3 vfpv4 fpv5 fp16conv fp_dbl ALL_SIMD_INTERNAL
+ define fgroup ALL_FPU_INTERNAL vfp_base vfpv2 vfpv3 vfpv4 fpv5 fp16conv fp_dbl ALL_SIMD_INTERNAL
# Similarly, but including fp16 and other extensions that aren't part of
# -mfpu support.
define fgroup ALL_FPU_EXTERNAL fp16 bf16
@@ -279,10 +283,12 @@ define fgroup ARMv8r ARMv8a
define fgroup ARMv8_1m_main ARMv8m_main armv8_1m_main
# Useful combinations.
- define fgroup VFPv2 vfpv2
+ define fgroup VFPv2 vfp_base vfpv2
define fgroup VFPv3 VFPv2 vfpv3
define fgroup VFPv4 VFPv3 vfpv4 fp16conv
define fgroup FPv5 VFPv4 fpv5
+ define fgroup MVE mve vfp_base armv7em
+ define fgroup MVE_FP MVE FPv5 fp16 mve_float
define fgroup FP_DBL fp_dbl
define fgroup FP_D32 FP_DBL fp_d32
@@ -699,8 +705,8 @@ begin arch armv8.1-m.main
option fp add FPv5 fp16
option fp.dp add FPv5 FP_DBL fp16
option nofp remove ALL_FP
- option mve add mve armv7em
- option mve.fp add mve FPv5 fp16 mve_float armv7em
+ option mve add MVE
+ option mve.fp add MVE_FP
end arch armv8.1-m.main
begin arch iwmmxt

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c

@@ -4295,7 +4295,7 @@ use_return_insn (int iscond, rtx sibling)
/* Can't be done if any of the VFP regs are pushed,
since this also requires an insn. */
- if (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
+ if (TARGET_VFP_BASE)
for (regno = FIRST_VFP_REGNUM; regno <= LAST_VFP_REGNUM; regno++)
if (df_regs_ever_live_p (regno) && !call_used_or_fixed_reg_p (regno))
return 0;
@@ -6289,7 +6289,7 @@ use_vfp_abi (enum arm_pcs pcs_variant, bool is_double)
return false;
return (TARGET_32BIT && TARGET_HARD_FLOAT &&
- (TARGET_VFP_DOUBLE || !is_double));
+ (TARGET_VFP_DOUBLE || !is_double));
}
/* Return true if an argument whose type is TYPE, or mode is MODE, is
@@ -8512,7 +8512,7 @@ thumb2_legitimate_index_p (machine_mode mode, rtx index, int strict_p)
/* ??? Combine arm and thumb2 coprocessor addressing modes. */
/* Standard coprocessor addressing modes. */
- if (TARGET_HARD_FLOAT
+ if (TARGET_VFP_BASE
&& (mode == SFmode || mode == DFmode))
return (code == CONST_INT && INTVAL (index) < 1024
/* Thumb-2 allows only > -256 index range for it's core register
@@ -9905,7 +9905,7 @@ arm_rtx_costs_internal (rtx x, enum rtx_code code, enum rtx_code outer_code,
/* Assume that most copies can be done with a single insn,
unless we don't have HW FP, in which case everything
larger than word mode will require two insns. */
- *cost = COSTS_N_INSNS (((!(TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
+ *cost = COSTS_N_INSNS (((!TARGET_VFP_BASE
&& GET_MODE_SIZE (mode) > 4)
|| mode == DImode)
? 2 : 1);
@@ -20821,7 +20821,7 @@ arm_get_vfp_saved_size (void)
saved = 0;
/* Space for saved VFP registers. */
- if (TARGET_HARD_FLOAT)
+ if (TARGET_VFP_BASE)
{
count = 0;
for (regno = FIRST_VFP_REGNUM;
@@ -22364,7 +22364,7 @@ arm_compute_frame_layout (void)
func_type = arm_current_func_type ();
/* Space for saved VFP registers. */
if (! IS_VOLATILE (func_type)
- && (TARGET_HARD_FLOAT || TARGET_HAVE_MVE))
+ && TARGET_VFP_BASE)
saved += arm_get_vfp_saved_size ();
/* Allocate space for saving/restoring FPCXTNS in Armv8.1-M Mainline
@@ -22588,7 +22588,7 @@ arm_save_coproc_regs(void)
saved_size += 8;
}
- if (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
+ if (TARGET_VFP_BASE)
{
start_reg = FIRST_VFP_REGNUM;
@@ -24546,7 +24546,7 @@ arm_fixed_condition_code_regs (unsigned int *p1, unsigned int *p2)
return false;
*p1 = CC_REGNUM;
- *p2 = TARGET_HARD_FLOAT ? VFPCC_REGNUM : INVALID_REGNUM;
+ *p2 = TARGET_VFP_BASE ? VFPCC_REGNUM : INVALID_REGNUM;
return true;
}
@@ -24965,7 +24965,7 @@ arm_hard_regno_mode_ok (unsigned int regno, machine_mode mode)
{
if (GET_MODE_CLASS (mode) == MODE_CC)
return (regno == CC_REGNUM
- || ((TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
+ || (TARGET_VFP_BASE
&& regno == VFPCC_REGNUM));
if (regno == CC_REGNUM && GET_MODE_CLASS (mode) != MODE_CC)
@@ -24982,7 +24982,7 @@ arm_hard_regno_mode_ok (unsigned int regno, machine_mode mode)
start of an even numbered register pair. */
return (ARM_NUM_REGS (mode) < 2) || (regno < LAST_LO_REGNUM);
- if ((TARGET_HARD_FLOAT || TARGET_HAVE_MVE) && IS_VFP_REGNUM (regno))
+ if (TARGET_VFP_BASE && IS_VFP_REGNUM (regno))
{
if (mode == DFmode)
return VFP_REGNO_OK_FOR_DOUBLE (regno);
@@ -26933,7 +26933,7 @@ arm_expand_epilogue_apcs_frame (bool really_return)
floats_from_frame += 4;
}
- if (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
+ if (TARGET_VFP_BASE)
{
int start_reg;
rtx ip_rtx = gen_rtx_REG (SImode, IP_REGNUM);
@@ -27179,7 +27179,7 @@ arm_expand_epilogue (bool really_return)
}
}
- if (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
+ if (TARGET_VFP_BASE)
{
/* Generate VFP register multi-pop. */
int end_reg = LAST_VFP_REGNUM + 1;
@@ -29699,7 +29699,7 @@ arm_conditional_register_usage (void)
if (TARGET_THUMB1)
fixed_regs[LR_REGNUM] = call_used_regs[LR_REGNUM] = 1;
- if (TARGET_32BIT && (TARGET_HARD_FLOAT || TARGET_HAVE_MVE))
+ if (TARGET_32BIT && TARGET_VFP_BASE)
{
/* VFPv3 registers are disabled when earlier VFP
versions are selected due to the definition of
@@ -32478,7 +32478,8 @@ arm_declare_function_name (FILE *stream, const char *name, tree decl)
= TARGET_SOFT_FLOAT
? "softvfp" : arm_identify_fpu_from_isa (arm_active_target.isa);
- if (fpu_to_print != arm_last_printed_arch_string)
+ if (!(!strcmp (fpu_to_print.c_str (), "softvfp") && TARGET_VFP_BASE)
+ && (fpu_to_print != arm_last_printed_arch_string))
{
asm_fprintf (asm_out_file, "\t.fpu %s\n", fpu_to_print.c_str ());
arm_last_printed_fpu_string = fpu_to_print;

diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h

@@ -334,6 +334,19 @@ emission of floating point pcs attributes. */
isa_bit_mve_float) \
&& !TARGET_GENERAL_REGS_ONLY)
+ /* MVE has a few instructions in common with VFP, like VLDM alias VPOP,
+    VLDR, VSTM alias VPUSH, VSTR and VMOV, VMSR and VMRS.  In the same
+    manner it updates a few registers such as FPCAR, FPCCR, FPDSCR, FPSCR,
+    MVFR0, MVFR1 and MVFR2.  All the VFP instructions, RTL patterns and
+    registers are guarded by TARGET_HARD_FLOAT.  But the instructions, RTL
+    patterns and registers common to MVE and VFP will be guarded by the
+    following macro TARGET_VFP_BASE hereafter.  */
+ #define TARGET_VFP_BASE (arm_float_abi != ARM_FLOAT_ABI_SOFT \
+			  && bitmap_bit_p (arm_active_target.isa, \
+					   isa_bit_vfp_base) \
+			  && !TARGET_GENERAL_REGS_ONLY)
/* Nonzero if integer division instructions supported. */
#define TARGET_IDIV ((TARGET_ARM && arm_arch_arm_hwdiv) \
|| (TARGET_THUMB && arm_arch_thumb_hwdiv))

diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md

@@ -134,7 +134,7 @@
; arm_arch6. "v6t2" for Thumb-2 with arm_arch6 and "v8mb" for ARMv8-M
; Baseline. This attribute is used to compute attribute "enabled",
; use type "any" to enable an alternative in all cases.
- (define_attr "arch" "any,a,t,32,t1,t2,v6,nov6,v6t2,v8mb,iwmmxt,iwmmxt2,armv6_or_vfpv3,neon"
+ (define_attr "arch" "any,a,t,32,t1,t2,v6,nov6,v6t2,v8mb,iwmmxt,iwmmxt2,armv6_or_vfpv3,neon,mve"
(const_string "any"))
(define_attr "arch_enabled" "no,yes"
@@ -188,6 +188,10 @@
(and (eq_attr "arch" "neon")
(match_test "TARGET_NEON"))
(const_string "yes")
+ (and (eq_attr "arch" "mve")
+ (match_test "TARGET_HAVE_MVE"))
+ (const_string "yes")
]
(const_string "no")))
@@ -11758,7 +11762,7 @@
(match_operand:SI 2 "const_int_I_operand" "I")))
(set (match_operand:DF 3 "vfp_hard_register_operand" "")
(mem:DF (match_dup 1)))])]
- "TARGET_32BIT && (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)"
+ "TARGET_32BIT && TARGET_VFP_BASE"
"*
{
int num_regs = XVECLEN (operands[0], 0);

diff --git a/gcc/config/arm/constraints.md b/gcc/config/arm/constraints.md

@@ -38,7 +38,7 @@
;; in all states: Pf, Pg
;; The following memory constraints have been used:
- ;; in ARM/Thumb-2 state: Uh, Ut, Uv, Uy, Un, Um, Us, Up
+ ;; in ARM/Thumb-2 state: Uh, Ut, Uv, Uy, Un, Um, Us, Up, Uf
;; in ARM state: Uq
;; in Thumb state: Uu, Uw
;; in all states: Q
@@ -46,6 +46,9 @@
(define_register_constraint "Up" "TARGET_HAVE_MVE ? VPR_REG : NO_REGS"
"MVE VPR register")
+ (define_register_constraint "Uf" "TARGET_HAVE_MVE ? VFPCC_REG : NO_REGS"
+ "MVE FPCCR register")
(define_register_constraint "t" "TARGET_32BIT ? VFP_LO_REGS : NO_REGS"
"The VFP registers @code{s0}-@code{s31}.")

diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md

@@ -517,7 +517,7 @@
[(match_operand 4 "cc_register" "") (const_int 0)])
(match_operand:SF 1 "s_register_operand" "0,r")
(match_operand:SF 2 "s_register_operand" "r,0")))]
- "TARGET_THUMB2 && TARGET_SOFT_FLOAT"
+ "TARGET_THUMB2 && TARGET_SOFT_FLOAT && !TARGET_HAVE_MVE"
"@
it\\t%D3\;mov%D3\\t%0, %2
it\\t%d3\;mov%d3\\t%0, %1"

diff --git a/gcc/config/arm/unspecs.md b/gcc/config/arm/unspecs.md

@@ -170,6 +170,7 @@
UNSPEC_TORC ; Used by the intrinsic form of the iWMMXt TORC instruction.
UNSPEC_TORVSC ; Used by the intrinsic form of the iWMMXt TORVSC instruction.
UNSPEC_TEXTRC ; Used by the intrinsic form of the iWMMXt TEXTRC instruction.
+ UNSPEC_GET_FPSCR ; Represent fetch of FPSCR content.
])
@@ -216,7 +217,6 @@
VUNSPEC_SLX ; Represent a store-register-release-exclusive.
VUNSPEC_LDA ; Represent a store-register-acquire.
VUNSPEC_STL ; Represent a store-register-release.
- VUNSPEC_GET_FPSCR ; Represent fetch of FPSCR content.
VUNSPEC_SET_FPSCR ; Represent assign of FPSCR content.
VUNSPEC_PROBE_STACK_RANGE ; Represent stack range probing.
VUNSPEC_CDP ; Represent the coprocessor cdp instruction.

diff --git a/gcc/config/arm/vfp.md b/gcc/config/arm/vfp.md

@@ -74,10 +74,10 @@
(define_insn "*thumb2_movhi_vfp"
[(set
(match_operand:HI 0 "nonimmediate_operand"
- "=rk, r, l, r, m, r, *t, r, *t")
+ "=rk, r, l, r, m, r, *t, r, *t, Up, r")
(match_operand:HI 1 "general_operand"
- "rk, I, Py, n, r, m, r, *t, *t"))]
- "TARGET_THUMB2 && TARGET_HARD_FLOAT
+ "rk, I, Py, n, r, m, r, *t, *t, r, Up"))]
+ "TARGET_THUMB2 && TARGET_VFP_BASE
&& !TARGET_VFP_FP16INST
&& (register_operand (operands[0], HImode)
|| register_operand (operands[1], HImode))"
@@ -99,20 +99,24 @@
return "vmov%?\t%0, %1\t%@ int";
case 8:
return "vmov%?.f32\t%0, %1\t%@ int";
+ case 9:
+ return "vmsr%?\t P0, %1\t@ movhi";
+ case 10:
+ return "vmrs%?\t %0, P0\t@ movhi";
default:
gcc_unreachable ();
}
}
[(set_attr "predicable" "yes")
(set_attr "predicable_short_it"
- "yes, no, yes, no, no, no, no, no, no")
+ "yes, no, yes, no, no, no, no, no, no, no, no")
(set_attr "type"
"mov_reg, mov_imm, mov_imm, mov_imm, store_4, load_4,\
- f_mcr, f_mrc, fmov")
- (set_attr "arch" "*, *, *, v6t2, *, *, *, *, *")
- (set_attr "pool_range" "*, *, *, *, *, 4094, *, *, *")
- (set_attr "neg_pool_range" "*, *, *, *, *, 250, *, *, *")
- (set_attr "length" "2, 4, 2, 4, 4, 4, 4, 4, 4")]
+ f_mcr, f_mrc, fmov, mve_move, mve_move")
+ (set_attr "arch" "*, *, *, v6t2, *, *, *, *, *, mve, mve")
+ (set_attr "pool_range" "*, *, *, *, *, 4094, *, *, *, *, *")
+ (set_attr "neg_pool_range" "*, *, *, *, *, 250, *, *, *, *, *")
+ (set_attr "length" "2, 4, 2, 4, 4, 4, 4, 4, 4, 4, 4")]
)
;; Patterns for HI moves which provide more data transfer instructions when FP16
@@ -170,10 +174,10 @@
(define_insn "*thumb2_movhi_fp16"
[(set
(match_operand:HI 0 "nonimmediate_operand"
- "=rk, r, l, r, m, r, *t, r, *t")
+ "=rk, r, l, r, m, r, *t, r, *t, Up, r")
(match_operand:HI 1 "general_operand"
- "rk, I, Py, n, r, m, r, *t, *t"))]
- "TARGET_THUMB2 && TARGET_VFP_FP16INST
+ "rk, I, Py, n, r, m, r, *t, *t, r, Up"))]
+ "TARGET_THUMB2 && (TARGET_VFP_FP16INST || TARGET_HAVE_MVE)
&& (register_operand (operands[0], HImode)
|| register_operand (operands[1], HImode))"
{
@@ -194,21 +198,25 @@
return "vmov.f16\t%0, %1\t%@ int";
case 8:
return "vmov%?.f32\t%0, %1\t%@ int";
+ case 9:
+ return "vmsr%?\tP0, %1\t%@ movhi";
+ case 10:
+ return "vmrs%?\t%0, P0\t%@ movhi";
default:
gcc_unreachable ();
}
}
[(set_attr "predicable"
- "yes, yes, yes, yes, yes, yes, no, no, yes")
+ "yes, yes, yes, yes, yes, yes, no, no, yes, yes, yes")
(set_attr "predicable_short_it"
- "yes, no, yes, no, no, no, no, no, no")
+ "yes, no, yes, no, no, no, no, no, no, no, no")
(set_attr "type"
"mov_reg, mov_imm, mov_imm, mov_imm, store_4, load_4,\
- f_mcr, f_mrc, fmov")
- (set_attr "arch" "*, *, *, v6t2, *, *, *, *, *")
- (set_attr "pool_range" "*, *, *, *, *, 4094, *, *, *")
- (set_attr "neg_pool_range" "*, *, *, *, *, 250, *, *, *")
- (set_attr "length" "2, 4, 2, 4, 4, 4, 4, 4, 4")]
+ f_mcr, f_mrc, fmov, mve_move, mve_move")
+ (set_attr "arch" "*, *, *, v6t2, *, *, *, *, *, mve, mve")
+ (set_attr "pool_range" "*, *, *, *, *, 4094, *, *, *, *, *")
+ (set_attr "neg_pool_range" "*, *, *, *, *, 250, *, *, *, *, *")
+ (set_attr "length" "2, 4, 2, 4, 4, 4, 4, 4, 4, 4, 4")]
)
;; SImode moves
@@ -258,9 +266,11 @@
;; is chosen with length 2 when the instruction is predicated for
;; arm_restrict_it.
(define_insn "*thumb2_movsi_vfp"
- [(set (match_operand:SI 0 "nonimmediate_operand" "=rk,r,l,r,r,lk*r,m,*t, r,*t,*t, *Uv")
- (match_operand:SI 1 "general_operand" "rk,I,Py,K,j,mi,lk*r, r,*t,*t,*UvTu,*t"))]
- "TARGET_THUMB2 && TARGET_HARD_FLOAT
+ [(set (match_operand:SI 0 "nonimmediate_operand" "=rk,r,l,r,r,l,*hk,m,*m,*t,\
+ r,*t,*t,*Uv, Up, r,Uf,r")
+ (match_operand:SI 1 "general_operand" "rk,I,Py,K,j,mi,*mi,l,*hk,r,*t,\
+ *t,*UvTu,*t, r, Up,r,Uf"))]
+ "TARGET_THUMB2 && TARGET_VFP_BASE
&& ( s_register_operand (operands[0], SImode)
|| s_register_operand (operands[1], SImode))"
"*
@@ -275,30 +285,44 @@
case 4:
return \"movw%?\\t%0, %1\";
case 5:
+ case 6:
/* Cannot load it directly, split to load it via MOV / MOVT. */
if (!MEM_P (operands[1]) && arm_disable_literal_pool)
return \"#\";
return \"ldr%?\\t%0, %1\";
- case 6:
- return \"str%?\\t%1, %0\";
case 7:
- return \"vmov%?\\t%0, %1\\t%@ int\";
case 8:
- return \"vmov%?\\t%0, %1\\t%@ int\";
+ return \"str%?\\t%1, %0\";
case 9:
+ return \"vmov%?\\t%0, %1\\t%@ int\";
+ case 10:
+ return \"vmov%?\\t%0, %1\\t%@ int\";
+ case 11:
return \"vmov%?.f32\\t%0, %1\\t%@ int\";
- case 10: case 11:
+ case 12: case 13:
return output_move_vfp (operands);
+ case 14:
+ return \"vmsr\\t P0, %1\";
+ case 15:
+ return \"vmrs\\t %0, P0\";
+ case 16:
+ return \"mcr\\tp10, 7, %1, cr1, cr0, 0\\t @SET_FPSCR\";
+ case 17:
+ return \"mrc\\tp10, 7, %0, cr1, cr0, 0\\t @GET_FPSCR\";
default:
gcc_unreachable ();
}
"
[(set_attr "predicable" "yes")
- (set_attr "predicable_short_it" "yes,no,yes,no,no,no,no,no,no,no,no,no")
- (set_attr "type" "mov_reg,mov_reg,mov_reg,mvn_reg,mov_imm,load_4,store_4,f_mcr,f_mrc,fmov,f_loads,f_stores")
- (set_attr "length" "2,4,2,4,4,4,4,4,4,4,4,4")
- (set_attr "pool_range" "*,*,*,*,*,1018,*,*,*,*,1018,*")
- (set_attr "neg_pool_range" "*,*,*,*,*, 0,*,*,*,*,1008,*")]
+ (set_attr "predicable_short_it" "yes,no,yes,no,no,no,no,no,no,no,no,no,no,\
+ no,no,no,no,no")
+ (set_attr "type" "mov_reg,mov_reg,mov_reg,mvn_reg,mov_imm,load_4,load_4,\
+ store_4,store_4,f_mcr,f_mrc,fmov,f_loads,f_stores,mve_move,\
+ mve_move,mrs,mrs")
+ (set_attr "length" "2,4,2,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4")
+ (set_attr "pool_range" "*,*,*,*,*,1018,4094,*,*,*,*,*,1018,*,*,*,*,*")
+ (set_attr "arch" "*,*,*,*,*,*,*,*,*,*,*,*,*,*,mve,mve,mve,mve")
+ (set_attr "neg_pool_range" "*,*,*,*,*, 0, 0,*,*,*,*,*,1008,*,*,*,*,*")]
)
@@ -306,12 +330,12 @@
(define_insn "*movdi_vfp"
[(set (match_operand:DI 0 "nonimmediate_di_operand" "=r,r,r,r,r,r,m,w,!r,w,w, Uv")
- (match_operand:DI 1 "di_operand" "r,rDa,Db,Dc,mi,mi,r,r,w,w,UvTu,w"))]
- "TARGET_32BIT && TARGET_HARD_FLOAT
+ (match_operand:DI 1 "di_operand" "r,rDa,Db,Dc,mi,mi,r,r,w,w,UvTu,w"))]
+ "TARGET_32BIT && TARGET_VFP_BASE
&& ( register_operand (operands[0], DImode)
|| register_operand (operands[1], DImode))
- && !(TARGET_NEON && CONST_INT_P (operands[1])
- && simd_immediate_valid_for_move (operands[1], DImode, NULL, NULL))"
+ && !((TARGET_NEON || TARGET_HAVE_MVE) && CONST_INT_P (operands[1])
+ && simd_immediate_valid_for_move (operands[1], DImode, NULL, NULL))"
"*
switch (which_alternative)
{
@@ -333,7 +357,7 @@
case 8:
return \"vmov%?\\t%Q0, %R0, %P1\\t%@ int\";
case 9:
- if (TARGET_VFP_SINGLE)
+ if (TARGET_VFP_SINGLE || TARGET_HAVE_MVE)
return \"vmov%?.f32\\t%0, %1\\t%@ int\;vmov%?.f32\\t%p0, %p1\\t%@ int\";
else
return \"vmov%?.f64\\t%P0, %P1\\t%@ int\";
@@ -390,9 +414,15 @@
case 6: /* S register from immediate. */
return \"vmov.f16\\t%0, %1\t%@ __<fporbf>\";
case 7: /* S register from memory. */
- return \"vld1.16\\t{%z0}, %A1\";
+ if (TARGET_HAVE_MVE)
+ return \"vldr.16\\t%0, %A1\";
+ else
+ return \"vld1.16\\t{%z0}, %A1\";
case 8: /* Memory from S register. */
- return \"vst1.16\\t{%z1}, %A0\";
+ if (TARGET_HAVE_MVE)
+ return \"vstr.16\\t%1, %A0\";
+ else
+ return \"vst1.16\\t{%z1}, %A0\";
case 9: /* ARM register from constant. */
{
long bits;
@@ -593,7 +623,7 @@
(define_insn "*thumb2_movsf_vfp"
[(set (match_operand:SF 0 "nonimmediate_operand" "=t,?r,t, t ,Uv,r ,m,t,r")
(match_operand:SF 1 "hard_sf_operand" " ?r,t,Dv,UvHa,t, mHa,r,t,r"))]
- "TARGET_THUMB2 && TARGET_HARD_FLOAT
+ "TARGET_THUMB2 && TARGET_VFP_BASE
&& ( s_register_operand (operands[0], SFmode)
|| s_register_operand (operands[1], SFmode))"
"*
@@ -682,7 +712,7 @@
(define_insn "*thumb2_movdf_vfp"
[(set (match_operand:DF 0 "nonimmediate_soft_df_operand" "=w,?r,w ,w,w ,Uv,r ,m,w,r")
(match_operand:DF 1 "hard_df_operand" " ?r,w,Dy,G,UvHa,w, mHa,r, w,r"))]
- "TARGET_THUMB2 && TARGET_HARD_FLOAT
+ "TARGET_THUMB2 && TARGET_VFP_BASE
&& ( register_operand (operands[0], DFmode)
|| register_operand (operands[1], DFmode))"
"*
@@ -760,7 +790,7 @@
[(match_operand 4 "cc_register" "") (const_int 0)])
(match_operand:SF 1 "s_register_operand" "0,t,t,0,?r,?r,0,t,t")
(match_operand:SF 2 "s_register_operand" "t,0,t,?r,0,?r,t,0,t")))]
- "TARGET_THUMB2 && TARGET_HARD_FLOAT && !arm_restrict_it"
+ "TARGET_THUMB2 && TARGET_VFP_BASE && !arm_restrict_it"
"@
it\\t%D3\;vmov%D3.f32\\t%0, %2
it\\t%d3\;vmov%d3.f32\\t%0, %1
@@ -806,7 +836,8 @@
[(match_operand 4 "cc_register" "") (const_int 0)])
(match_operand:DF 1 "s_register_operand" "0,w,w,0,?r,?r,0,w,w")
(match_operand:DF 2 "s_register_operand" "w,0,w,?r,0,?r,w,0,w")))]
- "TARGET_THUMB2 && TARGET_HARD_FLOAT && TARGET_VFP_DOUBLE && !arm_restrict_it"
+ "TARGET_THUMB2 && TARGET_VFP_BASE && TARGET_VFP_DOUBLE
+ && !arm_restrict_it"
"@
it\\t%D3\;vmov%D3.f64\\t%P0, %P2
it\\t%d3\;vmov%d3.f64\\t%P0, %P1
@@ -1977,7 +2008,7 @@
[(set (match_operand:BLK 0 "memory_operand" "=m")
(unspec:BLK [(match_operand:DF 1 "vfp_register_operand" "")]
UNSPEC_PUSH_MULT))])]
- "TARGET_32BIT && TARGET_HARD_FLOAT"
+ "TARGET_32BIT && TARGET_VFP_BASE"
"* return vfp_output_vstmd (operands);"
[(set_attr "type" "f_stored")]
)
@@ -2065,16 +2096,18 @@
;; Write Floating-point Status and Control Register.
(define_insn "set_fpscr"
- [(unspec_volatile [(match_operand:SI 0 "register_operand" "r")] VUNSPEC_SET_FPSCR)]
- "TARGET_HARD_FLOAT"
+ [(set (reg:SI VFPCC_REGNUM)
+ (unspec_volatile:SI
+ [(match_operand:SI 0 "register_operand" "r")] VUNSPEC_SET_FPSCR))]
+ "TARGET_VFP_BASE"
"mcr\\tp10, 7, %0, cr1, cr0, 0\\t @SET_FPSCR"
[(set_attr "type" "mrs")])
;; Read Floating-point Status and Control Register.
(define_insn "get_fpscr"
[(set (match_operand:SI 0 "register_operand" "=r")
- (unspec_volatile:SI [(const_int 0)] VUNSPEC_GET_FPSCR))]
- "TARGET_HARD_FLOAT"
+ (unspec:SI [(reg:SI VFPCC_REGNUM)] UNSPEC_GET_FPSCR))]
+ "TARGET_VFP_BASE"
"mrc\\tp10, 7, %0, cr1, cr0, 0\\t @GET_FPSCR"
[(set_attr "type" "mrs")])

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c

@@ -0,0 +1,14 @@
/* { dg-do compile } */
/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
/* { dg-additional-options "-march=armv8.1-m.main+mve.fp -mfloat-abi=hard -mthumb" } */
#include "arm_mve.h"
int8x16_t
foo1 (int8x16_t value)
{
int8x16_t b = value;
return b;
}
/* { dg-final { scan-assembler "\.fpu fpv5-sp-d16" } } */

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c

@@ -0,0 +1,14 @@
/* { dg-do compile } */
/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
/* { dg-additional-options "-march=armv8.1-m.main+mve.fp -mfloat-abi=softfp -mthumb" } */
#include "arm_mve.h"
int8x16_t
foo1 (int8x16_t value)
{
int8x16_t b = value;
return b;
}
/* { dg-final { scan-assembler "\.fpu fpv5-sp-d16" } } */

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu1.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu1.c

@@ -0,0 +1,14 @@
/* { dg-do compile } */
/* { dg-require-effective-target arm_v8_1m_mve_ok } */
/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=hard -mthumb" } */
#include "arm_mve.h"
int8x16_t
foo1 (int8x16_t value)
{
int8x16_t b = value;
return b;
}
/* { dg-final { scan-assembler-not "\.fpu softvfp" } } */

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu2.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu2.c

@@ -0,0 +1,14 @@
/* { dg-do compile } */
/* { dg-require-effective-target arm_v8_1m_mve_ok } */
/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=softfp -mthumb" } */
#include "arm_mve.h"
int8x16_t
foo1 (int8x16_t value)
{
int8x16_t b = value;
return b;
}
/* { dg-final { scan-assembler-not "\.fpu softvfp" } } */

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu3.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu3.c

@@ -0,0 +1,12 @@
/* { dg-do compile } */
/* { dg-require-effective-target arm_v8_1m_mve_ok } */
/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=soft -mthumb" } */
int
foo1 (int value)
{
int b = value;
return b;
}
/* { dg-final { scan-assembler "\.fpu softvfp" } } */