mirror of
git://gcc.gnu.org/git/gcc.git
synced 2025-03-03 11:06:09 +08:00
Accept the architecture configure option and resolve build failures. This is enough to build binaries, but I've not got a device to test it on, so there are probably runtime issues to fix. The cache control instructions might be unsafe (or too conservative), and the kernel metadata might be off. Vector reductions will need to be reworked for RDNA2. In principle, it would be better to use wavefrontsize32 for this architecture, but that would mean switching everything to allow SImode masks, so wavefrontsize64 it is. The multilib is not included in the default configuration so either configure --with-arch=gfx1030 or include it in --with-multilib-list=gfx1030,.... The majority of this patch has no effect on other devices, but changing from using scalar writes for the exit value to vector writes means we don't need the scalar cache write-back instruction anywhere (which doesn't exist in RDNA2). gcc/ChangeLog: * config.gcc: Allow --with-arch=gfx1030. * config/gcn/gcn-hsa.h (NO_XNACK): gfx1030 does not support xnack. (ASM_SPEC): gfx1030 needs -mattr=+wavefrontsize64 set. * config/gcn/gcn-opts.h (enum processor_type): Add PROCESSOR_GFX1030. (TARGET_GFX1030): New. (TARGET_RDNA2): New. * config/gcn/gcn-valu.md (@dpp_move<mode>): Disable for RDNA2. (addc<mode>3<exec_vcc>): Add RDNA2 syntax variant. (subc<mode>3<exec_vcc>): Likewise. (<convop><mode><vndi>2_exec): Add RDNA2 alternatives. (vec_cmp<mode>di): Likewise. (vec_cmp<u><mode>di): Likewise. (vec_cmp<mode>di_exec): Likewise. (vec_cmp<u><mode>di_exec): Likewise. (vec_cmp<mode>di_dup): Likewise. (vec_cmp<mode>di_dup_exec): Likewise. (reduc_<reduc_op>_scal_<mode>): Disable for RDNA2. (*<reduc_op>_dpp_shr_<mode>): Likewise. (*plus_carry_dpp_shr_<mode>): Likewise. (*plus_carry_in_dpp_shr_<mode>): Likewise. * config/gcn/gcn.cc (gcn_option_override): Recognise gfx1030. (gcn_global_address_p): RDNA2 only allows smaller offsets. (gcn_addr_space_legitimate_address_p): Likewise. (gcn_omp_device_kind_arch_isa): Recognise gfx1030. (gcn_expand_epilogue): Use VGPRs instead of SGPRs. (output_file_start): Configure gfx1030. * config/gcn/gcn.h (TARGET_CPU_CPP_BUILTINS): Add __RDNA2__; (ASSEMBLER_DIALECT): New. * config/gcn/gcn.md (rdna): New define_attr. (enabled): Use "rdna" attribute. (gcn_return): Remove s_dcache_wb. (addcsi3_scalar): Add RDNA2 syntax variant. (addcsi3_scalar_zero): Likewise. (addptrdi3): Likewise. (mulsi3): v_mul_lo_i32 should be v_mul_lo_u32 on all ISA. (*memory_barrier): Add RDNA2 syntax variant. (atomic_load<mode>): Add RDNA2 cache control variants, and disable scalar atomics for RDNA2. (atomic_store<mode>): Likewise. (atomic_exchange<mode>): Likewise. * config/gcn/gcn.opt (gpu_type): Add gfx1030. * config/gcn/mkoffload.cc (EF_AMDGPU_MACH_AMDGCN_GFX1030): New. (main): Recognise -march=gfx1030. * config/gcn/t-omp-device: Add gfx1030 isa. libgcc/ChangeLog: * config/gcn/amdgcn_veclib.h (CDNA3_PLUS): Set false for __RDNA2__. libgomp/ChangeLog: * plugin/plugin-gcn.c (EF_AMDGPU_MACH_AMDGCN_GFX1030): New. (isa_hsa_name): Recognise gfx1030. (isa_code): Likewise. * team.c (defined): Remove s_endpgm. |
||
---|---|---|
.. | ||
config | ||
plugin | ||
testsuite | ||
.gitattributes | ||
acc_prof.h | ||
acinclude.m4 | ||
aclocal.m4 | ||
affinity-fmt.c | ||
affinity.c | ||
alloc.c | ||
allocator.c | ||
atomic.c | ||
barrier.c | ||
ChangeLog | ||
ChangeLog.graphite | ||
config.h.in | ||
configure | ||
configure.ac | ||
configure.tgt | ||
critical.c | ||
env.c | ||
error.c | ||
fortran.c | ||
hashtab.h | ||
icv-device.c | ||
icv.c | ||
iter_ull.c | ||
iter.c | ||
libgomp_f.h.in | ||
libgomp_g.h | ||
libgomp-plugin.c | ||
libgomp-plugin.h | ||
libgomp.h | ||
libgomp.map | ||
libgomp.spec.in | ||
libgomp.texi | ||
lock.c | ||
loop_ull.c | ||
loop.c | ||
Makefile.am | ||
Makefile.in | ||
oacc-async.c | ||
oacc-cuda.c | ||
oacc-host.c | ||
oacc-init.c | ||
oacc-int.h | ||
oacc-mem.c | ||
oacc-parallel.c | ||
oacc-plugin.c | ||
oacc-plugin.h | ||
oacc-profiling.c | ||
oacc-target.c | ||
omp_lib.f90.in | ||
omp_lib.h.in | ||
omp.h.in | ||
openacc_lib.h | ||
openacc.f90 | ||
openacc.h | ||
ordered.c | ||
parallel.c | ||
priority_queue.c | ||
priority_queue.h | ||
scope.c | ||
sections.c | ||
secure_getenv.h | ||
single.c | ||
splay-tree.c | ||
splay-tree.h | ||
target.c | ||
task.c | ||
taskloop.c | ||
team.c | ||
teams.c | ||
work.c |