Juzhe-Zhong f0e28d8c13 RISC-V: Fix failed hoist in LICM of vmv.v.x instruction
Confirmed that the dynamic LMUL algorithm correctly chooses LMUL = 4 for the PR:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111848

However, the generated code spills vector registers heavily.

The root cause is that the vmv.v.x is not hoisted out of the loop, which
increases the register pressure of the SLP loop.

So, change the CONST_VECTOR move into a vec_duplicate splitter so that we gain better optimizations:

1. Better LICM.
2. More opportunities of transforming 'vv' into 'vx' in the future.
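
To illustrate what the hoist buys, here is a minimal scalar sketch in C. It is
not the GCC implementation: LANES, splat, mask_rows_in_loop and
mask_rows_hoisted are hypothetical names, and the arrays only stand in for
vector registers. The broadcast plays the role of vmv.v.x; both functions
compute the same result, but only the second keeps the broadcast out of the
loop.

```c
#include <stddef.h>
#include <stdint.h>

enum { LANES = 8 }; /* hypothetical lane count, for illustration only */

/* Broadcast one scalar to every lane: the job vmv.v.x does. */
static void splat(uint16_t *v, uint16_t x)
{
  for (size_t i = 0; i < LANES; i++)
    v[i] = x;
}

/* Not hoisted: the loop-invariant broadcast is rebuilt on every iteration. */
static void mask_rows_in_loop(uint16_t *data, size_t rows, uint16_t x)
{
  for (size_t r = 0; r < rows; r++)
    {
      uint16_t m[LANES];
      splat(m, x);
      for (size_t i = 0; i < LANES; i++)
        data[r * LANES + i] &= m[i];
    }
}

/* Hoisted: the broadcast is built once, before the loop. */
static void mask_rows_hoisted(uint16_t *data, size_t rows, uint16_t x)
{
  uint16_t m[LANES];
  splat(m, x);
  for (size_t r = 0; r < rows; r++)
    for (size_t i = 0; i < LANES; i++)
      data[r * LANES + i] &= m[i];
}
```

In the real patch the values derived from the broadcast are also
loop-invariant, so hoisting it lets them move out of the loop as well.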

Before this patch:

f3:
        ble     a4,zero,.L8
        csrr    t0,vlenb
        slli    t1,t0,4
        csrr    a6,vlenb
        sub     sp,sp,t1
        csrr    a5,vlenb
        slli    a6,a6,3
        slli    a5,a5,2
        add     a6,a6,sp
        vsetvli a7,zero,e16,m8,ta,ma
        slli    a4,a4,3
        vid.v   v8
        addi    t6,a5,-1
        vand.vi v8,v8,-2
        neg     t5,a5
        vs8r.v  v8,0(sp)
        vadd.vi v8,v8,1
        vs8r.v  v8,0(a6)
        j       .L4
.L12:
        vsetvli a7,zero,e16,m8,ta,ma
.L4:
        csrr    t0,vlenb
        slli    t0,t0,3
        vl8re16.v       v16,0(sp)
        add     t0,t0,sp
        vmv.v.x v8,t6
        mv      t1,a4
        vand.vv v24,v16,v8
        mv      a6,a4
        vl8re16.v       v16,0(t0)
        vand.vv v8,v16,v8
        bleu    a4,a5,.L3
        mv      a6,a5
.L3:
        vsetvli zero,a6,e8,m4,ta,ma
        vle8.v  v20,0(a2)
        vle8.v  v16,0(a3)
        vsetvli a7,zero,e8,m4,ta,ma
        vrgatherei16.vv v4,v20,v24
        vadd.vv v4,v16,v4
        vsetvli zero,a6,e8,m4,ta,ma
        vse8.v  v4,0(a0)
        vle8.v  v20,0(a2)
        vsetvli a7,zero,e8,m4,ta,ma
        vrgatherei16.vv v4,v20,v8
        vadd.vv v4,v4,v16
        vsetvli zero,a6,e8,m4,ta,ma
        vse8.v  v4,0(a1)
        add     a4,a4,t5
        add     a0,a0,a5
        add     a3,a3,a5
        add     a1,a1,a5
        add     a2,a2,a5
        bgtu    t1,a5,.L12
        csrr    t0,vlenb
        slli    t1,t0,4
        add     sp,sp,t1
        jr      ra
.L8:
        ret

After this patch:

f3:
	ble	a4,zero,.L6
	csrr	a6,vlenb
	csrr	a5,vlenb
	slli	a6,a6,2
	slli	a5,a5,2
	addi	a6,a6,-1
	slli	a4,a4,3
	neg	t5,a5
	vsetvli	t1,zero,e16,m8,ta,ma
	vmv.v.x	v24,a6
	vid.v	v8
	vand.vi	v8,v8,-2
	vadd.vi	v16,v8,1
	vand.vv	v8,v8,v24
	vand.vv	v16,v16,v24
.L4:
	mv	t1,a4
	mv	a6,a4
	bleu	a4,a5,.L3
	mv	a6,a5
.L3:
	vsetvli	zero,a6,e8,m4,ta,ma
	vle8.v	v28,0(a2)
	vle8.v	v24,0(a3)
	vsetvli	a7,zero,e8,m4,ta,ma
	vrgatherei16.vv	v4,v28,v8
	vadd.vv	v4,v24,v4
	vsetvli	zero,a6,e8,m4,ta,ma
	vse8.v	v4,0(a0)
	vle8.v	v28,0(a2)
	vsetvli	a7,zero,e8,m4,ta,ma
	vrgatherei16.vv	v4,v28,v16
	vadd.vv	v4,v4,v24
	vsetvli	zero,a6,e8,m4,ta,ma
	vse8.v	v4,0(a1)
	add	a4,a4,t5
	add	a0,a0,a5
	add	a3,a3,a5
	add	a1,a1,a5
	add	a2,a2,a5
	bgtu	t1,a5,.L4
.L6:
	ret
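
For readers who do not want to trace the assembly, the following C sketch
models, lane by lane, what the hoisted prologue and one gather/add step appear
to compute. It is a reading of the "After this patch" listing, not code from
the patch or the PR test case: build_gather_indices and gather_add are
hypothetical names, n stands for the number of lanes, and n is assumed to be a
power of two so that n - 1 works as a mask (a6 = 4 * vlenb - 1 above).

```c
#include <stddef.h>
#include <stdint.h>

/* Model of the loop-invariant prologue:
     vmv.v.x v24,a6      mask = n - 1 in every lane
     vid.v   v8          lane index i
     vand.vi v8,v8,-2    clear bit 0: 0,0,2,2,4,4,...
     vadd.vi v16,v8,1    1,1,3,3,5,5,...
     vand.vv ...,v24     wrap both into [0, n)
   n is assumed to be a power of two.  */
static void build_gather_indices(uint16_t *even_idx, uint16_t *odd_idx,
                                 size_t n)
{
  uint16_t mask = (uint16_t)(n - 1);
  for (size_t i = 0; i < n; i++)
    {
      uint16_t base = (uint16_t)(i & ~(size_t)1);
      even_idx[i] = (uint16_t)(base & mask);
      odd_idx[i] = (uint16_t)((base + 1) & mask);
    }
}

/* Model of vrgatherei16.vv followed by vadd.vv on 8-bit elements:
   dst[i] = src[idx[i]] + addend[i], wrapping modulo 256.  */
static void gather_add(uint8_t *dst, const uint8_t *src,
                       const uint8_t *addend, const uint16_t *idx, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = (uint8_t)(src[idx[i]] + addend[i]);
}
```

Both index vectors depend only on the lane count, which is why they can be
computed once before .L4 instead of being spilled and reloaded inside the
loop.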

Note that this patch triggers multiple FAILs:
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_arith_run-3.c execution test
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_arith_run-3.c execution test
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_arith_run-4.c execution test
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_arith_run-4.c execution test
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_arith_run-8.c execution test
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_arith_run-8.c execution test
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c execution test
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c execution test
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-2.c execution test
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-2.c execution test
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-1.c execution test
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-1.c execution test
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-2.c execution test
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-2.c execution test

All of these failures are caused by bugs in the VSETVL pass:

   10dd4:       0c707057                vsetvli zero,zero,e8,mf2,ta,ma
   10dd8:       5e06b8d7                vmv.v.i v17,13
   10ddc:       9ed030d7                vmv1r.v v1,v13
   10de0:       b21040d7                vncvt.x.x.w     v1,v1           ----> raises an illegal instruction exception, since there is no SEW = 8 -> SEW = 4 narrowing.
   10de4:       5e0785d7                vmv.v.v v11,v15

Confirmed that the recent VSETVL refactor patch (https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633231.html) fixes all of them.

So this patch should be committed after the VSETVL refactor patch.

	PR target/111848

gcc/ChangeLog:

	* config/riscv/riscv-selftests.cc (run_const_vector_selftests): Adapt selftest.
	* config/riscv/riscv-v.cc (expand_const_vector): Change it into vec_duplicate splitter.

gcc/testsuite/ChangeLog:

	* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-7.c: Adapt test.
	* gcc.dg/vect/costmodel/riscv/rvv/pr111848.c: New test.
2023-10-20 11:51:21 +08:00
c++tools
config
contrib Daily bump. 2023-10-06 00:17:37 +00:00
fixincludes
gcc RISC-V: Fix failed hoist in LICM of vmv.v.x instruction 2023-10-20 11:51:21 +08:00
gnattools
gotools
include Daily bump. 2023-10-13 00:18:18 +00:00
INSTALL
intl
libada
libatomic
libbacktrace
libcc1
libcody
libcpp Daily bump. 2023-10-09 00:17:27 +00:00
libdecnumber
libffi
libgcc Daily bump. 2023-10-19 00:18:05 +00:00
libgfortran
libgm2
libgo
libgomp Daily bump. 2023-10-16 00:17:13 +00:00
libiberty
libitm
libobjc
libphobos Daily bump. 2023-10-17 00:17:33 +00:00
libquadmath
libsanitizer
libssp
libstdc++-v3 Daily bump. 2023-10-20 00:16:39 +00:00
libvtv
lto-plugin
maintainer-scripts
zlib
.dir-locals.el
.gitattributes
.gitignore
ABOUT-NLS
ar-lib
ChangeLog Daily bump. 2023-10-16 00:17:13 +00:00
ChangeLog.jit
ChangeLog.tree-ssa
compile
config-ml.in
config.guess
config.rpath
config.sub
configure
configure.ac
COPYING
COPYING3
COPYING3.LIB
COPYING.LIB
COPYING.RUNTIME
depcomp
install-sh
libtool-ldflags
libtool.m4
lt~obsolete.m4
ltgcc.m4
ltmain.sh
ltoptions.m4
ltsugar.m4
ltversion.m4
MAINTAINERS MAINTAINERS: Fix write after approval name order 2023-10-11 14:53:44 +02:00
Makefile.def sim: add distclean dep for gnulib 2023-10-15 22:40:42 +05:45
Makefile.in sim: add distclean dep for gnulib 2023-10-15 22:40:42 +05:45
Makefile.tpl Makefile.tpl: disable -Werror for feedback stage [PR111663] 2023-10-06 20:25:20 +01:00
missing
mkdep
mkinstalldirs
move-if-change
multilib.am
README
SECURITY.txt secpol: consistent indentation 2023-10-05 12:00:39 -04:00
symlink-tree
test-driver
ylwrap

This directory contains the GNU Compiler Collection (GCC).

The GNU Compiler Collection is free software.  See the files whose
names start with COPYING for copying permission.  The manuals, and
some of the runtime libraries, are under different terms; see the
individual source files for details.

The directory INSTALL contains copies of the installation information
as HTML and plain text.  The source of this information is
gcc/doc/install.texi.  The installation information includes details
of what is included in the GCC sources and what files GCC installs.

See the file gcc/doc/gcc.texi (together with other files that it
includes) for usage and porting information.  An online readable
version of the manual is in the files gcc/doc/gcc.info*.

See http://gcc.gnu.org/bugs/ for how to report bugs usefully.

Copyright years on GCC source files may be listed using range
notation, e.g., 1987-2012, indicating that every year in the range,
inclusive, is a copyrightable year that could otherwise be listed
individually.