Roger Sayle be39636d9f Improve code generation for x86_64 [PR 92180]
This patch catches a missed optimization opportunity where GCC currently
generates worse code than LLVM.  The issue, as nicely analyzed in bugzilla,
boils down to the following three insns in combine:

(insn 6 5 7 2 (parallel [
            (set (reg:DI 85)
                (ashift:DI (reg:DI 85)
                    (const_int 32 [0x20])))
            (clobber (reg:CC 17 flags))
        ]) "pr92180.c":4:10 564 {*ashldi3_1}
     (expr_list:REG_UNUSED (reg:CC 17 flags)
        (nil)))
(insn 7 6 14 2 (parallel [
            (set (reg:DI 84)
                (ior:DI (reg:DI 84)
                    (reg:DI 85)))
            (clobber (reg:CC 17 flags))
        ]) "pr92180.c":4:10 454 {*iordi_1}
     (expr_list:REG_DEAD (reg:DI 85)
        (expr_list:REG_UNUSED (reg:CC 17 flags)
            (nil))))
(insn 14 7 15 2 (set (reg/i:SI 0 ax)
        (subreg:SI (reg:DI 84) 0)) "pr92180.c":5:1 67 {*movsi_internal}
     (expr_list:REG_DEAD (reg:DI 84)
        (nil)))

Normally, combine/simplify-rtx would notice that insns 6 and 7
(which update highpart bits) are unnecessary as the final insn 14
only requires to lowpart bits.  The complication is that insn 14
sets a hard register in targetm.class_likely_spilled_p which
prevents combine from performing its simplifications, and removing
the redundant instructions.

At first glance a fix would appear to require changes to combine,
potentially affecting code generation on all small register class
targets...  An alternate (and I think clever) solution is to spot
that this problematic situation can be avoided by the backend.

At RTL expansion time, the middle-end has a clear separation between
pseudos and hard registers, so the RTL initially contains:

(insn 9 8 10 2 (set (reg:SI 86)
        (subreg:SI (reg:DI 82 [ _1 ]) 0)) "pr92180.c":6:10 -1
     (nil))
(insn 10 9 14 2 (set (reg:SI 83 [ <retval> ])
        (reg:SI 86)) "pr92180.c":6:10 -1
     (nil))
(insn 14 10 15 2 (set (reg/i:SI 0 ax)
        (reg:SI 83 [ <retval> ])) "pr92180.c":7:1 -1
     (nil))

which can be optimized without problems by combine; it is only the
intervening passes (initially fwprop1) that propagate computations
into sets of hard registers, and disable those opportunities.

The solution proposed here is to have the x86 backend/recog prevent
early RTL passes composing instructions (that set likely_spilled hard
registers) that they (combine) can't simplify, until after reload.
We allow sets from pseudo registers, immediate constants and memory
accesses, but anything more complicated is performed via a temporary
pseudo.  Not only does this simplify things for the register allocator,
but any remaining register-to-register moves are easily cleaned up
by the late optimization passes after reload, such as peephole2 and
cprop_hardreg.

This patch has been tested on x86_64-pc-linux-gnu with a
"make bootstrap" and a "make -k check" with no new failures.
Ok for mainline?

gcc
	PR rtl-optimization/92180
	* config/i386/i386.c (ix86_hardreg_mov_ok): New function to
	determine whether (set DST SRC) should be allowed at this point.
	* config/i386/i386-protos.h (ix86_hardreg_mov_ok): Prototype here.
	* config/i386/i386-expand.c (ix86_expand_move): Check whether
	this is a complex set of a likely spilled hard register, and if
	so place the value in a pseudo, and load the hard reg from it.
	* config/i386/i386.md (*movdi_internal, *movsi_internal)
	(*movhi_internal, *movqi_internal): Make these instructions
	conditional on ix86_hardreg_mov_ok.
	(*lea<mode>): Make this define_insn_and_split conditional on
	ix86_hardreg_mov_ok.

gcc/testsuite
	PR rtl-optimization/92180
	* gcc.target/i386/pr92180.c: New test.
2020-11-16 16:56:45 -07:00
2020-11-16 20:26:28 +01:00
2020-10-30 00:16:29 +00:00
2020-10-21 00:16:36 +00:00
2020-11-12 00:16:39 +00:00
2020-11-14 00:16:38 +00:00
2020-11-16 00:16:31 +00:00
2020-11-02 20:53:00 +00:00
2020-11-10 07:25:32 -08:00
2020-11-15 00:16:26 +00:00
2020-11-14 00:16:38 +00:00
2020-11-14 00:16:38 +00:00
2020-11-14 00:16:38 +00:00
2020-11-16 12:52:43 +01:00
2020-11-14 00:16:38 +00:00

This directory contains the GNU Compiler Collection (GCC).

The GNU Compiler Collection is free software.  See the files whose
names start with COPYING for copying permission.  The manuals, and
some of the runtime libraries, are under different terms; see the
individual source files for details.

The directory INSTALL contains copies of the installation information
as HTML and plain text.  The source of this information is
gcc/doc/install.texi.  The installation information includes details
of what is included in the GCC sources and what files GCC installs.

See the file gcc/doc/gcc.texi (together with other files that it
includes) for usage and porting information.  An online readable
version of the manual is in the files gcc/doc/gcc.info*.

See http://gcc.gnu.org/bugs/ for how to report bugs usefully.

Copyright years on GCC source files may be listed using range
notation, e.g., 1987-2012, indicating that every year in the range,
inclusive, is a copyrightable year that could otherwise be listed
individually.
Description
No description provided
Readme 2.1 GiB
Languages
C++ 31.9%
C 31.3%
Ada 12%
D 6.5%
Go 6.4%
Other 11.5%