Jakub Jelinek 2c08964027 i386: Improve *concat<mode><dwi>3_{1,2,3,4} patterns [PR107627]
On the first testcase we've regressed since GCC 12 at -O2:
-       movq    8(%rsi), %rax
-       movq    %rdi, %r8
-       movq    (%rsi), %rdi
+       movq    (%rsi), %rax
+       movq    8(%rsi), %r8
        movl    %edx, %ecx
-       shrdq   %rdi, %rax
-       movq    %rax, (%r8)
+       xorl    %r9d, %r9d
+       movq    %rax, %rdx
+       xorl    %eax, %eax
+       orq     %r8, %rax
+       orq     %r9, %rdx
+       shrdq   %rdx, %rax
+       movq    %rax, (%rdi)
On the second testcase we've emitted such terrible code
with the useless xors and ors for a long time.
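
For orientation, source of roughly the following shape leads to the
double-word concatenation followed by a shift seen in the diff above.
This is a minimal sketch, not necessarily the committed pr107627-1.c;
the names u128, concat64 and shift_pair are made up for illustration,
and it assumes a 64-bit target where the GCC __int128 extension is
available.

```c
#include <stdint.h>

/* Assumption: unsigned __int128 is available (GCC extension on
   64-bit targets).  */
typedef unsigned __int128 u128;

/* Build a double-word value from two word-sized halves.  The low
   half is zero-extended and ORed into the shifted high half; this
   is the kind of expression the concat patterns are meant to match.  */
static inline u128
concat64 (uint64_t hi, uint64_t lo)
{
  return ((u128) hi << 64) | lo;
}

/* Load both halves from memory, concatenate them and shift the
   double-word value right by a variable count below 64, keeping
   only the low word of the result.  */
uint64_t
shift_pair (const uint64_t *y, unsigned z)
{
  return (uint64_t) (concat64 (y[0], y[1]) >> (z % 64));
}
```

Ideally the concatenation costs nothing beyond the two loads, so the
whole function reduces to the loads, one shrdq and a store by the
caller.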
The *concat<mode><dwi>3_{1,2,3,4} patterns were added for PR91681,
but they allow only register inputs and a register or offsettable
memory output.
The following patch fixes this by also allowing memory inputs in those
patterns, because each pattern is then split into 0-2 emit_move_insn
calls or one xchg, and those can handle loads from memory just fine.
To avoid narrowing memory loads (the source has a 128-bit (or for ia32
64-bit) load and we would turn it into a 64-bit (or for ia32 32-bit)
load), the register_operand -> nonimmediate_operand change is done only
for operands that are zero_extend arguments.  The o <- m, m, o <- m, r
and o <- r, m alternatives aren't used, because we'd lack registers to
perform the moves.  What is supported in addition to the current
ro <- r, r is r <- m, r and r <- r, m (there we just need to be careful
about corner cases: check which emit_move_insn we'd call and whether it
would clobber registers used in m's address before the load;
split_double_concat handles that now) and &r <- m, m (there I think the
early clobber is the easiest solution).
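
The r <- m, r corner case can be illustrated with a toy model in plain
C.  This is only a sketch of the ordering problem, not the code in
split_double_concat: struct machine, concat_mem_reg and the tiny
register file are hypothetical, the low half is assumed to come from
memory addressed by one register and the high half from another
register, and the final swap stands in for the xchg mentioned above.

```c
#include <stdint.h>

enum { NREGS = 4, NMEM = 8 };

/* Hypothetical toy machine: a few registers and a little memory.  */
struct machine
{
  uint64_t reg[NREGS];
  uint64_t mem[NMEM];
};

/* Set reg[dst_lo] = mem[reg[base]] and reg[dst_hi] = reg[src_hi],
   where dst_lo != dst_hi, without destroying an input before it
   has been read.  */
static void
concat_mem_reg (struct machine *m, int dst_lo, int dst_hi,
		int base, int src_hi)
{
  if (dst_lo != src_hi)
    {
      /* Loading first is safe: the load does not overwrite the
	 register input, and the address is consumed by the load.  */
      m->reg[dst_lo] = m->mem[m->reg[base]];
      m->reg[dst_hi] = m->reg[src_hi];
    }
  else if (dst_hi != base)
    {
      /* The load would overwrite the register input, so copy that
	 first; this is safe because the copy leaves the address
	 register alone.  */
      m->reg[dst_hi] = m->reg[src_hi];
      m->reg[dst_lo] = m->mem[m->reg[base]];
    }
  else
    {
      /* Either order clobbers an input: load into the high half
	 (which holds the address), then swap the two halves.  */
      m->reg[dst_hi] = m->mem[m->reg[base]];
      uint64_t tmp = m->reg[dst_lo];
      m->reg[dst_lo] = m->reg[dst_hi];
      m->reg[dst_hi] = tmp;
    }
}
```

The &r <- m, m alternative needs no such ordering logic, since the
early clobber keeps the output registers out of both addresses.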

With the patch, the first testcase at -O2 changes from GCC 12 to
patched trunk as follows:
-       movq    8(%rsi), %rax
-       movq    %rdi, %r8
-       movq    (%rsi), %rdi
+       movq    8(%rsi), %r9
+       movq    (%rsi), %r10
        movl    %edx, %ecx
-       shrdq   %rdi, %rax
-       movq    %rax, (%r8)
+       movq    %r9, %rax
+       shrdq   %r10, %rax
+       movq    %rax, (%rdi)
so the same number of instructions, and the second testcase changes
from GCC 12 to patched trunk at -O2 -m32 as follows:
-       pushl   %edi
-       xorl    %edi, %edi
        pushl   %esi
-       movl    16(%esp), %esi
+       pushl   %ebx
+       movl    16(%esp), %eax
        movl    20(%esp), %ecx
-       movl    (%esi), %eax
-       movl    4(%esi), %esi
-       movl    %eax, %edx
-       movl    $0, %eax
-       orl     %edi, %edx
-       orl     %esi, %eax
-       shrdl   %edx, %eax
        movl    12(%esp), %edx
+       movl    4(%eax), %ebx
+       movl    (%eax), %esi
+       movl    %ebx, %eax
+       shrdl   %esi, %eax
        movl    %eax, (%edx)
+       popl    %ebx
        popl    %esi
-       popl    %edi

BTW, I wonder whether we could add further patterns to catch the case
where one of the operands is constant, and how this interacts with the
stv pass in 32-bit mode.  I think stv runs right after combine, so if
we match these patterns, it would perhaps be nice to handle them in stv
(unless they are handled there already).

2022-12-01  Jakub Jelinek  <jakub@redhat.com>

	PR target/107627
	* config/i386/i386.md (*concat<mode><dwi>3_1, *concat<mode><dwi>3_2):
	For operands which are zero_extend arguments allow memory if
	output operand is a register.
	(*concat<mode><dwi>3_3, *concat<mode><dwi>3_4): Likewise.  If
	both input operands are memory, use early clobber on output operand.
	* config/i386/i386-expand.cc (split_double_concat): Deal with corner
	cases where one input is memory and the other is not and the address
	of the memory input uses a register we'd overwrite before loading
	the memory into a register.

	* gcc.target/i386/pr107627-1.c: New test.
	* gcc.target/i386/pr107627-2.c: New test.
2022-12-01 09:29:23 +01:00

This directory contains the GNU Compiler Collection (GCC).

The GNU Compiler Collection is free software.  See the files whose
names start with COPYING for copying permission.  The manuals, and
some of the runtime libraries, are under different terms; see the
individual source files for details.

The directory INSTALL contains copies of the installation information
as HTML and plain text.  The source of this information is
gcc/doc/install.texi.  The installation information includes details
of what is included in the GCC sources and what files GCC installs.

See the file gcc/doc/gcc.texi (together with other files that it
includes) for usage and porting information.  An online readable
version of the manual is in the files gcc/doc/gcc.info*.

See http://gcc.gnu.org/bugs/ for how to report bugs usefully.

Copyright years on GCC source files may be listed using range
notation, e.g., 1987-2012, indicating that every year in the range,
inclusive, is a copyrightable year that could otherwise be listed
individually.