Go to file
Jakub Jelinek c25c172959 match.pd: Fix up the new simpliofiers using with_possible_nonzero_bits2 [PR117420]
The following testcase shows wrong-code caused by incorrect use
of with_possible_nonzero_bits2.
That matcher is defined as
/* Slightly extended version, do not make it recursive to keep it cheap.  */
(match (with_possible_nonzero_bits2 @0)
 with_possible_nonzero_bits@0)
(match (with_possible_nonzero_bits2 @0)
 (bit_and:c with_possible_nonzero_bits@0 @2))
and because with_possible_nonzero_bits includes the SSA_NAME case with
integral/pointer argument, both forms can actually match when a SSA_NAME
with integral/pointer type has a def stmt which is BIT_AND_EXPR
assignment with say SSA_NAME with integral/pointer type as one of its
operands (or INTEGER_CST, another with_possible_nonzero_bits case).
And in match.pd the latter actually wins if both match and so when using
(with_possible_nonzero_bits2 @0) the @0 will actually be one of the
BIT_AND_EXPR operands if that form is matched.

Now, with_possible_nonzero_bits2 and with_certain_nonzero_bits2 were added
for the
/* X == C (or X & Z == Y | C) is impossible if ~nonzero(X) & C != 0.  */
(for cmp (eq ne)
 (simplify
  (cmp:c (with_possible_nonzero_bits2 @0) (with_certain_nonzero_bits2 @1))
  (if (wi::bit_and_not (wi::to_wide (@1), get_nonzero_bits (@0)) != 0)
   { constant_boolean_node (cmp == NE_EXPR, type); })))
simplifier, but even for that one I think they do not do a good job, they
might actually pessimize stuff rather than optimize, but at least does not
result in wrong-code, because the operands are solely tested with
wi::to_wide or get_nonzero_bits, but not actually used in the
simplification.  The reason why it can pessimize stuff is say if we have
  # RANGE [irange] int ... MASK 0xb VALUE 0x0
  x_1 = ...;
  # RANGE [irange] int ... MASK 0x8 VALUE 0x0
  _2 = x_1 & 0xc;
  _3 = _2 == 2;
then if it used just with_possible_nonzero_bits@0, @0 would have
get_nonzero_bits (@0) 0x8 and (2 & ~8) != 0, so we can fold it into
  _3 = 0;
But as it uses (with_possible_nonzero_bits2 @0), @0 is x_1 rather
than _2 and get_nonzero_bits (@0) is unnecessarily conservative,
0xb rather than 0x8 and (2 & ~0xb) == 0, so we don't optimize.
Now, with_possible_nonzero_bits2 can actually improve stuff as well in that
pattern, if say value ranges aren't fully computed yet or the BIT_AND_EXPR
assignment has been added later and the lhs doesn't have range computed yet,
get_nonzero_range on the BIT_AND_EXPR lhs will be all bits set, while
on the BIT_AND_EXPR operand might actually succeed.

I believe better would be to either modify get_nonzero_bits so that it
special cases the SSA_NAME with BIT_AND_EXPR def_stmt (but one level
deep only like with_possible_nonzero_bits2, no recursion), in that case
return bitwise and of get_nonzero_bits (non-recursive) for the lhs and
both operands, and possibly BIT_AND_EXPR itself e.g. for GENERIC
matching during by returning bitwise and of both operands.
Then with_possible_nonzero_bits2 could be needed for the GENERIC case,
perhaps have the second match #if GENERIC, but changed so that the @N
operand always is the whole thing rather than its operand which is
error-prone.  Or add get_nonzero_bits wrapper with a different name
which would do that.

with_certain_nonzero_bits2 could be changed similarly, these days
we can test known non-zero bits rather than possible non-zero bits on
SSA_NAMEs too, we record both mask and value, so possible nonzero bits
(aka. get_nonzero_bits) is mask () | value (), while known nonzero bits
is value () & ~mask (), with a new function (get_known_nonzero_bits
or get_certain_nonzero_bits etc.) which handles that.

Anyway, the following patch doesn't do what I wrote above just yet,
for that single pattern it is just a missed optimization.
But the with_possible_nonzero_bits2 uses in the 3 new simplifiers are
just completely incorrect, because they don't just use the @0 operand
in get_nonzero_bits (pessimizing stuff if value ranges are fully computed),
but also use it in the replacement, then they act as if the BIT_AND_EXPR
wasn't there at all.
While we could use (with_possible_nonzero_bits2@3 @0) and use
get_nonzero_bits (@0) and use @3 in the replacement, that would still
often be a pessimization, so I've just used with_possible_nonzero_bits@0.

2024-11-22  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/117420
	* match.pd ((X >> C1) << (C1 + C2) -> X << C2,
	(X >> C1) * (C2 << C1) -> X * C2, X / (1 << C) -> X /[ex] (1 << C)):
	Use with_possible_nonzero_bits@0 rather than
	(with_possible_nonzero_bits2 @0).

	* gcc.dg/torture/pr117420.c: New test.
2024-11-22 19:50:22 +01:00
.forgejo
.github
c++tools
config
contrib
fixincludes
gcc match.pd: Fix up the new simpliofiers using with_possible_nonzero_bits2 [PR117420] 2024-11-22 19:50:22 +01:00
gnattools
gotools
include OpenMP: 'interop' construct - add C/C++ parser support, improve Fortran parsing 2024-11-22 15:30:53 +01:00
INSTALL
libada
libatomic
libbacktrace
libcc1
libcody
libcpp
libdecnumber
libffi
libgcc
libgfortran
libgm2
libgo
libgomp Daily bump. 2024-11-22 00:20:01 +00:00
libgrust
libiberty
libitm
libobjc
libphobos
libquadmath
libsanitizer libsanitizer: Move language level from gnu++14 to gnu++17 2024-11-22 08:52:44 -08:00
libssp
libstdc++-v3
libvtv
lto-plugin
maintainer-scripts
zlib
.b4-config
.dir-locals.el
.gitattributes
.gitignore
ABOUT-NLS
ar-lib
ChangeLog
ChangeLog.jit
ChangeLog.tree-ssa
compile
config-ml.in
config.guess
config.rpath
config.sub
configure
configure.ac
COPYING
COPYING3
COPYING3.LIB
COPYING.LIB
COPYING.RUNTIME
depcomp
install-sh
libtool-ldflags
libtool.m4
lt~obsolete.m4
ltgcc.m4
ltmain.sh
ltoptions.m4
ltsugar.m4
ltversion.m4
MAINTAINERS MAINTAINERS: Add myself to write after approval 2024-11-22 13:37:34 +01:00
Makefile.def
Makefile.in
Makefile.tpl
missing
mkdep
mkinstalldirs
move-if-change
multilib.am
README
SECURITY.txt
symlink-tree
test-driver
ylwrap

This directory contains the GNU Compiler Collection (GCC).

The GNU Compiler Collection is free software.  See the files whose
names start with COPYING for copying permission.  The manuals, and
some of the runtime libraries, are under different terms; see the
individual source files for details.

The directory INSTALL contains copies of the installation information
as HTML and plain text.  The source of this information is
gcc/doc/install.texi.  The installation information includes details
of what is included in the GCC sources and what files GCC installs.

See the file gcc/doc/gcc.texi (together with other files that it
includes) for usage and porting information.  An online readable
version of the manual is in the files gcc/doc/gcc.info*.

See http://gcc.gnu.org/bugs/ for how to report bugs usefully.

Copyright years on GCC source files may be listed using range
notation, e.g., 1987-2012, indicating that every year in the range,
inclusive, is a copyrightable year that could otherwise be listed
individually.