In take_address_of, we may refrain from completing a decl_address
INSERT if gsi is NULL, so don't even ask for an INSERT in this case.
for gcc/ChangeLog
* tree-parloops.cc (take_address_of): Skip INSERT if !gsi.
Check, after inserting entries, that they don't look empty.
for gcc/ChangeLog
* hash-map.h (put, get_or_insert): Check that entry does not
look empty after insertion.
Check, after adding a key to a hash set, that the entry does not look
empty.
for gcc/ChangeLog
* hash-set.h (add): Check that the inserted entry does not
look empty.
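The hash-map and hash-set checks above can be pictured with a toy
pointer-keyed set in which a null pointer marks an empty slot. This is
only a sketch under stated assumptions: PtrSet and its members are
hypothetical names, linear probing is used for brevity, and none of it
is GCC's hash_set implementation. It shows why a key that "looks empty"
cannot be recorded, and where a post-insertion check would sit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Toy pointer-keyed set (hypothetical; not GCC's hash_set).
// nullptr marks an empty slot, so a null key would "look empty".
class PtrSet {
public:
  explicit PtrSet(std::size_t capacity) : slots_(capacity, nullptr) {}

  // Adds key; returns true if it was already present.
  bool add(const void *key) {
    // A key that looks empty cannot be recorded, so reject it up front.
    assert(key != nullptr && "key must not look like an empty slot");
    std::size_t i = find(key);
    if (i == slots_.size())
      throw std::length_error("PtrSet is full");
    if (slots_[i] == key)
      return true;
    slots_[i] = key;
    // Post-insertion check, in the spirit of the patches above.
    assert(slots_[i] != nullptr && "inserted entry looks empty");
    return false;
  }

  bool contains(const void *key) const {
    if (key == nullptr)
      return false;  // a null key can never have been stored
    std::size_t i = find(key);
    return i != slots_.size() && slots_[i] == key;
  }

private:
  // Linear probing: index of the slot holding key, else of the first
  // empty slot on the probe path, else slots_.size() if the table is full.
  std::size_t find(const void *key) const {
    if (slots_.empty())
      return slots_.size();
    std::size_t i = reinterpret_cast<std::uintptr_t>(key) % slots_.size();
    for (std::size_t n = 0; n < slots_.size(); ++n) {
      if (slots_[i] == key || slots_[i] == nullptr)
        return i;
      i = (i + 1) % slots_.size();
    }
    return slots_.size();
  }

  std::vector<const void *> slots_;
};
```

The same reasoning applies to the later entries that stop mapping or
caching NULL decls, operands and vnodes: such a key is indistinguishable
from an unused slot.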
When decl is NULL, don't record its mapping in the
decl_to_instance_map.
for gcc/ada/ChangeLog
* gcc-interface/trans.cc (Sloc_to_locus): Don't map NULL decl.
When adding a catch-all partition, we map NULL to it. That mapping is
ineffective and unnecessary. Drop it.
for gcc/lto/ChangeLog
* lto-partition.cc (lto_1_to_1_map): Drop NULL partition
mapping.
cxx_eval_call_expression requests an INSERT even in cases when it
would later decide not to insert. This could break double-hashing
chains. Arrange for it to use NO_INSERT when the insertion would not
be completed.
for gcc/cp/ChangeLog
* constexpr.cc (cxx_eval_call_expression): Do not request an
INSERT that would not be completed.
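One way an abandoned INSERT can break a probe chain is sketched below.
The table is a toy: IntTable, kEmpty, kDeleted and insert_if_wanted are
hypothetical names, keys are positive ints, and linear probing stands in
for double hashing. The assumption being illustrated is that an INSERT
request may hand back a previously deleted slot after marking it empty;
if the caller then never fills it, lookups for entries further along
the chain stop early.

```cpp
#include <cstddef>
#include <vector>

enum InsertOption { NO_INSERT, INSERT };

constexpr int kEmpty = 0;     // slot never used (hypothetical encoding)
constexpr int kDeleted = -1;  // tombstone left by remove()

class IntTable {
public:
  explicit IntTable(std::size_t n) : slots_(n, kEmpty) {}

  // Returns the slot holding key. With INSERT, an absent key yields a
  // slot that the caller MUST fill; with NO_INSERT it yields nullptr.
  int *find_slot(int key, InsertOption opt) {
    if (slots_.empty())
      return nullptr;
    int *first_deleted = nullptr;
    std::size_t i = static_cast<std::size_t>(key) % slots_.size();
    for (std::size_t n = 0; n < slots_.size(); ++n) {
      int &s = slots_[i];
      if (s == key)
        return &s;
      if (s == kEmpty) {
        if (opt == NO_INSERT)
          return nullptr;
        if (first_deleted) {
          *first_deleted = kEmpty;  // tombstone is recycled for the caller
          return first_deleted;
        }
        return &s;
      }
      if (s == kDeleted && !first_deleted)
        first_deleted = &s;
      i = (i + 1) % slots_.size();
    }
    if (opt == INSERT && first_deleted) {
      *first_deleted = kEmpty;
      return first_deleted;
    }
    return nullptr;
  }

  void remove(int key) {
    if (int *s = find_slot(key, NO_INSERT))
      *s = kDeleted;
  }

  bool contains(int key) { return find_slot(key, NO_INSERT) != nullptr; }

private:
  std::vector<int> slots_;
};

// Safe pattern: probe with NO_INSERT, and request INSERT only once it
// is certain that the entry will be stored.
inline bool insert_if_wanted(IntTable &t, int key, bool want) {
  if (t.contains(key) || !want)
    return false;  // table left untouched
  int *slot = t.find_slot(key, INSERT);
  if (!slot)
    return false;
  *slot = key;  // complete the insertion immediately
  return true;
}
```

With keys 1, 9 and 17 colliding in an 8-slot table and 9 removed, a
find_slot (25, INSERT) that is never completed turns the tombstone into
an empty slot, after which 17 is no longer found. Going through
insert_if_wanted with want == false leaves the chain intact.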
Insertion of a tm_restart_node in tm_restart failed to record the
newly-allocated node in the hash table.
for gcc/ChangeLog
* trans-mem.cc (split_bb_make_tm_edge): Record new node in
tm_restart.
lookup_expr_in_table is not used for insertions, but it mistakenly
used INSERT rather than NO_INSERT.
for gcc/ChangeLog
* postreload-gcse.cc (lookup_expr_in_table): Use NO_INSERT.
If a result doesn't have a default def, don't attempt to remap it.
for gcc/ChangeLog
* tree-inline.cc (declare_return_variable): Don't remap NULL
default def of result.
When a TREE_OPERAND is NULL, do not cache it.
for gcc/ChangeLog
* tree-ssa-loop-niter.cc (expand_simple_operands): Refrain
from caching NULL TREE_OPERANDs.
Use NO_INSERT to test whether inserting should be attempted.
for gcc/cp/ChangeLog
* constraint.cc (normalize_concept_check): Use NO_INSERT for
pre-insertion check.
Avoid adding NULL vnodes to referenced tables.
for gcc/ChangeLog
* varpool.cc (symbol_table::remove_unreferenced_decls): Do not
add NULL vnodes to referenced table.
Avoid hash table lookups between requesting an insert and storing the
inserted value in avail_exprs_stack. Lookups before the insert is
completed could fail to find double-hashed elements.
for gcc/ChangeLog
* tree-ssa-scopedtables.cc
(avail_exprs_stack::lookup_avail_expr): Finish hash table
insertion before further lookups.
This is probably not a problem in practice, but add the check for safety.
gcc/ChangeLog:
* config/xtensa/xtensa.cc (xtensa_expand_prologue): Check DF
availability before using DF_* macros.
The middle-end doesn't have a preferred canonical form for expressing
zero-extension, sometimes using an AND, sometimes pairs of SHIFTs,
and sometimes using zero_extend. Pending changes to RTL simplification
will/may alter some of these representations, so a few additional
patterns are required to recognize these alternate representations
and avoid any testsuite regressions.
As an example, *popcountsi2_zext is currently represented as:
[(set (match_operand:DI 0 "register_operand" "=r")
(and:DI
(subreg:DI
(popcount:SI
(match_operand:SI 1 "nonimmediate_operand" "rm")) 0)
(const_int 63)))
(clobber (reg:CC FLAGS_REG))]
this patch adds an alternate/equivalent pattern that matches:
[(set (match_operand:DI 0 "register_operand" "=r")
(zero_extend:DI
(popcount:SI (match_operand:SI 1 "nonimmediate_operand" "rm"))))
(clobber (reg:CC FLAGS_REG))]
Another example is *popcounthi2 which is currently represented as:
[(set (match_operand:SI 0 "register_operand")
(popcount:SI
(zero_extend:SI (match_operand:HI 1 "nonimmediate_operand"))))
(clobber (reg:CC FLAGS_REG))]
this patch adds an alternate/equivalent pattern that matches:
[(set (match_operand:SI 0 "register_operand")
(zero_extend:SI
(popcount:HI (match_operand:HI 1 "nonimmediate_operand"))))
(clobber (reg:CC FLAGS_REG))]
The contents of the machine description definitions remain the same;
it's just that the expected RTL is slightly different but equivalent.
Providing both forms makes the backend more robust to middle-end
changes [and possibly catches some missed optimizations].
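The equivalences relied on above can be checked with ordinary integer
arithmetic. The following is a minimal sketch, not RTL: all function
names are hypothetical, and the "upper_garbage" parameter merely models
the unspecified upper bits of a paradoxical subreg. It shows the three
spellings of zero-extension, why masking a popcount with 63 is a
zero-extension (a 32-bit popcount is at most 32 and fits in 6 bits), and
that popcount commutes with zero-extension of its operand.

```cpp
#include <cstdint>

// Three equivalent ways to zero-extend the low 32 bits of a 64-bit value.
inline std::uint64_t zext_and(std::uint64_t x) {
  return x & UINT64_C(0xFFFFFFFF);
}
inline std::uint64_t zext_shifts(std::uint64_t x) { return (x << 32) >> 32; }
inline std::uint64_t zext_cast(std::uint64_t x) {
  return static_cast<std::uint32_t>(x);
}

// Portable population count (clears the lowest set bit each iteration).
inline unsigned popcount32(std::uint32_t v) {
  unsigned n = 0;
  while (v) {
    v &= v - 1;
    ++n;
  }
  return n;
}

// A 16-bit value zero-extended to 32 bits has the same population count.
inline unsigned popcount16(std::uint16_t v) {
  return popcount32(static_cast<std::uint32_t>(v));
}

// AND form: the count sits in the low bits of a wider value whose upper
// bits are unspecified; masking with 63 keeps exactly the count.
inline std::uint64_t popcount_zext_via_and(std::uint32_t v,
                                           std::uint64_t upper_garbage) {
  std::uint64_t wide = (upper_garbage << 32) | popcount32(v);
  return wide & 63;
}

// ZERO_EXTEND form: widen the count directly.
inline std::uint64_t popcount_zext_direct(std::uint32_t v) {
  return static_cast<std::uint64_t>(popcount32(v));
}
```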
2022-12-28 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* config/i386/i386.md (*clzsi2_lzcnt_zext_2): define_insn_and_split
to match ZERO_EXTEND form of *clzsi2_lzcnt_zext.
(*clzsi2_lzcnt_zext_2_falsedep): Likewise, new define_insn to match
ZERO_EXTEND form of *clzsi2_lzcnt_zext_falsedep.
(*bmi2_bzhi_zero_extendsidi_5): Likewise, new define_insn to match
ZERO_EXTEND form of *bmi2_bzhi_zero_extendsidi.
(*popcountsi2_zext_2): Likewise, new define_insn_and_split to match
ZERO_EXTEND form of *popcountsi2_zext.
(*popcountsi2_zext_2_falsedep): Likewise, new define_insn to match
ZERO_EXTEND form of *popcountsi2_zext_falsedep.
(*popcounthi2_2): Likewise, new define_insn_and_split to match
ZERO_EXTEND form of *popcounthi2.
(define_peephole2): ZERO_EXTEND variant of HImode popcount&1 using
parity flag peephole2.
This patch is a one line change, to call ix86_expand_clear instead of
emit_move_insn with const0_rtx in ix86_split_ashl, allowing the backend
to use an xor instruction to clear a register if appropriate.
The effect is demonstrated with the following function.
__int128 foo(__int128 x, unsigned long long b) {
return ((__int128)b << 72) + x;
}
previously with -O2, GCC would generate
foo: movl $0, %eax
salq $8, %rdx
addq %rdi, %rax
adcq %rsi, %rdx
ret
with this patch, it now generates
foo: xorl %eax, %eax
salq $8, %rdx
addq %rdi, %rax
adcq %rsi, %rdx
ret
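The reason a register has to be cleared at all can be seen by writing
the shift out on two 64-bit halves. This is an illustrative sketch only:
U128, shl72, add128 and foo_split are hypothetical names, and unsigned
halves are used to keep the arithmetic well defined. For a shift count
of 72 the low half of the result is always zero, which is the value the
backend can now materialize with xor instead of a move of $0.

```cpp
#include <cstdint>

// A 128-bit value as two 64-bit halves (hypothetical helper type).
struct U128 {
  std::uint64_t lo;
  std::uint64_t hi;
};

// b << 72 on two halves: the count is >= 64, so the low half is zero
// and the high half is b << (72 - 64).
inline U128 shl72(std::uint64_t b) {
  U128 r;
  r.lo = 0;       // the cleared register
  r.hi = b << 8;  // corresponds to "salq $8"
  return r;
}

// 128-bit addition with carry from the low half (add followed by adc).
inline U128 add128(U128 a, U128 b) {
  U128 r;
  r.lo = a.lo + b.lo;
  r.hi = a.hi + b.hi + (r.lo < a.lo ? 1u : 0u);
  return r;
}

// Same computation as foo above, expressed on halves.
inline U128 foo_split(U128 x, std::uint64_t b) {
  return add128(shl72(b), x);
}
```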
2022-12-28 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* config/i386/i386-expand.cc (ix86_split_ashl): Call
ix86_expand_clear to generate an xor instruction.
gcc/testsuite/ChangeLog
* gcc.target/i386/ashlti3-1.c: New test case.
This patch restructures the loop over the GP registers
which saves/restores them as part of the prologue/epilogue.
No functional change is intended by this patch, but it
offers the possibility to use load-pair/store-pair instructions.
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_next_saved_reg): New function.
(riscv_is_eh_return_data_register): New function.
(riscv_for_each_saved_reg): Restructure loop.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
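The shape of the restructured loop might look roughly like the sketch
below. It is a simplification under stated assumptions: next_saved_reg
and saved_regs are hypothetical stand-ins, a plain 32-bit mask replaces
the frame information, and the real riscv_next_saved_reg has a different
interface. The point is only that a loop driven by "give me the next
saved register" can look ahead to the following register, which is what
a load-pair/store-pair scheme would need.

```cpp
#include <cstdint>
#include <vector>

constexpr unsigned kNumRegs = 32;
constexpr unsigned kInvalidReg = kNumRegs;  // sentinel: no more registers

// First register number >= start whose bit is set in mask, or
// kInvalidReg if there is none.
inline unsigned next_saved_reg(std::uint32_t mask, unsigned start) {
  for (unsigned r = start; r < kNumRegs; ++r)
    if (mask & (std::uint32_t{1} << r))
      return r;
  return kInvalidReg;
}

// Visit every saved register in ascending order. The loop body could
// call next_saved_reg (mask, r + 1) to peek at a partner register.
inline std::vector<unsigned> saved_regs(std::uint32_t mask) {
  std::vector<unsigned> out;
  for (unsigned r = next_saved_reg(mask, 0); r != kInvalidReg;
       r = next_saved_reg(mask, r + 1))
    out.push_back(r);
  return out;
}
```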
The comment above the enumeration of existing attributes got out of
order and a few entries were forgotten.
This patch synchronizes the comments according to the list.
This commit does not include any functional change.
gcc/ChangeLog:
* config/riscv/riscv.md: Sync comments with code.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
Newer binutils uses all caps, where it was all lower case
previously. Pushed as obvious.
This should resolve:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100383
gcc/
* configure.ac: Use grep -i for case insensitive test.
* configure: Regenerate.
Signed-off-by: Jonathan Yong <10walls@gmail.com>
This improves the readability of RTL dumps. No functional changes.
gcc/
* config/xtensa/xtensa.md (unspec): Extract UNSPEC_* constants
into this enum.
(unspecv): Extract UNSPECV_* constants into this enum.
gcc/ChangeLog:
* config/xtensa/xtensa.cc (xtensa_expand_prologue): Exit the
inspection loops as soon as the need for the stack pointer is
established.
Like d0bbecb1c418b680505faa998fe420f0fd4bbfc1, we add a wrapper to
prevent it from pulling in stdint.h from the standard C library.
gcc/testsuite:
* gcc.target/riscv/rvv/vsetvl/riscv_vector.h: New.
This patch fixes an issue where a non-existent block of the CFG is visited.
Since block indices in GCC's CFG are not always contiguous, iterating over
the index range can visit a gap, i.e. a block that does not exist in the
current CFG.
This patch avoids visiting non-existent blocks in the CFG.
I noticed the issue in my internal regression run of the current testsuite
after changing the x86 server machine. This patch fixes it:
17:27:15 job(build_and_test_rv32): Increased FAIL List:
17:27:15 job(build_and_test_rv32): FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-46.c
-O2 -flto -fno-use-linker-plugin -flto-partition=none (internal compiler error: Segmentation fault)
gcc/ChangeLog:
* config/riscv/riscv-vsetvl.cc
(pass_vsetvl::compute_global_backward_infos): Change to visit CFG.
(pass_vsetvl::prune_expressions): Ditto.
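The failure mode can be sketched with a block table that has holes. The
names below (Block, BlockTable, visit_existing_blocks) are hypothetical
and the table is a plain vector rather than GCC's CFG; it is only meant
to show why iterating over the raw index range and dereferencing each
entry is unsafe, and what skipping the gaps looks like.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a basic block.
struct Block {
  int index;
  int visits;
};

// Table indexed by block number; removed blocks leave null gaps.
using BlockTable = std::vector<Block *>;

// Visits only the blocks that exist and returns how many were visited.
// Dereferencing table[i] for every i without the null check would
// crash on a gap.
inline std::size_t visit_existing_blocks(const BlockTable &table) {
  std::size_t visited = 0;
  for (Block *bb : table) {
    if (bb == nullptr)
      continue;  // gap in the index space: no such block
    ++bb->visits;
    ++visited;
  }
  return visited;
}
```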
PR106680 shows that -m32 -mpowerpc64 behaves differently from
-mpowerpc64 -m32; this is caused by the way we
handle option powerpc64 in rs6000_handle_option.
Segher pointed out that this difference should be treated as
a bug and that we should ensure option powerpc64 is
independent of -m32/-m64. So this patch removes the
handling in rs6000_handle_option and adds the necessary
support in rs6000_option_override_internal instead.
With this patch, if users specify -m{no-,}powerpc64, the
specified value is honoured; otherwise, for 64-bit it
always enables OPTION_MASK_POWERPC64, while for 32-bit
with TARGET_POWERPC64 and OS_MISSING_POWERPC64, it disables
OPTION_MASK_POWERPC64.
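The resulting decision can be summarized as a pure function of the final
option state, which is what makes it independent of the order of
-m32/-m64 and -mpowerpc64 on the command line. This is a sketch of the
rules as stated in this message, not the code in
rs6000_option_override_internal; resolve_powerpc64 and its parameters
are hypothetical names.

```cpp
// explicitly_set:       user passed -mpowerpc64 or -mno-powerpc64
// value:                current state of the powerpc64 flag (explicit,
//                       from TARGET_DEFAULT, or from the cpu mask)
// is_64bit:             compiling for 64-bit
// os_missing_powerpc64: stand-in for OS_MISSING_POWERPC64
inline bool resolve_powerpc64(bool explicitly_set, bool value, bool is_64bit,
                              bool os_missing_powerpc64) {
  if (explicitly_set)
    return value;  // the specified value is honoured
  if (is_64bit)
    return true;   // 64-bit always enables it
  if (value && os_missing_powerpc64)
    return false;  // 32-bit, implicitly enabled, OS lacks support
  return value;
}
```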
BTW, following Segher's suggestion, I tried warning
when OPTION_MASK_POWERPC64 is set for OS_MISSING_POWERPC64.
If we warn when powerpc64 is specified explicitly,
some test cases that use -m32 -mpowerpc64 on ppc64-linux
would need updates, and the artificial run
with "--target_board=unix'{-m32/-mpowerpc64}'" would produce
noisy warnings on ppc64-linux. If we warn when
it's specified implicitly, the flag can simply be initialized by
TARGET_DEFAULT (like -m32 on ppc64-linux) or set from the
given cpu mask, so we would have to special-case those and not warn.
Following Segher's latest comment, I decided not to warn and
to keep the behaviour consistent with before.
Bootstrapped and regress-tested on:
- powerpc64-linux-gnu P7 and P8 {-m64,-m32}
- powerpc64le-linux-gnu P9 and P10
- powerpc-ibm-aix7.2.0.0 {-maix64,-maix32}
- powerpc-darwin9 (with Iain's help)
PR target/106680
gcc/ChangeLog:
* common/config/rs6000/rs6000-common.cc (rs6000_handle_option): Remove
the adjustment for option powerpc64 in -m64 handling, and remove the
whole -m32 handling.
* config/rs6000/rs6000.cc (rs6000_option_override_internal): When no
explicit powerpc64 option is provided, enable it for -m64. For 32 bit
and OS_MISSING_POWERPC64, disable powerpc64 if it's enabled but not
specified explicitly.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/pr106680-1.c: New test.
* gcc.target/powerpc/pr106680-2.c: New test.
* gcc.target/powerpc/pr106680-3.c: New test.
* gcc.target/powerpc/pr106680-4.c: New test.
2022-12-27 Kewen Lin <linkw@linux.ibm.com>
Iain Sandoe <iain@sandoe.co.uk>
if (mdaz-ftz)
link crtfastmath.o
else if ((Ofast || ffast-math || funsafe-math-optimizations)
&& !shared && !mno-daz-ftz)
link crtfastmath.o
else
Don't link crtfastmath.o
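The rule above has three states for the new option (given, negated, not
given), which is easy to lose in a spec string. A minimal sketch of the
same decision as a function follows; DazFtz, LinkFlags and
links_crtfastmath are hypothetical names, and fast_math stands for any
of -Ofast, -ffast-math or -funsafe-math-optimizations.

```cpp
// State of the -mdaz-ftz / -mno-daz-ftz option.
enum class DazFtz { Unspecified, On, Off };

struct LinkFlags {
  DazFtz daz_ftz = DazFtz::Unspecified;
  bool fast_math = false;  // -Ofast, -ffast-math or -funsafe-math-optimizations
  bool shared = false;     // -shared
};

// Mirrors the pseudocode: -mdaz-ftz always links crtfastmath.o;
// otherwise the fast-math options do, unless -shared or -mno-daz-ftz
// is given.
inline bool links_crtfastmath(const LinkFlags &f) {
  if (f.daz_ftz == DazFtz::On)
    return true;
  if (f.fast_math && !f.shared && f.daz_ftz != DazFtz::Off)
    return true;
  return false;
}
```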
gcc/ChangeLog:
PR target/55522
PR target/36821
* config/i386/gnu-user-common.h (GNU_USER_TARGET_MATHFILE_SPEC):
Link crtfastmath.o whenever -mdaz-ftz is specified. Don't link
crtfastmath.o when -shared or -mno-daz-ftz is specified.
* config/i386/i386.opt (mdaz-ftz): New option.
* doc/invoke.texi (x86 options): Document -mdaz-ftz.
This patch tweaks the x86 backend to use the movss and movsd instructions
to perform some vector permutations on integer vectors (V4SI and V2DI) in
the same way they are used for floating point vectors (V4SF and V2DF).
As a motivating example, consider:
typedef unsigned int v4si __attribute__((vector_size(16)));
typedef float v4sf __attribute__((vector_size(16)));
v4si foo(v4si x,v4si y) { return (v4si){y[0],x[1],x[2],x[3]}; }
v4sf bar(v4sf x,v4sf y) { return (v4sf){y[0],x[1],x[2],x[3]}; }
which is currently compiled with -O2 to:
foo: movdqa %xmm0, %xmm2
shufps $80, %xmm0, %xmm1
movdqa %xmm1, %xmm0
shufps $232, %xmm2, %xmm0
ret
bar: movss %xmm1, %xmm0
ret
with this patch both functions compile to the same form.
Likewise for the V2DI case:
typedef unsigned long v2di __attribute__((vector_size(16)));
typedef double v2df __attribute__((vector_size(16)));
v2di foo(v2di x,v2di y) { return (v2di){y[0],x[1]}; }
v2df bar(v2df x,v2df y) { return (v2df){y[0],x[1]}; }
which currently generates:
foo: shufpd $2, %xmm0, %xmm1
movdqa %xmm1, %xmm0
ret
bar: movsd %xmm1, %xmm0
ret
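The permutation in both examples is "take element 0 from y and the rest
from x", which only moves bits and never interprets them. A small sketch
of that operation, written generically so the same code serves integer
and floating point lanes, is given below; merge_low is a hypothetical
name and std::array stands in for the vector types.

```cpp
#include <array>
#include <cstddef>

// Returns {y[0], x[1], ..., x[N-1]}: the low element comes from y and
// all other elements from x, regardless of the element type.
template <typename T, std::size_t N>
std::array<T, N> merge_low(std::array<T, N> x, const std::array<T, N> &y) {
  static_assert(N > 0, "need at least one element");
  x[0] = y[0];
  return x;
}
```

Since nothing in the operation depends on T, an instruction that
implements it for V4SF or V2DF is equally valid for V4SI or V2DI.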
2022-12-25 Roger Sayle <roger@nextmovesoftware.com>
Uroš Bizjak <ubizjak@gmail.com>
gcc/ChangeLog
* config/i386/i386-builtin.def (__builtin_ia32_movss): Update
CODE_FOR_sse_movss to CODE_FOR_sse_movss_v4sf.
(__builtin_ia32_movsd): Likewise, update CODE_FOR_sse2_movsd to
CODE_FOR_sse2_movsd_v2df.
* config/i386/i386-expand.cc (split_convert_uns_si_sse): Update
gen_sse_movss call to gen_sse_movss_v4sf, and gen_sse2_movsd call
to gen_sse2_movsd_v2df.
(expand_vec_perm_movs): Also allow V4SImode with TARGET_SSE and
V2DImode with TARGET_SSE2.
* config/i386/sse.md
(avx512fp16_fcmaddcsh_v8hf_mask3<round_expand_name>): Update
gen_sse_movss call to gen_sse_movss_v4sf.
(avx512fp16_fmaddcsh_v8hf_mask3<round_expand_name>): Likewise.
(sse_movss_<mode>): Renamed from sse_movss using VI4F_128 mode
iterator to handle both V4SF and V4SI.
(sse2_movsd_<mode>): Likewise, renamed from sse2_movsd using
VI8F_128 mode iterator to handle both V2DF and V2DI.
gcc/testsuite/ChangeLog
* gcc.target/i386/sse-movss-4.c: New test case.
* gcc.target/i386/sse2-movsd-3.c: New test case.
Broken by 9149a5b7e0a66b7b94d5b7db3194a975d18dea2f.
CC_NONE is defined by wingdi.h and conflicts with GCC.
Committed as obvious.
libgcc/:
* config/i386/gthr-win32.h: Undef CC_NONE.
Signed-off-by: Jonathan Yong <10walls@gmail.com>
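The kind of clash being fixed can be reproduced in a few lines. In this
sketch the #define is a stand-in for what wingdi.h provides (its value
here is an assumption), and CondCode is a hypothetical enumeration that
happens to use the same identifier. Without the #undef, the enumerator
would be macro-expanded and the declaration would not compile.

```cpp
// Stand-in for the macro a system header defines (value is hypothetical).
#define CC_NONE 0

// The fix: drop the macro before any code that uses the same name.
#ifdef CC_NONE
#undef CC_NONE
#endif

// Hypothetical user of the identifier; compiles only because the macro
// is gone.
enum CondCode { CC_NONE, CC_SET };
```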
My recently added testcases gcc.target/i386/pr107548-[12].c need to be
tweaked slightly for -march=cascadelake. Committed as obvious.
2022-12-24 Roger Sayle <roger@nextmovesoftware.com>
gcc/testsuite/ChangeLog
PR target/107548
* gcc.target/i386/pr107548-1.c: Match both vmovd and movd.
* gcc.target/i386/pr107548-2.c: Match both vpaddq and paddq.
Several systems/distributions do not provide the raw tzdata.zi file in
their zoneinfo installation. However, an alternate installation path
might be provided at configure time, so we should check for the
tzdata.zi file first and only then fall back to system-specific files
like +VERSION on those systems.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
libstdc++-v3/ChangeLog:
* src/c++20/tzdb.cc (remote_version): Look for the tzdata.zi
file before falling back to system-specific ones on Darwin and
BSD.
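The lookup order described above amounts to "first candidate that exists
wins". A rough sketch follows, with the existence test passed in as a
callback so that it can be exercised without touching the file system.
The helper names are hypothetical, and placing +VERSION directly under
the zoneinfo directory is an assumption made for illustration; this is
not the code in tzdb.cc.

```cpp
#include <functional>
#include <string>
#include <vector>

// Returns the first candidate for which exists() is true, or an empty
// string if none is available.
inline std::string first_available(
    const std::vector<std::string> &candidates,
    const std::function<bool(const std::string &)> &exists) {
  for (const std::string &path : candidates)
    if (exists(path))
      return path;
  return std::string();
}

// Candidate order: tzdata.zi first, then the system-specific file.
inline std::vector<std::string> version_file_candidates(
    const std::string &zoneinfo_dir) {
  std::vector<std::string> c;
  c.push_back(zoneinfo_dir + "/tzdata.zi");
  c.push_back(zoneinfo_dir + "/+VERSION");
  return c;
}
```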
In leap_seconds.cc, we are testing to see if the function that
overrides the default zoneinfo directory has been called. That
is implemented with a static boolean that needs to be initialized
to false.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
libstdc++-v3/ChangeLog:
* testsuite/std/time/tzdb/leap_seconds.cc: Initialize the
override_used test var to false.
On Darwin, GCC now uses a libgcc_s.1.1 for builtins and forwards the system
unwinder. We still build a backwards compatibility libgcc_s.1.dylib, but
it is not needed by GCC and can cause incorrect operation when
DYLD_LIBRARY_PATH is in use.
Since we do not need or use it during the build, the solution is to skip the
installation into the $build/gcc directory.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
libgcc/ChangeLog:
* config/t-slibgcc-darwin (install-darwin-libgcc-stubs): Skip the
install of libgcc_s.1.dylib when the installation is into the build
gcc directory.