The visitor for lowering IMPORTED_DECLs did not have an override for
dealing with importing OverloadSet symbols. This has now been
implemented in the code generator.
PR d/108050
gcc/d/ChangeLog:
* decl.cc (DeclVisitor::visit (Import *)): Handle build_import_decl
returning a TREE_LIST.
* imports.cc (ImportVisitor::visit (OverloadSet *)): New override.
gcc/testsuite/ChangeLog:
* gdc.dg/imports/pr108050/mod1.d: New.
* gdc.dg/imports/pr108050/mod2.d: New.
* gdc.dg/imports/pr108050/package.d: New.
* gdc.dg/pr108050.d: New test.
In order to support CR on a line, we need to open files
with newline='\n' as our line endings supposed to be of UNIX style.
contrib/ChangeLog:
* check_GNU_style.py: Use newline=\n.
* check_GNU_style_lib.py: Simplify.
* gcc-changelog/git_commit.py: Fix issues seen
Rust patchset.
* gcc-changelog/git_email.py: Use newline argument.
* gcc-changelog/test_email.py: New test.
* gcc-changelog/test_patches.txt: New test.
* mklog.py: Use newline argument.
As well as removing unnecessary casts, this results in less temporaries
being generated during the initial gimple lowering pass. Otherwise the
code generated is identical to the former intrinsic expansion.
gcc/d/ChangeLog:
* intrinsics.cc (expand_intrinsic_bsf): Fix comment.
(expand_intrinsic_bsr): Use BIT_XOR_EXPR instead of MINUS_EXPR.
The PR notices we fail to simplify
a_4 = &x_3(D)->data;
b_5 = x_3(D) + 16;
_1 = b_5 - a_4;
together with the enabler handling ADDR_EXPR leafs in separate
stmts in match.pd the suggested patterns work.
PR tree-optimization/89317
* match.pd ((p + b) - &p->c -> b - offsetof(c)): New patterns.
* gcc.dg/tree-ssa/pr89317.c: New testcase.
The following allows to match ADDR_EXPR for both the invariant
&a.b case as well as the &p->d case in a separate definition
transparently. This also allows to remove the hack we employ
for CONSTRUCTOR which we handle for example with
(match vec_same_elem_p
CONSTRUCTOR@0
(if (TREE_CODE (@0) == SSA_NAME
&& uniform_vector_p (gimple_assign_rhs1 (SSA_NAME_DEF_STMT (@0))))))
Note CONSTUCTORs always appear as separate definition in GIMPLE,
but I continue to play safe and ADDR_EXPRs are now matched in
both places where previously ADDR_EXPR@0 would have missed
the &p->x case.
This is a prerequesite for the PR89317 fix.
* genmatch.cc (dt_node::gen_kids): Handle ADDR_EXPR in both
the GENERIC and GIMPLE op position.
(dt_simplify::gen): Capture both GENERIC and GIMPLE op
position for ADDR_EXPR and CONSTRUCTOR.
* match.pd: Simplify CONSTRUCTOR leaf handling.
* gcc.dg/tree-ssa/forwprop-3.c: Adjust.
* g++.dg/tree-ssa/pr31146-2.C: Likewise.
The following avoids CSE of &ps->wp to &ps->wp.hwnd confusing
-Wstringopt-overflow by making sure to produce addresses to the
biggest container from vectorization. For this I introduce
strip_zero_offset_components which turns &ps->wp.hwnd into
&(*ps) and use that to base the vector data references on.
That will also work for addresses with variable components,
alternatively emitting pointer arithmetic via calling
get_inner_reference and gimplifying that would be possible
but likely more intrusive.
This is by no means a complete fix for all of those issues
(avoiding ADDR_EXPRs in favor of pointer arithmetic might be).
Other passes will have similar issues.
In theory that might now cause false negatives.
PR tree-optimization/106904
* tree.h (strip_zero_offset_components): Declare.
* tree.cc (strip_zero_offset_components): Define.
* tree-vect-data-refs.cc (vect_create_addr_base_for_vector_ref):
Strip zero offset components before building the address.
* gcc.dg/Wstringop-overflow-pr106904.c: New testcase.
Seemingly, 's' (in VI that's the 's'ubstitute command) appeared verbatim in
a gfc_error message when to doing the '...' to %<...%> replacements in commit
r13-4590-g84f6f8a2a97f88be01e223c9c9dbab801a4f501f
gcc/fortran/
* openmp.cc (gfc_match_omp_context_selector_specification):
Remove spurious 's' in an error message.
gcc/fortran/ChangeLog:
PR fortran/106911
* simplify.cc (gfc_simplify_ishftc): If the SIZE argument is known
to be outside the allowed range, terminate simplification.
gcc/testsuite/ChangeLog:
PR fortran/106911
* gfortran.dg/pr106911.f90: New test.
The following testcase ICEs, because the latch bb ends with
asm goto which has both fallthrough to the header and one or more labels
in the header too. In that case there is just a single edge out of the
latch block, but still the asm goto is stmt_ends_bb_p statement, yet
ivopts decides to emit an IV bump at the IP_END position and inserts
it into the same bb as the asm goto after it, which then fails verification
(control flow in the middle of bb).
The following patch fixes it by splitting the latch -> header edge in that
case and inserting into the newly created bb, where split_edge ->
redirect_edge_and_branch is able to deal with this case correctly.
2022-12-10 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/107997
* tree-ssa-loop-ivopts.cc: Include cfganal.h.
(create_new_iv) <case IP_END>: If ip_end_pos bb is non-empty and ends
with a stmt which ends bb, instead of adding iv update after it split
the latch edge and insert iterator into the new latch bb.
* gcc.c-torture/compile/pr107997.c: New test.
This commit enabled reverse offload for nvptx such that gomp_target_rev
actually gets called. And it fills the latter function to do all of
the following: finding the host function to the device func ptr and
copying the arguments to the host, processing the mapping/firstprivate,
calling the host function, copying back the data and freeing as needed.
The data handling is made easier by assuming that all host variables
either existed before (and are in the mapping) or that those are
devices variables not yet available on the host. Thus, the reverse
mapping can do without refcounts etc. Note that the spec disallows
inside a target region device-affecting constructs other than target
plus ancestor device-modifier and it also limits the clauses permitted
on this construct.
For the function addresses, an additional splay tree is used; for
the lookup of mapped variables, the existing splay-tree is used.
Unfortunately, its data structure requires a full walk of the tree;
Additionally, the just mapped variables are recorded in a separate
data structure an extra lookup. While the lookup is slow, assuming
that only few variables get mapped in each reverse offload construct
and that reverse offload is the exception and not performance critical,
this seems to be acceptable.
libgomp/ChangeLog:
* libgomp.h (struct target_mem_desc): Predeclare; move
below after 'reverse_splay_tree_node' and add rev_array
member.
(struct reverse_splay_tree_key_s, reverse_splay_compare): New.
(reverse_splay_tree_node, reverse_splay_tree,
reverse_splay_tree_key): New typedef.
(struct gomp_device_descr): Add mem_map_rev member.
* oacc-host.c (host_dispatch): NULL init .mem_map_rev.
* plugin/plugin-nvptx.c (GOMP_OFFLOAD_get_num_devices): Claim
support for GOMP_REQUIRES_REVERSE_OFFLOAD.
* splay-tree.h (splay_tree_callback_stop): New typedef; like
splay_tree_callback but returning int not void.
(splay_tree_foreach_lazy): Define; like splay_tree_foreach but
taking splay_tree_callback_stop as argument.
* splay-tree.c (splay_tree_foreach_internal_lazy,
splay_tree_foreach_lazy): New; but early exit if callback returns
nonzero.
* target.c: Instatiate splay_tree_c with splay_tree_prefix 'reverse'.
(gomp_map_lookup_rev): New.
(gomp_load_image_to_device): Handle reverse-offload function
lookup table.
(gomp_unload_image_from_device): Free devicep->mem_map_rev.
(struct gomp_splay_tree_rev_lookup_data, gomp_splay_tree_rev_lookup,
gomp_map_rev_lookup, struct cpy_data, gomp_map_cdata_lookup_int,
gomp_map_cdata_lookup): New auxiliary structs and functions for
gomp_target_rev.
(gomp_target_rev): Implement reverse offloading and its mapping.
(gomp_target_init): Init current_device.mem_map_rev.root.
* testsuite/libgomp.fortran/reverse-offload-2.f90: New test.
* testsuite/libgomp.fortran/reverse-offload-3.f90: New test.
* testsuite/libgomp.fortran/reverse-offload-4.f90: New test.
* testsuite/libgomp.fortran/reverse-offload-5.f90: New test.
* testsuite/libgomp.fortran/reverse-offload-5a.f90: New test without
mapping of on-device allocated variables.
Add initial ChangeLog file in libgm2 and gcc/m2.
ChangeLog:
* libgm2: (New directory).
* libgm2/ChangeLog: (New file).
gcc/ChangeLog:
* m2: (New directory).
* m2/ChangeLog: (New file).
Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
When using %qs instead of '%s' or %<=%> instead of '=' looks nicer
by having nicer quotes and bold text, if the terminal supports it;
otherwise, plain quotes are used.
gcc/fortran/ChangeLog:
* match.cc (gfc_match_member_sep): Use %<...%> in gfc_error.
* openmp.cc (gfc_match_oacc_routine, gfc_match_omp_context_selector,
gfc_match_omp_context_selector_specification,
gfc_match_omp_declare_variant, resolve_omp_clauses): Likewise;
use %qs instead of '%s'.
* primary.cc (match_real_constant, gfc_match_varspec): Likewise.
* resolve.cc (gfc_resolve_formal_arglist, resolve_operator,
resolve_ordinary_assign): Likewise.
Prepare to add changelogs for the Modula2 front end by changing
the contrib git_commit.py script.
contrib/ChangeLog:
* gcc-changelog/git_commit.py (default_changelog_locations):
New entry for gcc/m2. New entry for libgm2.
Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
Function rs6000_emit_set_const/rs6000_emit_set_long_const are only invoked from
two "define_split"s where the target operand is limited to gpc_reg_operand or
int_reg_operand, then the operand must be REG_P.
And in rs6000_emit_set_const/rs6000_emit_set_long_const, to create temp rtx,
it is using code like "gen_reg_rtx({S|D}Imode)", it must also be REG_P.
So, copy_rtx is not needed for temp and dest.
This patch removes those "copy_rtx" for rs6000_emit_set_const and
rs6000_emit_set_long_const.
gcc/ChangeLog:
* config/rs6000/rs6000.cc (rs6000_emit_set_const): Remove copy_rtx.
(rs6000_emit_set_long_const): Likewise.
Similar story as PR103661, we again return a negative number
for __builtin_cpu_supports:
Documentation says:
int __builtin_cpu_supports(const char *feature)
This function returns a positive integer if the run-time CPU supports feature and returns 0 otherwise.
while we return -2147483648.
Moreover, I noticed "x86-64" is not a valid option for __builtin_cpu_is,
but for __builtin_cpu_supports.
PR target/107551
gcc/ChangeLog:
* config/i386/i386-builtins.cc (fold_builtin_cpu): Use same path
as for PR103661.
* doc/extend.texi: Fix "x86-64" use.
gcc/testsuite/ChangeLog:
* gcc.target/i386/builtin_target.c: Add more checks.
This change resolves a naming conflict introduced by the recently added
SUBTARGET_CC1_SPEC to gcc.cc. Some targets (mips and loongarch) aready used
a SUBTARGET_CC1_SPEC define. Rename the define used by gcc.cc to OS_CC1_SPEC.
gcc/ChangeLog:
* config/rtems.h (SUBTARGET_CC1_SPEC): Rename to...
(OS_CC1_SPEC): ...this.
* gcc.cc (SUBTARGET_CC1_SPEC): Rename to...
(OS_CC1_SPEC): ...this.
gcc/analyzer/ChangeLog:
PR analyzer/108003
* call-summary.cc
(call_summary_replay::convert_region_from_summary_1): Convert
heap_regs_in_use from auto_sbitmap to auto_bitmap.
* region-model-manager.cc
(region_model_manager::get_or_create_region_for_heap_alloc):
Convert from sbitmap to bitmap.
* region-model-manager.h: Likewise.
* region-model.cc
(region_model::get_or_create_region_for_heap_alloc): Convert from
auto_sbitmap to auto_bitmap.
(region_model::get_referenced_base_regions): Likewise.
* region-model.h: Include "bitmap.h" rather than "sbitmap.h".
(region_model::get_referenced_base_regions): Convert from
auto_sbitmap to auto_bitmap.
gcc/testsuite/ChangeLog:
PR analyzer/108003
* g++.dg/analyzer/pr108003.C: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
gcc/analyzer/ChangeLog:
* region-model-impl-calls.cc (class kf_memcpy): Rename to...
(class kf_memcpy_memmove): ...this.
(kf_memcpy::impl_call_pre): Rename to...
(kf_memcpy_memmove::impl_call_pre): ...this, and check the src for
poison.
(register_known_functions): Update for above renaming, and
register BUILT_IN_MEMMOVE and BUILT_IN_MEMMOVE_CHK.
gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/memcpy-1.c (test_8a, test_8b): New tests.
* gcc.dg/analyzer/memmove-1.c: New test, based on memcpy-1.c
* gcc.dg/analyzer/out-of-bounds-1.c (test7): Update expected
result for uninit srcBuf.
* gcc.dg/analyzer/out-of-bounds-5.c (test8, test9): Add
dg-warnings for memcpy from uninit src vla.
* gcc.dg/analyzer/pr104308.c (test_memmove_within_uninit):
Expect creation point note to be missing on riscv*-*-*.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
These are not valid in C++11 and cause a warning when preprocessing,
even though they're inside a skipped group.
chrono:2436: warning: missing terminating ' character
libstdc++-v3/ChangeLog:
PR libstdc++/108015
* include/std/chrono (hh_mm_ss): Remove digit separators.
We define these with the 'struct' keyword, but the standard uses
'class'. This results in warnings if users try to refer to them using
elaborated type specifiers.
libstdc++-v3/ChangeLog:
* include/bits/chrono.h (duration, time_point): Change 'struct'
to 'class'.
I got a complaint that while Clang docs suggest options that improve
the quality of the backtraces ASAN prints (cf.
<https://clang.llvm.org/docs/AddressSanitizer.html#usage>), our docs
don't say anything to that effect. This patch amends that with a new
paragraph. (It deliberately doesn't mention -fno-omit-frame-pointer.)
gcc/ChangeLog:
* doc/invoke.texi (-fsanitize=address): Suggest options to improve
stack traces.
The existing comparison was incorrect for non-PRECISE counts
(e.g., AFDO): we could end up with a 0 base_count, which could
lead to asserts, e.g., in good_cloning_opportunity_p.
Tested on x86_64-pc-linux-gnu.
gcc/ChangeLog:
PR ipa/108000
* ipa-cp.cc (ipcp_propagate_stage): Fix profile count comparison
gcc/testsuite
* gcc.dg/tree-prof/pr108000.c: Regression test
The eBPF architecture provides 'end[be,le]' instructions for endianness
swapping. Add a define_insn for bswap<mode>2 to use them instaed of
falling back on a libcall.
gcc/
* config/bpf/bpf.md (bswap<mode>2): New define_insn.
gcc/testsuite/
* gcc.target/bpf/bswap-1.c: New test.
The previous patch avoided building an initializer_list<string> at all when
building a vector<string>, but in situations where that isn't possible, we
could still build the initializer_list with a loop over a constant array.
This is represented using a VEC_INIT_EXPR, which required adjusting a couple
of places that expected the initializer array to have the same type as the
target array and fixing build_vec_init not to undo our efforts.
PR c++/105838
gcc/cp/ChangeLog:
* call.cc (convert_like_internal) [ck_list]: Use
maybe_init_list_as_array.
* constexpr.cc (cxx_eval_vec_init_1): Init might have
a different type.
* tree.cc (build_vec_init_elt): Likewise.
* init.cc (build_vec_init): Handle from_array from a
TARGET_EXPR. Retain TARGET_EXPR of a different type.
gcc/testsuite/ChangeLog:
* g++.dg/tree-ssa/initlist-opt2.C: New test.
When constructing a vector<string> from { "strings" }, first is built an
initializer_list<string>, which is then copied into the strings in the
vector. But this is inefficient: better would be treat the { "strings" }
as a range and construct the strings in the vector directly from the
string-literals. We can do this transformation for standard library
classes because we know the design patterns they follow.
PR c++/105838
gcc/cp/ChangeLog:
* call.cc (list_ctor_element_type): New.
(braced_init_element_type): New.
(has_non_trivial_temporaries): New.
(maybe_init_list_as_array): New.
(maybe_init_list_as_range): New.
(build_user_type_conversion_1): Use maybe_init_list_as_range.
* parser.cc (cp_parser_braced_list): Call
recompute_constructor_flags.
* cp-tree.h (find_temps_r): Declare.
gcc/testsuite/ChangeLog:
* g++.dg/tree-ssa/initlist-opt1.C: New test.
In this PR, initializing the array of std::string to pass to the vector
initializer_list constructor gets very confusing to the optimizers as the
number of elements increases, primarily because of all the std::allocator
temporaries passed to all the string constructors. Instead of creating one
for each string, let's share an allocator between all the strings; we can do
this safely because we know that std::allocator is stateless and that string
doesn't care about the object identity of its allocator parameter.
PR c++/105838
gcc/cp/ChangeLog:
* cp-tree.h (is_std_allocator): Declare.
* constexpr.cc (is_std_allocator): Split out from...
(is_std_allocator_allocate): ...here.
* init.cc (find_temps_r): New.
(find_allocator_temp): New.
(build_vec_init): Use it.
gcc/testsuite/ChangeLog:
* g++.dg/tree-ssa/allocator-opt1.C: New test.
Currently patchable area is at the wrong place on AArch64. It is placed
immediately after function label, before .cfi_startproc. This patch
adds UNSPECV_PATCHABLE_AREA for pseudo patchable area instruction and
modifies aarch64_print_patchable_function_entry to avoid placing
patchable area before .cfi_startproc.
gcc/
PR target/98776
* config/aarch64/aarch64-protos.h (aarch64_output_patchable_area):
Declared.
* config/aarch64/aarch64.cc (aarch64_print_patchable_function_entry):
Emit an UNSPECV_PATCHABLE_AREA pseudo instruction.
(aarch64_output_patchable_area): New.
* config/aarch64/aarch64.md (UNSPECV_PATCHABLE_AREA): New.
(patchable_area): Define.
gcc/testsuite/
PR target/98776
* gcc.target/aarch64/pr98776.c: New.
* gcc.target/aarch64/pr92424-2.c: Adjust pattern.
* gcc.target/aarch64/pr92424-3.c: Adjust pattern.
In commit e5cfb9cac1d7aba9a8ea73bfe7922cfaff9d61f3 I introduced tests
for strdup and strndup with leaks. Fix those leaks.
gcc/testsuite/ChangeLog:
* gcc.dg/builtin-dynamic-object-size-0.c (test_strdup,
test_strndup, test_strdup_min, test_strndup_min): Free RES
before returning from function.
Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>
The following testcase FAILs on aarch64-linux. We have some atomic
instruction followed by 2 DEBUG_INSNs (if -g only of course) followed
by NOTE_INSN_EPILOGUE_BEG followed by some USE insn.
Now, split3 pass replaces the atomic instruction with a code sequence
which ends with a conditional jump and the split3 pass calls
find_many_sub_basic_blocks.
For -g0, find_bb_boundaries sees the flow_transfer_insn (the new conditional
jump), then NOTE_INSN_EPILOGUE_BEG which can live in between basic blocks
and then the USE insn, so splits block after the NOTE_INSN_EPILOGUE_BEG
and puts the NOTE in between the blocks.
For -g, if sees a DEBUG_INSN after the flow_transfer_insn, so sets
debug_insn to it, then walks over another DEBUG_INSN, NOTE_INSN_EPILOGUE_BEG
until it finally sees the USE insn, and triggers the:
rtx_insn *prev = PREV_INSN (insn);
/* If the first non-debug inside_basic_block_p insn after a control
flow transfer is not a label, split the block before the debug
insn instead of before the non-debug insn, so that the debug
insns are not lost. */
if (debug_insn && code != CODE_LABEL && code != BARRIER)
prev = PREV_INSN (debug_insn);
code I've added for PR81325. If there are only DEBUG_INSNs, that is
the right thing to do, but if in between debug_insn and insn there are
notes which can stay in between basic blocks or simnilarly JUMP_TABLE_DATA
or their associated CODE_LABELs, it causes -fcompare-debug differences.
The following patch fixes it by clearing debug_insn if JUMP_TABLE_DATA
or associated CODE_LABEL is seen (I'm afraid there is no good answer
what to do with DEBUG_INSNs before those; the code then removes them:
/* Clean up the bb field for the insns between the blocks. */
for (x = NEXT_INSN (flow_transfer_insn);
x != BB_HEAD (fallthru->dest);
x = next)
{
next = NEXT_INSN (x);
/* Debug insns should not be in between basic blocks,
drop them on the floor. */
if (DEBUG_INSN_P (x))
delete_insn (x);
else if (!BARRIER_P (x))
set_block_for_insn (x, NULL);
}
but if there are NOTEs, the patch just reorders the NOTEs and DEBUG_INSNs,
such that the NOTEs come first (so that they stay in between basic blocks
like with -g0) and DEBUG_INSNs after those (so that bb is split before
them, so they will be in the basic block after NOTE_INSN_BASIC_BLOCK).
2022-12-08 Jakub Jelinek <jakub@redhat.com>
PR debug/106719
* cfgbuild.cc (find_bb_boundaries): If there are NOTEs in between
debug_insn (seen after flow_transfer_insn) and insn, move NOTEs
before all the DEBUG_INSNs and split after NOTEs. If there are
other insns like jump table data, clear debug_insn.
* gcc.dg/pr106719.c: New test.
PR tree-optimization/107985
gcc/
* gimple-range-op.cc
(gimple_range_op_handler::gimple_range_op_handler): Check if type
of the operands is supported.
* gimple-range.cc (gimple_ranger::prefill_stmt_dependencies): Do
not assert if here is no range-op handler.
gcc/testsuite/
* g++.dg/pr107985.C: New.