Those qualifications are needed in _GLIBCXX_INLINE_VERSION mode because the
symbols in <cctype> are not put in the versioned namespace.
libstdc++-v3/ChangeLog
* include/std/format: Add std qualification on isxdigit calls.
We already cache the overall normal form of a declaration's constraints
(under the assumption that it can't change over the translation unit).
But if we have something like
template<class T> concept complicated = /* ... */;
template<class T> void f() requires complicated<T> && /* ... */;
template<class T> void g() requires complicated<T> && /* ... */;
then despite this high-level caching we'd still redundantly have to
expand the concept-id complicated<T> twice, once during normalization of
f's constraints and again during normalization of g's. Ideally, we'd
reuse the previously computed normal form of complicated<T> the second
time around.
To that end this patch introduces an intermediate layer of caching
during constraint normalization -- caching of the normal form of a
concept-id -- that sits between our high-level caching of the overall
normal form of a declaration's constraints and our low-level caching of
each individual atomic constraint.
It turns out this caching generalizes normalize_concept_check's caching
of the normal form of a concept definition (which is equivalent to the
normal form of the concept-id C<gtargs> where gtargs is C's generic
arguments) so this patch unifies the caching accordingly.
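The intermediate layer can be pictured as memoization keyed on a concept and its arguments. The sketch below is only a toy model under that assumption: toy_norm_cache, normalize_concept_id and the string-based "normal form" are hypothetical names, not the norm_entry/norm_hasher/norm_cache code in constraint.cc.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a normal form; the real one is a tree.
using normal_form = std::string;

// Key: (concept name, template arguments), both plain strings here.
using norm_key = std::pair<std::string, std::string>;

struct toy_norm_cache {
  std::map<norm_key, normal_form> entries;
  int expansions = 0;  // how often a concept-id really had to be expanded

  // Return the normal form of NAME<ARGS>, expanding it at most once.
  const normal_form &normalize_concept_id(const std::string &name,
                                          const std::string &args) {
    norm_key key{name, args};
    auto it = entries.find(key);
    if (it != entries.end())
      return it->second;  // cache hit: reuse the earlier result
    ++expansions;         // cache miss: pretend to do the costly expansion
    normal_form nf = "atomic(" + name + "<" + args + ">)";
    return entries.emplace(std::move(key), std::move(nf)).first->second;
  }
};
```

In this model, normalizing the constraints of both f and g looks up complicated<T> twice but expands it only once.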
gcc/cp/ChangeLog:
* constraint.cc (struct norm_entry): Define.
(struct norm_hasher): Define.
(norm_cache): Define.
(normalize_concept_check): Add function comment. Cache the
normal form of the substituted concept-id. Canonicalize
generic arguments as NULL_TREE. Don't coerce arguments unless
they were substituted.
(normalize_concept_definition): Simplify. Use norm_cache
instead of normalized_map.
The only practical difference between coerce_innermost_template_parms
and the main function coerce_template_parms is that the former accepts
a potentially multi-level parameter list and returns an argument vector
of the same depth, whereas the latter accepts only a single level of
parameters and returns only a single level of arguments. Both
functions accept a multi-level argument vector.
In light of this, it seems more natural to just overload the behavior of
the main function according to whether the given parameter list is
multi-level or not. And it turns out we can assume the given parms and
args have the same depth in the multi-level case, which simplifies the
overloading logic.
Besides the simplification benefit, another benefit of this unification
is that it avoids an extra copy of a multi-level args since now we can
return new_args directly from c_t_p. (And because of this, we need to
turn new_inner_args into a reference so that overwriting it also updates
new_args.)
gcc/cp/ChangeLog:
* pt.cc (coerce_template_parms): Salvage part of the function
comment from c_innermost_t_p. Handle parms being a full
template parameter list.
(coerce_innermost_template_parms): Remove.
(lookup_template_class): Use c_t_p instead of c_innermost_t_p.
(finish_template_variable): Likewise.
(tsubst_decl): Likewise.
(instantiate_alias_template): Likewise.
As the following testcase shows, the swap_rtx_condition function
in reg-stack can result in different code generation between -g and -g0.
The function is doing the changes as it goes, so does analysis and
changes together, which makes it harder to deal with DEBUG_INSNs,
where normally analysis phase ignores them and the later phase
doesn't.
swap_rtx_condition walks instructions two different ways, one is
using next_flags_user function which stops on non-call instructions
that mention the flags register, and the other is a loop on fnstsw
where it stops on instructions mentioning it and tries to find
sahf instruction that uses it (in both cases calls stop it and so
does end of basic block).
Now both of these currently stop on DEBUG_INSNs that mention
the flags register resp. the fnstsw result register.
On success the function recurses on next flags user instruction
if still live and if the recursion failed, reverts the changes
it did too and fails.
If it were just for the next_flags_user case, the fix could be
just not doing
INSN_CODE (insn) = -1;
if (recog_memoized (insn) == -1)
fail = 1;
on DEBUG_INSNs (assuming all changes to those are fine),
swap_rtx_condition_1 just changes one comparison to a different
one. But due to the possibility of fnstsw result being used
in theory before sahf in some DEBUG_INSNs, this patch takes
a different approach. swap_rtx_condition now has a new argument
and two modes. The first mode is when debug_seen is >= 0, in this
case both next_flags_user and the loop for fnstsw -> sahf will
ignore but note DEBUG_INSNs (that mention flags register or fnstsw
result). If no such DEBUG_INSN is found during the whole call
including recursive invocations (so e.g. for -g0 but probably most
often for -g as well), it behaves as before, if it returns true
all the changes are done and nothing further needs to be done later.
If any DEBUG_INSNs are seen along the way, even when returning success
all the changes are reverted, so it just reports that the function
would be successful if DEBUG_INSNs were ignored.
In this case, compare_for_stack_reg needs to call it again in
debug_seen = -1 mode, which tells the function to update everything
including DEBUG_INSNs. For the fnstsw -> sahf case which I hope
will be very rare I just reset the DEBUG_INSNs, I don't really
know how to express it easily otherwise. For the rest
swap_rtx_condition_1 is done even on the DEBUG_INSNs.
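The two modes amount to a dry run followed by a committing run. The following is a rough toy model of that control flow only; toy_insn, toy_swap_condition and toy_compare are hypothetical, and swapping a comparison is modelled as negating an integer rather than as any real RTL change.

```cpp
#include <cassert>
#include <vector>

// Toy instruction: COND stands in for the comparison that gets swapped.
struct toy_insn {
  bool is_debug;  // plays the role of a DEBUG_INSN
  int cond;       // swapping is modelled as negation; 0 cannot be swapped
};

// debug_seen >= 0: skip debug insns but note them by setting debug_seen
//                  to 1; if any were noted, revert every change on success.
// debug_seen < 0:  update all insns, debug ones included.
bool toy_swap_condition(std::vector<toy_insn> &insns, int &debug_seen) {
  const std::vector<toy_insn> saved = insns;
  for (toy_insn &insn : insns) {
    if (insn.is_debug && debug_seen >= 0) {
      debug_seen = 1;
      continue;
    }
    if (!insn.is_debug && insn.cond == 0) {
      insns = saved;  // failure: undo everything
      return false;
    }
    insn.cond = -insn.cond;
  }
  if (debug_seen == 1)
    insns = saved;  // would succeed, but this was only a dry run
  return true;
}

// Caller in the style described for compare_for_stack_reg.
bool toy_compare(std::vector<toy_insn> &insns) {
  int debug_seen = 0;
  if (!toy_swap_condition(insns, debug_seen))
    return false;
  if (debug_seen == 1) {
    debug_seen = -1;  // second call: update debug insns as well
    return toy_swap_condition(insns, debug_seen);
  }
  return true;
}
```

The point of the design is that the non-debug instructions end up identical whether or not debug instructions are present.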
2022-11-20 Jakub Jelinek <jakub@redhat.com>
PR target/107183
* reg-stack.cc (next_flags_user): Add DEBUG_SEEN argument.
If >= 0 and a DEBUG_INSN would be otherwise returned, set
DEBUG_SEEN to 1 and ignore it.
(swap_rtx_condition): Add DEBUG_SEEN argument. In >= 0
mode only set DEBUG_SEEN to 1 if problematic DEBUG_INSNs
were seen and revert all changes on success in that case.
Don't try to recog_memoized DEBUG_INSNs.
(compare_for_stack_reg): Adjust swap_rtx_condition caller.
If it returns true and debug_seen is 1, call swap_rtx_condition
again with debug_seen -1.
* gcc.dg/ubsan/pr107183.c: New test.
The tester started tripping this on s390-linux-gnu:
Tests that now fail, but worked before (19 tests):
gcc.dg/pr96542.c scan-tree-dump-times evrp "254" 2
The problem is we search for "254" in the dump file. The dump file contains
UIDs for function declarations. So changes in the number of predefined DECL
nodes can make the test pass or fail depending on whether or not a decl with
a UID containing "254" shows up. Like this:
;; Function foo (foo, funcdef_no=0, decl_uid=2542, cgraph_uid=1, symbol_order=0)
ISTM the test wants to look for a "return 254" rather than just "254".
I added a change for that to the tester. Naturally that fixed the test on
s390 and the dozen or so targets I tested didn't show any regressions.
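The false match is easy to reproduce outside the testsuite. The sketch below uses std::regex rather than the DejaGnu scan-tree-dump-times machinery, and the two "return 254;" lines in the sample dump are an assumed illustration, not actual evrp output.

```cpp
#include <cassert>
#include <iterator>
#include <regex>
#include <string>

// Count non-overlapping matches of PATTERN in TEXT.
int count_matches(const std::string &text, const std::string &pattern) {
  const std::regex re(pattern);
  auto first = std::sregex_iterator(text.begin(), text.end(), re);
  auto last = std::sregex_iterator();
  return static_cast<int>(std::distance(first, last));
}
```

With a dump whose header contains decl_uid=2542, the pattern "254" also counts the UID, while "return 254" counts only the statements of interest.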
gcc/testsuite
* gcc.dg/pr96542.c: Avoid falsely matching DECL_UIDs with
the number 254 in them.
This makes all the [iterator.range] functions always-inline, except the
ones that construct a std::reverse_iterator, as they do a little more
work. They could probably be made always_inline too though, and maybe
the std::reverse_iterator constructor too.
This means that even for -O0 these functions have no runtime overhead
compared with calling a member of the container, or performing pointer
arithmetic for arrays.
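As an illustration of what such a trivial function looks like, here is a sketch in a hypothetical toy namespace. It mirrors the shape of the range access functions but is not the range_access.h code, and it assumes a compiler that understands the gnu::always_inline attribute.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace toy {
// Container overload: just forwards to the member function.
template <typename C>
[[gnu::always_inline]] inline auto begin(C &c) -> decltype(c.begin()) {
  return c.begin();
}

// Array overload: plain pointer arithmetic.
template <typename T, std::size_t N>
[[gnu::always_inline]] inline T *end(T (&arr)[N]) noexcept {
  return arr + N;
}
}  // namespace toy
```

With the attribute, a call such as toy::begin(v) is inlined even without optimization, so it costs the same as calling v.begin() directly.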
libstdc++-v3/ChangeLog:
* include/bits/range_access.h: Add always_inline attribute to
trivial functions.
Since we use C++11 by default now, we can
use constexpr for some const decls in tree-core.h.
This patch does that and it allows for better optimizations
of GCC code with checking enabled and without LTO.
For example, compiling generic-match.cc is sped up due
to the smaller number of basic blocks and less debugging info
produced. I did not check the speed of compiling the same source
but rather the speed of compiling the old vs new sources here
(but with the same compiler base).
The small slowdown in the parsing of the arrays in each TU
is mitigated by a speedup in how much code/debugging info
is produced in the end.
Note I looked at generic-match.cc since it is one of the
source files whose compilation causes parallel builds to stall and
I wanted to speed it up.
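The technique is to build the table in the header from the .def file, so that lookups with a constant index fold at compile time. The following is a minimal sketch with hypothetical names: TOY_ALL_TREE_CODES stands in for including all-tree.def, and the three codes and their lengths are made up.

```cpp
#include <cassert>

// Hypothetical stand-in for all-tree.def: one DEF entry per tree code,
// giving the code's name and its operand count.
#define TOY_ALL_TREE_CODES(DEF) \
  DEF(TOY_ERROR_MARK, 0)        \
  DEF(TOY_NEGATE_EXPR, 1)       \
  DEF(TOY_PLUS_EXPR, 2)

#define TOY_ENUM_ENTRY(SYM, LEN) SYM,
enum toy_tree_code { TOY_ALL_TREE_CODES(TOY_ENUM_ENTRY) TOY_NUM_CODES };
#undef TOY_ENUM_ENTRY

// Defined in the "header", so every TU sees the values at compile time
// instead of loading them from an array defined in another object file.
#define TOY_LENGTH_ENTRY(SYM, LEN) LEN,
constexpr unsigned char toy_tree_code_length[] = {
  TOY_ALL_TREE_CODES(TOY_LENGTH_ENTRY)
};
#undef TOY_LENGTH_ENTRY

static_assert(toy_tree_code_length[TOY_PLUS_EXPR] == 2,
              "lookup folds to a constant");
```

Checking code that compares such a length against a constant can then be simplified away by the compiler without LTO.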
OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
gcc/ChangeLog:
PR middle-end/14840
* tree-core.h (tree_code_type): Constexprify
by including all-tree.def.
(tree_code_length): Likewise.
* tree.cc (tree_code_type): Remove.
(tree_code_length): Remove.
This fixes a Doxygen warning about a mismatched parameter name. The
standard uses 'r' here, like the Doxygen comment, so use '__r' instead
of '__e'.
libstdc++-v3/ChangeLog:
* include/bits/ptr_traits.h (pointer_traits::pointer_to): Rename
parameter.
A recent nvptx-tools change: commit 886a95faf66bf66a82fc0fe7d2a9fd9e9fec2820
"ld: Don't search for input files in '-L'directories" (of
<https://github.com/MentorEmbedded/nvptx-tools/pull/38>
"Match standard 'ld' "search" behavior") in GCC/nvptx target testing
generally causes linking to fail with:
error opening crt0.o
collect2: error: ld returned 1 exit status
compiler exited with status 1
Indeed per GCC '-v' output, there is an undecorated 'crt0.o' on the linker
('collect2') command line:
[...]/build-gcc/./gcc/collect2 -o [...] crt0.o [...]
This is due to:
gcc/config/nvptx/nvptx.h:#define STARTFILE_SPEC "%{mmainkernel:crt0.o}"
..., and the fix, as used by numerous other GCC targets, is to instead use
'crt0.o%s'; for '%s' means, per 'gcc/gcc.cc', "The Specs Language":
%s current argument is the name of a library or startup file of some sort.
Search for that file in a standard list of directories
and substitute the full name found.
With that, we get the expected path to 'crt0.o'.
gcc/
* config/nvptx/nvptx.h (STARTFILE_SPEC): Fix 'crt0.o' for
'-mmainkernel'.
r7-912 copied (parts of) the valgrind annotation checks from gcc
to libcpp. The above copies the missing pieces to libcpp to diagnose
when libcpp is configured with --enable-valgrind-annotations but
valgrind is not installed.
libcpp/ChangeLog:
PR preprocessor/107691
* configure.ac: Add valgrind header checks.
* configure: Regenerate.
This allows the JIT to be built with a thread model other than posix,
where pthread isn't available.
By renaming the acquire_mutex () and release_mutex () member functions
to lock() and unlock() we make the playback::context type meet the C++
BasicLockable requirements. This allows it to be used with a scoped lock
(i.e. RAII) type such as std::lock_guard, which automatically releases the
mutex when leaving the scope.
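A minimal sketch of the idea, assuming a hypothetical toy_context in place of playback::context and a file-scope toy_jit_mutex in place of jit_mutex; the held flag exists only so the effect can be observed.

```cpp
#include <cassert>
#include <mutex>

static std::mutex toy_jit_mutex;

class toy_context {
public:
  // lock() and unlock() are all that std::lock_guard needs.
  void lock() {
    toy_jit_mutex.lock();
    held = true;
  }
  void unlock() {
    held = false;
    toy_jit_mutex.unlock();
  }

  int compile() {
    std::lock_guard<toy_context> guard(*this);  // calls lock()
    return held ? 1 : 0;
  }  // guard's destructor calls unlock(), on every exit path

  bool held = false;
};
```

Because the release happens in a destructor, early returns inside compile() no longer need a matching explicit release call.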
Co-authored-by: LIU Hao <lh_mouse@126.com>
gcc/jit/ChangeLog:
* jit-playback.cc (playback::context::scoped_lock): Define RAII
lock type.
(playback::context::compile): Use scoped_lock to acquire mutex
for the active playback context.
(jit_mutex): Change to std::mutex.
(playback::context::acquire_mutex): Rename to ...
(playback::context::lock): ... this.
(playback::context::release_mutex): Rename to ...
(playback::context::unlock): ... this.
* jit-playback.h (playback::context): Rename members and declare
scoped_lock.
* jit-recording.cc (INCLUDE_PTHREAD_H): Remove unused define.
* libgccjit.cc (version_mutex): Change to std::mutex.
(struct jit_version_info): Use std::lock_guard to acquire and
release mutex.
gcc/ChangeLog:
* system.h [INCLUDE_MUTEX]: Include header for std::mutex.
libgomp/ChangeLog:
* config/gcn/libgomp-gcn.h: New file; contains
struct output, declared previously in plugin-gcn.c.
* config/gcn/target.c: Include it.
(GOMP_ADDITIONAL_ICVS): Declare as extern var.
(GOMP_target_ext): Handle reverse offload.
* plugin/plugin-gcn.c: Include libgomp-gcn.h.
(struct kernargs): Replace struct def by the one
from libgomp-gcn.h for output_data.
(process_reverse_offload): New.
(console_output): Call it.
On Fri, Oct 21, 2022 at 10:23:14AM +0200, Uros Bizjak wrote:
> OK, but now we have two more copies of a function that effectively
> extends BF to SF. Can you please split this utility function out and
> use it here and in cbranchbf4/cstorebf4? I'm talking about this part:
>
> + op = gen_lowpart (HImode, op1);
> + if (CONST_INT_P (op))
> + op = simplify_const_unary_operation (FLOAT_EXTEND, SFmode,
> + op1, BFmode);
> + else
> + {
> + rtx t1 = gen_reg_rtx (SImode);
> + emit_insn (gen_zero_extendhisi2 (t1, op));
> + emit_insn (gen_ashlsi3 (t1, t1, GEN_INT (16)));
> + op = gen_lowpart (SFmode, t1);
> + }
>
> Taking this a bit further, it looks like a generic function to extend
> BF to SF, when extendbfsf2 named function is not defined.
>
> The above could be a follow-up patch, the proposed patch is OK.
Sorry for the delay, only got to this now.
And I'm fixing the sNaN handling in it too. If the argument is a BFmode sNaN
constant, we want in this case just a SFmode sNaN constant, but
simplify_const_unary_operation (FLOAT_EXTEND, ...)
in that case returns NULL (as normally conversions of a sNaN to some
other float type should raise an exception). In this case we want
to bypass that, as we know the sNaN will be used immediately in the SFmode
comparison a few instructions later. The patch fixes it by just
simplifying the lowpart to HImode and its zero extension to SImode, then
force into a pseudo and do the left shift and subreg to SFmode on the
pseudo. CSE or combine can handle it later.
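The integer sequence (zero-extend to 32 bits, shift left by 16, reinterpret as float) can be checked on the host. This sketch works on raw bit patterns and uses hypothetical helper names; it models the arithmetic only, not the RTL expansion in ix86_expand_fast_convert_bf_to_sf.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == 4, "sketch assumes a 32-bit float");

// Zero-extend the 16 BFmode bits to 32 and shift left by 16; the result
// is the bit pattern of the corresponding SFmode value.
constexpr std::uint32_t toy_bf_bits_to_sf_bits(std::uint16_t bf_bits) {
  return static_cast<std::uint32_t>(bf_bits) << 16;
}

// Reinterpret the widened bits as a float (the "subreg to SFmode" step).
float toy_bf_bits_to_float(std::uint16_t bf_bits) {
  const std::uint32_t sf_bits = toy_bf_bits_to_sf_bits(bf_bits);
  float result;
  std::memcpy(&result, &sf_bits, sizeof result);
  return result;
}
```

Since no floating-point conversion is performed, a signaling NaN pattern such as 0x7FA0 passes through with its payload intact and without raising an exception.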
2022-11-19 Jakub Jelinek <jakub@redhat.com>
PR target/107628
* config/i386/i386-protos.h (ix86_expand_fast_convert_bf_to_sf):
Declare.
* config/i386/i386-expand.cc (ix86_expand_fast_convert_bf_to_sf): New
function.
* config/i386/i386.md (cbranchbf4, cstorebf4): Use it.
* gcc.target/i386/pr107628.c: New test.
The following patch implements this paper.
Per further discussions it is implemented for C++23 only, so isn't
treated as a DR, e.g. because the part of the standard the paper is
changing didn't even exist in C++20.
And we gave up on trying to implement it as a pedwarn rather than
error for C++20 and older, because of implicit constexpr lambdas or
-fimplicit-constexpr reasons.
For C++20 and older, the only change is that passing through
definitions of static or thread_local vars usable in constant expressions
is now accepted in statement expressions if they aren't inside of constexpr
or consteval functions.
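A small sketch of what the paper permits, guarded by the feature-test macro value this patch introduces. toy_hex_digit is a hypothetical example function; the fallback branch assumes at least C++14.

```cpp
#include <cassert>

constexpr char toy_hex_digit(unsigned value) {
#if defined(__cpp_constexpr) && __cpp_constexpr >= 202211L
  // C++23 with P2647R1: a static constexpr local is allowed here.
  static constexpr char digits[] = "0123456789abcdef";
#else
  // Earlier modes: fall back to an automatic constexpr local.
  constexpr char digits[] = "0123456789abcdef";
#endif
  return digits[value & 0xf];
}

static_assert(toy_hex_digit(10) == 'a', "usable in a constant expression");
```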
2022-11-19 Jakub Jelinek <jakub@redhat.com>
gcc/c-family/
* c-cppbuiltin.cc (c_cpp_builtins): Bump __cpp_constexpr
value from 202207L to 202211L.
gcc/cp/
* constexpr.cc (cxx_eval_constant_expression): Implement C++23
P2647R1 - Permitting static constexpr variables in constexpr functions.
Allow DECL_EXPRs of decl_constant_var_p static or thread_local vars.
(potential_constant_expression_1): Similarly, except use
decl_maybe_constant_var_p instead of decl_constant_var_p if
processing_template_decl.
gcc/testsuite/
* g++.dg/cpp23/constexpr-nonlit17.C: New test.
* g++.dg/cpp23/constexpr-nonlit18.C: New test.
* g++.dg/cpp23/feat-cxx2b.C: Adjust expected __cpp_constexpr
value.
* g++.dg/ext/stmtexpr19.C: Don't expect an error.
* g++.dg/ext/stmtexpr25.C: New test.
This patch adds the library support for the experimental C++ Contracts
implementation. This now consists only of a default definition of the
violation handler, which users can override through defining their own
version. To avoid ABI stability problems with libstdc++.so this is added to
a separate -lstdc++exp static library, which the driver knows to add when it
sees -fcontracts.
Co-authored-by: Andrew Marmaduke <amarmaduke@lock3software.com>
Co-authored-by: Jason Merrill <jason@redhat.com>
libstdc++-v3/ChangeLog:
* acinclude.m4 (glibcxx_SUBDIRS): Add src/experimental.
* include/Makefile.am (experimental_headers): Add contract.
* include/Makefile.in: Regenerate.
* src/Makefile.am (SUBDIRS): Add experimental.
* src/Makefile.in: Regenerate.
* configure: Regenerate.
* src/experimental/contract.cc: New file.
* src/experimental/Makefile.am: New file.
* src/experimental/Makefile.in: New file.
* include/experimental/contract: New file.
PR analyzer/107582 reports a false +ve from
-Wanalyzer-use-of-uninitialized-value where
the analyzer's feasibility checker erroneously decides
that point (B) in the code below is reachable, with "x" being
uninitialized there:
pthread_cleanup_push(func, NULL);
while (ret != ETIMEDOUT)
ret = rand() % 1000;
/* (A): after the while loop */
if (ret != ETIMEDOUT)
x = &z;
pthread_cleanup_pop(1);
if (ret == ETIMEDOUT)
return 0;
/* (B): after not bailing out */
due to these contradictory conditions somehow both holding:
* (ret == ETIMEDOUT), at (A) (skipping the initialization of x), and
* (ret != ETIMEDOUT), at (B)
The root cause is that after the while loop, state merger puts ret in
the exploded graph in an UNKNOWN state, and saves the diagnostic at (B).
Later, as we explore the feasibility of reaching the enode for (B),
dynamic_call_info_t::update_model is called to push/pop the
frames for handling the call to "func" in pthread_cleanup_pop.
The "ret" at these nodes in the feasible_graph has a conjured_svalue for
"ret", and a constraint on it being either == *or* != ETIMEDOUT.
However dynamic_call_info_t::update_model blithely clobbers the
model with a copy from the exploded_graph, in which "ret" is UNKNOWN.
This patch fixes dynamic_call_info_t::update_model so that it
simulates pushing/popping a frame on the model we're working with,
preserving knowledge of the constraint on "ret", and enabling the
analyzer to "know" that the bail-out must happen.
Doing so fixes the false positive.
gcc/analyzer/ChangeLog:
PR analyzer/107582
* engine.cc (dynamic_call_info_t::update_model): Update the model
by pushing or popping a frame, rather than by clobbering it with the
model from the exploded_node's state.
gcc/testsuite/ChangeLog:
PR analyzer/107582
* gcc.dg/analyzer/feasibility-4.c: New test.
* gcc.dg/analyzer/feasibility-pr107582-1.c: New test.
* gcc.dg/analyzer/feasibility-pr107582-2.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Fix a missing check that the argument to __analyzer_dump_capacity must
be a pointer type (which would otherwise lead to an ICE).
Do so by using the known_function_manager rather than by doing lots of
string matching. Do the same for many other functions.
Doing so moves the type-checking closer to the logic that makes use
of it, by putting them in the same class, rather than splitting them
up between two source files (and sometimes three, e.g. for "pipe").
I hope this reduces the number of missing checks.
gcc/analyzer/ChangeLog:
* analyzer.cc (is_pipe_call_p): Delete.
* analyzer.h (is_pipe_call_p): Delete.
* region-model-impl-calls.cc (call_details::get_location): New.
(class kf_analyzer_break): New, adapted from
region_model::on_stmt_pre.
(region_model::impl_call_analyzer_describe): Convert to...
(class kf_analyzer_describe): ...this.
(region_model::impl_call_analyzer_dump_capacity): Convert to...
(class kf_analyzer_dump_capacity): ...this.
(region_model::impl_call_analyzer_dump_escaped): Convert to...
(class kf_analyzer_dump_escaped): ...this.
(class kf_analyzer_dump_exploded_nodes): New.
(region_model::impl_call_analyzer_dump_named_constant): Convert
to...
(class kf_analyzer_dump_named_constant): ...this.
(class dump_path_diagnostic): Move here from region-model.cc.
(class kf_analyzer_dump_path): New, adapted from
region_model::on_stmt_pre.
(class kf_analyzer_dump_region_model): Likewise.
(region_model::impl_call_analyzer_eval): Convert to...
(class kf_analyzer_eval): ...this.
(region_model::impl_call_analyzer_get_unknown_ptr): Convert to...
(class kf_analyzer_get_unknown_ptr): ...this.
(class known_function_accept): Rename to...
(class kf_accept): ...this.
(class known_function_bind): Rename to...
(class kf_bind): ...this.
(class known_function_connect): Rename to...
(class kf_connect): ...this.
(region_model::impl_call_errno_location): Convert to...
(class kf_errno_location): ...this.
(class known_function_listen): Rename to...
(class kf_listen): ...this.
(region_model::impl_call_pipe): Convert to...
(class kf_pipe): ...this.
(region_model::impl_call_putenv): Convert to...
(class kf_putenv): ...this.
(region_model::impl_call_operator_new): Convert to...
(class kf_operator_new): ...this.
(region_model::impl_call_operator_delete): Convert to...
(class kf_operator_delete): ...this.
(class known_function_socket): Rename to...
(class kf_socket): ...this.
(register_known_functions): Rename param to KFM. Break out
existing known functions into a "POSIX" section, and add "pipe",
"pipe2", and "putenv". Add debugging functions
"__analyzer_break", "__analyzer_describe",
"__analyzer_dump_capacity", "__analyzer_dump_escaped",
"__analyzer_dump_exploded_nodes",
"__analyzer_dump_named_constant", "__analyzer_dump_path",
"__analyzer_dump_region_model", "__analyzer_eval",
"__analyzer_get_unknown_ptr". Add C++ support functions
"operator new", "operator new []", "operator delete", and
"operator delete []".
* region-model.cc (class dump_path_diagnostic): Move to
region-model-impl-calls.cc.
(region_model::on_stmt_pre): Eliminate special-casing of
"__analyzer_describe", "__analyzer_dump_capacity",
"__analyzer_dump_escaped", "__analyzer_dump_named_constant",
"__analyzer_dump_path", "__analyzer_dump_region_model",
"__analyzer_eval", "__analyzer_break",
"__analyzer_dump_exploded_nodes", "__analyzer_get_unknown_ptr",
"__errno_location", "pipe", "pipe2", "putenv", "operator new",
"operator new []", "operator delete", and "operator delete []",
handling them instead via the known_functions
mechanism.
* region-model.h (call_details::get_location): New decl.
(region_model::impl_call_analyzer_describe): Delete decl.
(region_model::impl_call_analyzer_dump_capacity): Delete decl.
(region_model::impl_call_analyzer_dump_escaped): Delete decl.
(region_model::impl_call_analyzer_dump_named_constant): Delete decl.
(region_model::impl_call_analyzer_eval): Delete decl.
(region_model::impl_call_analyzer_get_unknown_ptr): Delete decl.
(region_model::impl_call_errno_location): Delete decl.
(region_model::impl_call_pipe): Delete decl.
(region_model::impl_call_putenv): Delete decl.
(region_model::impl_call_operator_new): Delete decl.
(region_model::impl_call_operator_delete): Delete decl.
* sm-fd.cc: Update comments.
gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/analyzer-debugging-fns-1.c: New test.
* gcc.dg/analyzer/attr-const-3.c: Increase the
"analyzer-max-svalue-depth" from 0 to 4 to ensure that
"__analyzer_eval" is recognized.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Optimize the common case of a SImode min/max against a constant
that is safe both for sign- and zero-extension.
E.g., consider the case
int f(unsigned int* a)
{
const int C = 1000;
return *a * 3 > C ? C : *a * 3;
}
where the constant C will yield the same result in DImode whether
sign- or zero-extended.
This should eventually go away once the lowering to RTL smartens up
and considers the precision/signedness and the value-ranges of the
operands to MIN_EXPR and MAX_EXPR.
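The property "same result in DImode whether sign- or zero-extended" can be stated as a one-line check. toy_extension_invariant_p is a hypothetical helper for illustration, not the predicate used in bitmanip.md.

```cpp
#include <cassert>
#include <cstdint>

// True if the 32-bit constant C has the same 64-bit value whether it is
// sign-extended or zero-extended, i.e. its sign bit is clear.
constexpr bool toy_extension_invariant_p(std::int32_t c) {
  return static_cast<std::int64_t>(c)
         == static_cast<std::int64_t>(static_cast<std::uint32_t>(c));
}

static_assert(toy_extension_invariant_p(1000), "C from the example above");
```

A constant such as -1 fails the check: it sign-extends to all ones but zero-extends to 0xffffffff.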
gcc/ChangeLog:
* config/riscv/bitmanip.md (*minmax): Additional pattern for
min/max against constants that are extension-invariant.
* config/riscv/iterators.md (minmax_optab): Add an iterator
that has only min and max rtl.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/zbb-min-max-02.c: New test.
Use Zbs when generating a sequence for
"if ((a & twobits) == singlebit) ..."
that can be expressed as
bexti + bexti + andn.
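The equivalence behind this sequence is that the test holds exactly when one of the two bits is set and the other is clear. The helpers below model the instructions in plain C++; the names and the operand order are assumptions for illustration and need not match the generated code.

```cpp
#include <cassert>
#include <cstdint>

// bexti: extract a single bit as 0 or 1.
constexpr std::uint64_t toy_bexti(std::uint64_t a, unsigned bit) {
  return (a >> bit) & 1;
}

// andn: x & ~y.
constexpr std::uint64_t toy_andn(std::uint64_t x, std::uint64_t y) {
  return x & ~y;
}

// Evaluates (a & ((1 << set_bit) | (1 << clear_bit))) == (1 << set_bit).
constexpr bool toy_twobits_equals_singlebit(std::uint64_t a, unsigned set_bit,
                                            unsigned clear_bit) {
  return toy_andn(toy_bexti(a, set_bit), toy_bexti(a, clear_bit)) != 0;
}
```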
gcc/ChangeLog:
* config/riscv/bitmanip.md
(*branch<X:mode>_mask_twobits_equals_singlebit):
Handle "if ((a & T) == C)" using Zbs, when T has 2 bits set and C
has one of these two bits set.
* config/riscv/predicates.md (const_twobits_not_arith_operand):
New predicate.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/zbs-if_then_else-01.c: New test.
Sequences of the form "a | C" and "a ^ C" with C being the positive
half of a signed immediate's range with one extra bit set in addition
are mapped to ori/xori and one bseti/binvi to avoid using a temporary
(and a multi-insn sequence to load C into that temporary).
Something similar holds for "a & ~C" being representable as either
bclri + bclri or bclri + andi.
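For the "a | C" case the split can be sketched as follows. This assumes the constant's shape is "bits 0..10 arbitrary plus exactly one higher bit"; the helper names are hypothetical, and the exact conditions of the real predicates in predicates.md and riscv.h may differ.

```cpp
#include <cassert>
#include <cstdint>

// Split C into a non-negative 12-bit immediate (bits 0..10) and one extra
// bit. Returns false if C does not have that shape.
bool toy_split_extra_bit(std::uint64_t c, std::uint64_t &imm, unsigned &bit) {
  imm = c & 0x7ff;
  const std::uint64_t rest = c & ~std::uint64_t{0x7ff};
  if (rest == 0 || (rest & (rest - 1)) != 0)
    return false;  // zero or more than one extra bit
  bit = 0;
  while ((rest >> bit) != 1)
    ++bit;
  return true;
}

// Model of "ori t, a, imm" followed by "bseti t, t, bit".
std::uint64_t toy_ori_bseti(std::uint64_t a, std::uint64_t imm, unsigned bit) {
  const std::uint64_t t = a | imm;          // ori
  return t | (std::uint64_t{1} << bit);     // bseti
}
```

Both steps take an immediate operand, so no temporary register is needed to hold C.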
gcc/ChangeLog:
* config/riscv/bitmanip.md (*<or_optab>i<mode>_extrabit):
New pattern for binvi+binvi/xori and bseti+bseti/ori
(*andi<mode>_extrabit): New pattern for bclri+bclri/andi
* config/riscv/iterators.md (any_or): Match or and ior
* config/riscv/predicates.md (const_twobits_operand):
New predicate.
(uimm_extra_bit_operand): New predicate.
(uimm_extra_bit_or_twobits): New predicate.
(not_uimm_extra_bit_operand): New predicate.
(not_uimm_extra_bit_or_nottwobits): New predicate.
* config/riscv/riscv.h (UIMM_EXTRA_BIT_OPERAND):
Helper for the uimm_extra_bit_operand and
not_uimm_extra_bit_operand predicates.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/zbs-bclri.c: Rename
* gcc.target/riscv/zbs-bclri-01.c: Renamed from above.
* gcc.target/riscv/zbs-bclri-02.c: New test.
* gcc.target/riscv/zbs-binvi.c: New test.
* gcc.target/riscv/zbs-bseti.c: New test.
gcc/ChangeLog:
* config/riscv/bitmanip.md: Handle corner-cases for combine
when chaining slli(.uw)? + addw
* config/riscv/riscv-protos.h (riscv_shamt_matches_mask_p):
Define prototype.
* config/riscv/riscv.cc (riscv_shamt_matches_mask_p):
Helper for evaluating the relationship between two operands.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/zba-shNadd-04.c: New test.
When using strength-reduction, we will reduce a multiplication to a
sequence of shifts and adds. If this is performed with 32-bit types
and followed by a division, the lack of w-form sh[123]add will make
combination impossible and lead to a slli + addw being generated.
Split the sequence with the knowledge that a w-form div will perform
implicit sign-extensions.
gcc/ChangeLog:
* config/riscv/bitmanip.md: Add a define_split to optimize
slliw + addiw + divw into sh[123]add + divw.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/zba-shNadd-05.c: New test.
gcc/ChangeLog:
* config/riscv/predicates.md (shifted_const_arith_operand): New predicate.
(uimm_extra_bit_operand): New predicate.
* config/riscv/riscv.md (*branch<ANYI:mode>_shiftedarith_equals_zero):
New pattern.
(*branch<ANYI:mode>_shiftedmask_equals_zero): New pattern.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/branch-1.c: New test.
As long as the SImode operand is not a partial subreg, we can use a
bseti without postprocessing to or in a bit, as the middle end is
smart enough to stay away from the signbit.
gcc/ChangeLog:
* config/riscv/bitmanip.md (*bsetidisi): New pattern.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/zbs-bseti-02.c: New test.
Code such as:
#include __FILE__
can interact poorly with the *-prefix-map options when cross compiling. In
general you want to remap filenames for use in target context, but the
local paths should be used to find include files at compile time. Ignoring
filename remapping for directives avoids such failures.
Fix this to improve such usage and then document this against file-prefix-map
(referenced by the other *-prefix-map options) to make the behaviour clear
and defined.
libcpp/ChangeLog:
* macro.cc (_cpp_builtin_macro_text): Don't remap filenames within
directives.
gcc/ChangeLog:
* doc/invoke.texi: Document prefix-maps don't affect directives.
gcc/fortran/ChangeLog:
PR fortran/107576
* interface.cc (gfc_procedure_use): Reject NULL as actual argument
when there is no explicit procedure interface.
gcc/testsuite/ChangeLog:
PR fortran/107576
* gfortran.dg/null_actual_3.f90: New test.
The problem here is that after we create a call expression
in the C front end, we replace the decl's type with
an error mark node. We then end up calling
aggregate_value_p on the call expression
whose decl has the error mark as its type,
and we ICE.
The fix is to check the function type
after we process the call expression inside
aggregate_value_p to get it.
OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
Thanks,
Andrew Pinski
gcc/ChangeLog:
PR middle-end/107705
* function.cc (aggregate_value_p): Return 0 if
the function type was an error operand.
gcc/testsuite/ChangeLog:
* gcc.dg/redecl-22.c: New test.
The problem here is the gimplifier returns GS_ERROR but
in some cases we don't check that soon enough and try
to do other work which could crash.
So the fix in these two cases is to return GS_ERROR
early if the gimplify_* functions returned GS_ERROR.
OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
Thanks,
Andrew Pinski
gcc/ChangeLog:
PR c/106764
PR c/106765
PR c/107307
* gimplify.cc (gimplify_compound_lval): Return GS_ERROR
if gimplify_expr returned GS_ERROR.
(gimplify_call_expr): Likewise.
gcc/testsuite/ChangeLog:
PR c/106764
PR c/106765
PR c/107307
* gcc.dg/redecl-19.c: New test.
* gcc.dg/redecl-20.c: New test.
* gcc.dg/redecl-21.c: New test.
... And another follow-up once I realised that the sign-extending load, of course,
needs to have strictly an X-reg as a destination for DImode extensions and a W-reg
for SImode ones.
Tested on aarch64-none-linux.
gcc/ChangeLog:
* config/aarch64/atomics.md (*aarch64_atomic_load<ALLX:mode>_rcpc_sext):
Use <GPI:w> for destination format.
* config/aarch64/iterators.md (w_sz): Delete.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/ldapr-sext.c: Adjust expected output.
Some test cases require the optional module mapper
"g++-mapper-server" to be built. As building the server is optional,
those test cases fail if it can't be found.
gcc/testsuite/ChangeLog:
* lib/target-supports.exp (check_is_prog_name_available):
New.
* lib/target-supports-dg.exp
(dg-require-prog-name-available): New.
* g++.dg/modules/modules.exp: Verify availability of module
mapper.
Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
Upon some further inspection I realised I had misunderstood some intricacies of the extending loads of the RCPC feature.
This patch fixes up the recent GCC support accordingly. In particular:
* The sign-extending forms are a form of LDAPURS* and are actually part of FEAT_RCPC2
that is enabled with Armv8.4-a rather than the base Armv8.3-a FEAT_RCPC.
The patch introduces a TARGET_RCPC2 macro and gates this combine pattern accordingly.
* The assembly output for the zero-extending LDAPR instruction should always use %w formatting for its destination register.
The testcase is split into zero-extending and sign-extending parts since they require different architecture pragmas.
It's also straightforward to add the rest of the FEAT_RCPC2 codegen
(with immediate offset addressing modes) but that can be done as a separate patch.
Apologies for not catching this sooner, but it hasn't been in trunk long, so no harm done.
Bootstrapped and tested on aarch64-none-linux-gnu.
gcc/ChangeLog:
* config/aarch64/aarch64.h (TARGET_RCPC2): Define.
* config/aarch64/atomics.md (*aarch64_atomic_load<ALLX:mode>_rcpc_zext):
Adjust output template.
(*aarch64_atomic_load<ALLX:mode>_rcpc_sext): Guard on TARGET_RCPC2.
Adjust output template.
* config/aarch64/iterators.md (w_sz): New mode attr.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/ldapr-ext.c: Rename to...
* gcc.target/aarch64/ldapr-zext.c: ... This. Fix expected assembly.
* gcc.target/aarch64/ldapr-sext.c: New test.
The following patch implements CWG2635.
2022-11-18 Jakub Jelinek <jakub@redhat.com>
* decl.cc (grokdeclarator): Implement
CWG2635 - Constrained structured bindings. Emit a pedwarn on
constrained auto type. Add auto_diagnostic_group for error_at
and inform for non-auto type on structured bindings declaration.
* g++.dg/cpp2a/decomp5.C: New test.
* g++.dg/cpp2a/decomp6.C: New test.
* g++.dg/cpp2a/decomp7.C: New test.
* g++.dg/cpp2a/concepts-placeholder7.C: Adjust expected diagnostics.
* g++.dg/cpp2a/concepts-placeholder8.C: Likewise.
* g++.dg/cpp2a/concepts-placeholder9.C: New test.
* g++.dg/cpp2a/concepts-placeholder10.C: New test.
Only with -ffp-contract=fast can we synthesize FMA operations like
vfmaddsub231ps, so properly guard the transform in SLP pattern
detection.
PR tree-optimization/107647
* tree-vect-slp-patterns.cc (addsub_pattern::recognize): Only
allow FMA generation with -ffp-contract=fast for FP types.
(complex_mul_pattern::matches): Likewise.
* gcc.target/i386/pr107647.c: New testcase.
We used to expand atomic_exchange_n(ptr, new, mem_order) for subword types
into something like:
{
__typeof__(*ptr) t = atomic_load_n(ptr, mem_order);
atomic_compare_exchange_n(ptr, &t, new, true, mem_order, mem_order);
return t;
}
It's incorrect because another thread may store a different value into *ptr
after atomic_load_n. Then atomic_compare_exchange_n will not store into
*ptr, but atomic_exchange_n should always perform the store.
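The race can be reproduced deterministically by injecting the interfering store between the load and the compare-exchange. The sketch below uses std::atomic and hypothetical function names; the retry loop only illustrates the required "always store" semantics and is not how the new define_insn in sync.md is implemented.

```cpp
#include <atomic>
#include <cassert>
#include <functional>

using toy_byte = unsigned char;

// The broken shape: one load, then one compare-exchange. INTERFERE stands
// in for another thread storing to OBJ between the two steps.
toy_byte toy_exchange_load_then_cas(std::atomic<toy_byte> &obj,
                                    toy_byte desired,
                                    const std::function<void()> &interfere) {
  toy_byte t = obj.load();
  interfere();
  obj.compare_exchange_strong(t, desired);  // may fail and store nothing
  return t;
}

// A correct exchange in terms of compare-exchange must retry until the
// store has actually happened.
toy_byte toy_exchange_retry(std::atomic<toy_byte> &obj, toy_byte desired,
                            const std::function<void()> &interfere) {
  toy_byte t = obj.load();
  interfere();
  while (!obj.compare_exchange_weak(t, desired)) {
    // T now holds the value that was observed; try again with it.
  }
  return t;
}
```

With an interfering store of 7, the first variant leaves 7 in the object, whereas the second stores the new value and returns 7 as the old one.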
gcc/ChangeLog:
PR target/107713
* config/loongarch/sync.md
(atomic_cas_value_exchange_7_<mode>): New define_insn.
(atomic_exchange): Use atomic_cas_value_exchange_7_si instead of
atomic_cas_value_cmp_and_7_si.
gcc/testsuite/ChangeLog:
PR target/107713
* gcc.target/loongarch/pr107713-1.c: New test.
* gcc.target/loongarch/pr107713-2.c: New test.
[dcl.constinit]: "The constinit specifier shall be applied only to
a declaration of a variable with static or thread storage duration."
Thus, this ought to be OK:
constinit void (*p)() = nullptr;
but the error message I introduced when implementing constinit was
not looking at funcdecl_p, so the code above was rejected.
Fixed thus. I'm checking constinit_p first because I think that's
far more likely to be false than funcdecl_p.
PR c++/104066
gcc/cp/ChangeLog:
* decl.cc (grokdeclarator): Check funcdecl_p before complaining
about constinit.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/constinit18.C: New test.
sbitmap is a simple bitmap and the memory allocated is not cleared
on creation; you have to clear it or set it to all ones before using
it. This is unlike bitmap which is a sparse bitmap and the entries are
cleared as created.
The code added in r13-4044-gdc95e1e9702f2f missed that.
This patch fixes that mistake.
Committed as obvious after a bootstrap and test on x86_64-linux-gnu.
gcc/ChangeLog:
PR middle-end/107734
* match.pd (perm + vector op pattern): Clear the sbitmap before
use.
The threader is creating a scenario where we are trying to solve:
[NEGATIVES] = abs(x)
While solving this we have an intermediate value of UNDEFINED because
we have no positive numbers. But then we try to union the negative
pair to the final result by querying the bounds. Since neither
UNDEFINED nor NAN have bounds, they need to be specially handled.
PR tree-optimization/107732
gcc/ChangeLog:
* range-op-float.cc (foperator_abs::op1_range): Early exit when
result is undefined.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/pr107732.c: New test.
PR analyzer/107711 reports an ICE since r13-4073-gd8aba860b34203 with
the combination of -fanalyzer and -Wunused-macros.
The issue is that in c_translation_unit::consider_macro's call to
cpp_create_reader I was passing "ident_hash" for use by the new
reader, but that takes ownership of that hash_table, so that ident_hash
erroneously gets freed when c_translation_unit::consider_macro calls
cpp_destroy, leading to a use-after-free in -Wunused-macros, where:
(gdb) p pfile->hash_table->pfile == pfile
$23 = false
and it's instead pointing at the freed reader from consider_macro,
leading to a use-after-free ICE.
Fixed thusly.
gcc/c/ChangeLog:
PR analyzer/107711
* c-parser.cc (ana::c_translation_unit::consider_macro): Pass NULL
to cpp_create_reader, rather than ident_hash, so that the new
reader gets its own hash table.
gcc/testsuite/ChangeLog:
PR analyzer/107711
* gcc.dg/analyzer/named-constants-Wunused-macros.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>