We don't yet have a separate feature flag for FEAT_LRCPC2 (and adding
one will require extending the feature bitmask). Instead, make the
FEAT_LRCPC2 patterns available when either armv8.4-a or +rcpc3 is
specified. We already have a +rcpc flag, so this dependency can be
specified directly.
Also add an explicit dependance on +rcpc to the FEAT_LRCPC2 patterns, so
that they are disabled with armv8.4-a+norcpc.
The cpunative test needed updating because it used an invalid Features
list, since lrcpc3 requires both ilrcpc and lrcpc to be present.
Without this change, host_detect_local_cpu would return the architecture
string 'armv8-a+dotprod+crc+crypto+rcpc3+norcpc'.
gcc/ChangeLog:
* config/aarch64/aarch64-option-extensions.def: Add RCPC to
RCPC3 dependencies.
* config/aarch64/aarch64.h (AARCH64_ISA_RCPC8_4): Add test for
RCPC3 bit
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/cpunative/info_24: Include lrcpc and ilrcpc.
This ICE started with the fairly complicated r13-765. We crash in
gimplify_var_or_parm_decl because a stray VAR_DECL leaked there.
The problem is ultimately that potential_prvalue_result_of wasn't
correctly handling arrays and replace_placeholders_for_class_temp_r
replaced a PLACEHOLDER_EXPR in a TARGET_EXPR which is used in the
context of copy elision. If I have
M m[2] = { M{""}, M{""} };
then we don't invoke the M(const M&) copy-ctor.
One part of the fix is to use TARGET_EXPR_ELIDING_P rather than
potential_prvalue_result_of. That unfortunately doesn't handle the
case like
struct N { N(M); };
N arr[2] = { M{""}, M{""} };
because TARGET_EXPRs that initialize a function argument are not
marked TARGET_EXPR_ELIDING_P even though gimplify_arg drops such
TARGET_EXPRs on the floor. We can use a pset to avoid replacing
placeholders in them.
I made an attempt to use set_target_expr_eliding in
convert_for_arg_passing but that regressed constexpr-diag1.C, and does
not seem like a prudent change in stage 4 anyway.
PR c++/109966
gcc/cp/ChangeLog:
* typeck2.cc (potential_prvalue_result_of): Remove.
(replace_placeholders_for_class_temp_r): Check TARGET_EXPR_ELIDING_P.
Use a pset. Don't replace_placeholders in TARGET_EXPRs that initialize
a function argument.
gcc/testsuite/ChangeLog:
* g++.dg/cpp1y/nsdmi-aggr20.C: New test.
* g++.dg/cpp1y/nsdmi-aggr21.C: New test.
The bug in PR101865 is the _ARCH_PWR8 predefine macro is conditional upon
TARGET_DIRECT_MOVE, which can be false for some -mcpu=power8 compiles if the
-mno-altivec or -mno-vsx options are used. The solution here is to create
a new OPTION_MASK_POWER8 mask that is true for -mcpu=power8, regardless of
Altivec or VSX enablement.
Unfortunately, the only way to create an OPTION_MASK_* mask is to create
a new option, which we have done here, but marked it as WarnRemoved since
we do not want users using it. For stage1, we will look into how we can
create ISA mask flags for use in the compiler without the need for explicit
options.
2024-04-12 Will Schmidt <will_schmidt@linux.ibm.com>
Peter Bergner <bergner@linux.ibm.com>
gcc/
PR target/101865
* config/rs6000/rs6000-builtin.cc (rs6000_builtin_is_supported): Use
TARGET_POWER8.
* config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Use
OPTION_MASK_POWER8.
* config/rs6000/rs6000-cpus.def (POWERPC_MASKS): Add OPTION_MASK_POWER8.
(ISA_2_7_MASKS_SERVER): Likewise.
* config/rs6000/rs6000.cc (rs6000_option_override_internal): Update
comment. Use OPTION_MASK_POWER8 and TARGET_POWER8.
* config/rs6000/rs6000.h (TARGET_SYNC_HI_QI): Use TARGET_POWER8.
* config/rs6000/rs6000.md (define_attr "isa"): Add p8.
(define_attr "enabled"): Handle it.
(define_insn "prefetch"): Use TARGET_POWER8.
* config/rs6000/rs6000.opt (mpower8-internal): New.
gcc/testsuite/
PR target/101865
* gcc.target/powerpc/predefine-p7-novsx.c: New test.
* gcc.target/powerpc/predefine-p8-noaltivec-novsx.c: New test.
* gcc.target/powerpc/predefine-p8-noaltivec.c: New test.
* gcc.target/powerpc/predefine-p8-novsx.c: New test.
* gcc.target/powerpc/predefine-p8-pragma-vsx.c: New test.
* gcc.target/powerpc/predefine-p9-novsx.c: New test.
One known missing piece in the modules implementation is merging of a
streamed-in local type (class or enum) with the corresponding in-TU
version of the local type. This missing piece turns out to cause a
hard-to-reduce use-after-free GC issue due to the entity_ary not being
marked as a GC root (deliberately), and manifests as a serialization
error on stream-in as in PR99426 (see comment #6 for a reduction). It's
also reproducible on trunk when running the xtreme-header tests without
-fno-module-lazy.
This patch implements this missing piece, making us merge such local
types according to their position within the containing function's
definition, analogous to how we merge FIELD_DECLs of a class according
to their index in the TYPE_FIELDS list.
PR c++/99426
gcc/cp/ChangeLog:
* module.cc (merge_kind::MK_local_type): New enumerator.
(merge_kind_name): Update.
(trees_out::chained_decls): Move BLOCK-specific handling
of DECL_LOCAL_DECL_P decls to ...
(trees_out::core_vals) <case BLOCK>: ... here. Stream
BLOCK_VARS manually.
(trees_in::core_vals) <case BLOCK>: Stream BLOCK_VARS
manually. Handle deduplicated local types..
(trees_out::key_local_type): Define.
(trees_in::key_local_type): Define.
(trees_out::get_merge_kind) <case FUNCTION_DECL>: Return
MK_local_type for a local type.
(trees_out::key_mergeable) <case FUNCTION_DECL>: Use
key_local_type.
(trees_in::key_mergeable) <case FUNCTION_DECL>: Likewise.
(trees_in::is_matching_decl): Be flexible with type mismatches
for local entities.
(trees_in::register_duplicate): Also register the
DECL_TEMPLATE_RESULT of a TEMPLATE_DECL as a duplicate.
(depset_cmp): Return 0 for equal IDENTIFIER_HASH_VALUEs.
gcc/testsuite/ChangeLog:
* g++.dg/modules/merge-17.h: New test.
* g++.dg/modules/merge-17_a.H: New test.
* g++.dg/modules/merge-17_b.C: New test.
* g++.dg/modules/xtreme-header-7_a.H: New test.
* g++.dg/modules/xtreme-header-7_b.C: New test.
Reviewed-by: Jason Merrill <jason@redhat.com>
The second testcase in 113141 is a separate issue: we first decide that the
conversion is ill-formed, but then when recalculating the special c_cast_p
handling makes us think it's OK. We don't want that, it should continue to
fall back to the reinterpret_cast interpretation. And while we're here,
let's warn that we're not using the conversion function.
Note that the standard seems to say that in this case we should
treat (Matrix &) as const_cast<Matrix &>(static_cast<const Matrix &>(X)),
which would use the conversion operator, but that doesn't match existing
practice, so let's resolve that another day. I've raised this issue with
CWG; at the moment I lean toward never binding a temporary in a C-style cast
to reference type, which would also be a change from existing practice.
PR c++/113141
gcc/c-family/ChangeLog:
* c.opt: Add -Wcast-user-defined.
gcc/ChangeLog:
* doc/invoke.texi: Document -Wcast-user-defined.
gcc/cp/ChangeLog:
* call.cc (reference_binding): For an invalid cast, warn and don't
recalculate.
gcc/testsuite/ChangeLog:
* g++.dg/conversion/ref12.C: New test.
Co-authored-by: Patrick Palka <ppalka@redhat.com>
The original testcase in PR113141 is an instance of CWG1996; the standard
fails to consider conversion functions when initializing a reference
directly from an initializer-list of one element, but then does consider
them when initializing a temporary. I have a proposed fix for this defect,
which is implemented here.
DR 1996
PR c++/113141
gcc/cp/ChangeLog:
* call.cc (reference_binding): Check direct binding from
a single-element list.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/initlist-ref1.C: New test.
* g++.dg/cpp0x/initlist-ref2.C: New test.
* g++.dg/cpp0x/initlist-ref3.C: New test.
Co-authored-by: Patrick Palka <ppalka@redhat.com>
The middle-end warns about the ANNOTATE_EXPR added for while/for loops
if they declare a var inside of the loop condition.
This is because the assumption is that ANNOTATE_EXPR argument is used
immediately in a COND_EXPR (later GIMPLE_COND), but simplify_loop_decl_cond
wraps the ANNOTATE_EXPR inside of a TRUTH_NOT_EXPR, so it no longer
holds.
The following patch fixes that by adding the TRUTH_NOT_EXPR inside of the
ANNOTATE_EXPR argument if any.
2024-04-12 Jakub Jelinek <jakub@redhat.com>
PR c++/114691
* semantics.cc (simplify_loop_decl_cond): Use cp_build_unary_op with
TRUTH_NOT_EXPR on ANNOTATE_EXPR argument (if any) rather than
ANNOTATE_EXPR itself.
* g++.dg/ext/pr114691.C: New test.
The original PR114393 testcase is unfortunately still not accepted after
r14-9938-g081c1e93d56d35 due to return type deduction confusion when a
lambda-expr is used as a default template argument.
The below reduced testcase demonstrates the bug. Here when forming the
dependent specialization b_v<U> we substitute the default argument of F,
a lambda-expr, with _Descriptor=U. (In this case in_template_context is
true since we're in the context of the template c_v, so we don't defer.)
This substitution in turn lowers the level of the lambda's auto return
type from 2 to 1 and so later, when instantiating c_v<int, char> we wrongly
substitute this auto with the template argument at level=0,index=0, i.e.
int, instead of going through do_auto_deduction which would yield char.
One way to fix this would be to use a level-less auto to represent a
deduced return type of a lambda, but that might be too invasive of a
change at this stage, and it might be better to do this across the board
for all deduced return types.
Another way would be to pass tf_partial from coerce_template_parms during
dependent substitution into a default template argument so that the
substitution doesn't do any level-lowering, but that wouldn't do the right
thing in this case due to the tf_partial early exit in the LAMBDA_EXPR
case of tsubst_expr.
Yet another way, and the approach that this patch takes, is to just
defer all dependent substitution into a lambda-expr, building upon the
logic added in r14-9938-g081c1e93d56d35. This also helps ensure
LAMBDA_EXPR_REGEN_INFO consists only of the concrete template arguments
that were ultimately substituted into the most general lambda.
PR c++/114393
gcc/cp/ChangeLog:
* pt.cc (tsubst_lambda_expr): Also defer all dependent
substitution.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/lambda-targ2a.C: New test.
Reviewed-by: Jason Merrill <jason@redhat.com>
I had another look at this P1 PR today.
You said in the "c++: fix in-charge parm in constexpr" mail back in December
(as well as in the r14-6507 commit message):
"Since a class with vbases can't have constexpr 'tors there isn't actually
a need for an in-charge parameter in a destructor" but the ICE is because
the destructor is marked implicitly constexpr.
https://eel.is/c++draft/dcl.constexpr#3.2 says that a destructor of a class
with virtual bases is not constexpr-suitable, but we were actually
implementing this just for constructors, so clearly my fault from the
https://wg21.link/P0784R7 implementation. That paper clearly added that
sentence in there and removed similar sentence just from the constructor case.
So, the following patch makes sure the
else if (CLASSTYPE_VBASECLASSES (DECL_CONTEXT (fun)))
{
ret = false;
if (complain)
error ("%q#T has virtual base classes", DECL_CONTEXT (fun));
}
hunk is done no just for DECL_CONSTRUCTOR_P (fun), but also
DECL_DESTRUCTOR_P (fun) - in that case just for cxx_dialect >= cxx20,
as for cxx_dialect < cxx20 we already set ret = false; and diagnose
a different error, so no need to diagnose two.
2024-04-12 Jakub Jelinek <jakub@redhat.com>
PR c++/114426
* constexpr.cc (is_valid_constexpr_fn): Return false/diagnose with
complain destructors in classes with virtual bases.
* g++.dg/cpp2a/pr114426.C: New test.
* g++.dg/cpp2a/constexpr-dtor16.C: New test.
The problem is `!a?b:c` pattern will create a COND_EXPR with an 1bit signed integer
which breaks patterns like `a?~t:t`. This rejects when we have a signed operand for
both patterns.
Note for GCC 15, I am going to look at the canonicalization of `a?~t:t` where t
was a constant since I think keeping it a COND_EXPR might be more canonical and
is what VPR produces from the same IR; if anything expand should handle which one
is better.
Bootstrapped and tested on x86_64-linux-gnu with no regressions.
PR tree-optimization/114666
gcc/ChangeLog:
* match.pd (`!a?b:c`): Reject signed types for the condition.
(`a?~t:t`): Likewise.
gcc/testsuite/ChangeLog:
* gcc.c-torture/execute/bitfld-signed1-1.c: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
The svzero_mask_za intrinsic tried to use the shortest combination
of .b, .h, .s and .d tiles, allowing mixtures of sizes where necessary.
However, Iain S pointed out that LLVM instead requires the tiles to
have the same suffix. GAS supports both versions, so this patch
generates the LLVM-friendly form.
gcc/
* config/aarch64/aarch64.cc (aarch64_output_sme_zero_za): Require
all tiles to have the same suffix.
gcc/testsuite/
* gcc.target/aarch64/sme/acle-asm/zero_mask_za.c (zero_mask_za_ab)
(zero_mask_za_d7, zero_mask_za_bf): Expect a list of .d tiles instead
of a mixture.
As mentioned in PR114678 those failures will be fixed by
https://gcc.gnu.org/pipermail/gcc-patches/2024-March/648303.html
For GCC 14 just xfail them which should be reverted once the patch is
applied.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/range-sincos.c: Xfail for s390.
* gcc.dg/tree-ssa/vrp-float-abs-1.c: Dito.
The below testcases use a lambda-expr as a template argument and they
all trip over the below added tsubst_lambda_expr sanity check ultimately
because current_template_parms is empty which causes push_template_decl
to return error_mark_node from the call to begin_lambda_type. Were it
not for the sanity check this silent error_mark_node result leads to
nonsensical errors down the line, or silent breakage.
In the first testcase, we hit this assert during instantiation of the
dependent alias template-id c1_t<_Data> from instantiate_template, which
clears current_template_parms via push_to_top_level. Similar story for
the second testcase. For the third testcase we hit the assert during
partial instantiation of the member template from instantiate_class_template
which similarly calls push_to_top_level.
These testcases illustrate that templated substitution into a lambda-expr
is not always possible, in particular when we lost the relevant template
context. I experimented with recovering the template context by making
tsubst_lambda_expr fall back to using scope_chain->prev->template_parms if
current_template_parms is empty which worked but seemed like a hack. I
also experimented with preserving the template context by keeping
current_template_parms set during instantiate_template for a dependent
specialization which also worked but it's at odds with the fact that we
cache dependent specializations (and so they should be independent of
the template context).
So instead of trying to make such substitution work, this patch uses the
extra-args mechanism to defer templated substitution into a lambda-expr
when we lost the relevant template context.
PR c++/114393
PR c++/107457
PR c++/93595
gcc/cp/ChangeLog:
* cp-tree.h (LAMBDA_EXPR_EXTRA_ARGS): Define.
(tree_lambda_expr::extra_args): New field.
* module.cc (trees_out::core_vals) <case LAMBDA_EXPR>: Stream
LAMBDA_EXPR_EXTRA_ARGS.
(trees_in::core_vals) <case LAMBDA_EXPR>: Likewise.
* pt.cc (has_extra_args_mechanism_p): Return true for LAMBDA_EXPR.
(tree_extra_args): Handle LAMBDA_EXPR.
(tsubst_lambda_expr): Use LAMBDA_EXPR_EXTRA_ARGS to defer templated
substitution into a lambda-expr if we lost the template context.
Add sanity check for error_mark_node result from begin_lambda_type.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/lambda-targ2.C: New test.
* g++.dg/cpp2a/lambda-targ3.C: New test.
* g++.dg/cpp2a/lambda-targ4.C: New test.
Reviewed-by: Jason Merrill <jason@redhat.com>
The fallback function (gf_vsnprintf) to provide a vsnprintf function
if the system library doesn't have one would not compile due to the
variable name for the string's destination buffer not being updated
after the refactor in 2018 in edaaef601d.
This updates the internal logic of gf_vsnprintf to now use the str
variable defined in the function signature.
libgfortran/ChangeLog:
2024-04-04 Ian McInerney <i.mcinerney17@imperial.ac.uk>
* runtime/error.c (gf_vsnprintf): Fix compilation
Signed-off-by: Ian McInerney <i.mcinerney17@imperial.ac.uk>
This patch would like to fix the Werror=sign-compare similar to below:
gcc/config/riscv/riscv.cc: In function ‘void
riscv_validate_vector_type(const_tree, const char*)’:
gcc/config/riscv/riscv.cc:5614:23: error: comparison of integer
expressions of different signedness: ‘int’ and ‘unsigned int’
[-Werror=sign-compare]
5614 | if (TARGET_MIN_VLEN < required_min_vlen)
The TARGET_MIN_VLEN is *int* by default but the required_min_vlen
returned from riscv_vector_required_min_vlen is **unsigned**. Thus,
adjust the related function and reference variable(s) to int type
to avoid such kind of Werror.
The below test suite is passed for this patch.
* The rv64gcv fully regression tests.
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_vector_float_type_p): Take int
as the return value instead of unsigned.
(riscv_vector_element_bitsize): Ditto.
(riscv_vector_required_min_vlen): Ditto.
(riscv_validate_vector_type): Take int type for local variable(s).
Signed-off-by: Pan Li <pan2.li@intel.com>
On s390 pr94688.c is failing due to excess error
pr94688.c:6:5: warning: allocated buffer size is not a multiple of the pointee's size [CWE-131] [-Wanalyzer-allocation-size]
This is because on s390 functions are by default aligned to an 8-byte
boundary and during function type construction size is set to function
boundary. Thus, for the assignment
a.0_1 = (void (*<T237>) ()) &a;
we have that the right-hand side is pointing to a 4-byte memory region
whereas the size of the function pointer is 8 byte and a warning is
emitted.
Since -Wanalyzer-allocation-size is not about pointers to code, bail out
early.
gcc/analyzer/ChangeLog:
* region-model.cc (region_model::check_region_size): Bail out
early on function pointers.
While translation of the verifier messages is questionable, that case is
something that ideally should never happen except to gcc developers
and so pressumably English should be fine, we use error etc. APIs and
those imply translatations and some translators translate it.
The following patch adjusts the code such that we don't emit
appel returns_twice est not first dans le bloc de base 33
in French (i.e. 2 English word in the middle of a French message).
Similarly Swedish or Ukrainian.
Note, the German translator did differentiate between these verifier
messages vs. normal user facing and translated it to:
"Interner Fehler: returns_twice call is %s in basic block %d"
so just a German prefix before English message.
2024-04-12 Jakub Jelinek <jakub@redhat.com>
* tree-cfg.cc (gimple_verify_flow_info): Make the misplaced
returns_twice diagnostics translatable.
The tree-cfg.cc verifier only diagnoses returns_twice calls preceded
by non-label/debug stmts if it is in a bb with abnormal predecessor.
The following testcase shows that if a user lies in the attributes
(a function which never returns can't be pure, and can't return
twice when it doesn't ever return at all), when we figure it out,
we can remove the abnormal edges to the "returns_twice" call and perhaps
whole .ABNORMAL_DISPATCHER etc.
edge_before_returns_twice_call then ICEs because it can't find such
an edge.
The following patch limits the special handling to calls in bbs where
the verifier requires that.
2024-04-12 Jakub Jelinek <jakub@redhat.com>
PR sanitizer/114687
* gimple-iterator.cc (gsi_safe_insert_before): Only use
edge_before_returns_twice_call if bb_has_abnormal_pred.
(gsi_safe_insert_seq_before): Likewise.
* gimple-lower-bitint.cc (bitint_large_huge::lower_call): Only
push to m_returns_twice_calls if bb_has_abnormal_pred.
* gcc.dg/asan/pr114687.c: New test.
contrib/check-params-in-docs.py is a script that checks that all options
reported with gcc --help=params are in gcc/doc/invoke.texi and vice
versa.
gcc/doc/invoke.texi lists target-specific params but gcc --help=params
doesn't. This meant that the script would mistakenly complain about
parms missing from --help=params. Previously, the script was just set
to ignore aarch64 and gcn params which solved this issue only for x86.
This patch sets the script to ignore all target-specific params.
contrib/ChangeLog:
* check-params-in-docs.py: Ignore target specific params.
Signed-off-by: Filip Kastl <fkastl@suse.cz>
Prevent loop unrolling of the innermost loop because otherwise we are
left with no loop interchange for targets like s390 which have a more
aggressive loop unrolling strategy.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/loop-interchange-16.c: Prevent loop unrolling
of the innermost loop.
This patch would like to fix one ICE when vector is not enabled
in hook TARGET_FUNCTION_VALUE_REGNO_P implementation. The vector
regno is available if and only if the TARGET_VECTOR is true. The
previous implement missed this condition and then result in ICE
when rv64gc build option without vector.
The below test suite is passed for this patch.
* The rv64gcv fully regression tests.
* The rv64gc fully regression tests.
PR target/114639
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_function_value_regno_p): Add
TARGET_VECTOR predicate for V_RETURN regno.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/pr114639-1.c: New test.
* gcc.target/riscv/pr114639-2.c: New test.
* gcc.target/riscv/pr114639-3.c: New test.
* gcc.target/riscv/pr114639-4.c: New test.
Signed-off-by: Pan Li <pan2.li@intel.com>
This patch fixes a small error that could occur in the debug comment
when emitting a type reference with btf_asm_type_ref.
While working on a previous patch, I noticed the following in the asm
output for the test btf-bitfields-4.c:
...
.long 0x39 # MEMBER 'c' idx=3
.long 0x6 # btm_type: (BTF_KIND_UNKN '')
...
.long 0x34 # TYPE 6 BTF_KIND_INT 'char'
The type for member 'c' is correct, but the comment for the member
incorrectly reads "BTF_KIND_UNKN ''". This was caused by an
incorrect type lookup in btf_asm_type_ref that could happen if the
source file has types which can be represented in CTF but not in BTF.
This patch fixes the issue by changing btf_asm_type_ref to work fully
in the CTF ID space until writing out the final BTF ID. That ensures
types are correctly identified when writing the asm debug comments,
like the following fixed comment for the above case.
...
.long 0x39 # MEMBER 'c' idx=3
.long 0x6 # btm_type: (BTF_KIND_INT 'char')
...
Note that there was no problem with the actual BTF information, the
only error was in the comment. This patch does not change the output
BTF information, and no tests were affected.
gcc/
* btfout.cc (btf_asm_type_ref): Convert IDs to BTF internally and
fix potentially looking up wrong type for asm debug comment info.
Split into...
(btf_asm_datasec_type_ref): ... This. New.
(btf_asm_datasec_entry): Call it here, instead of btf_asm_type_ref.
(btf_asm_type, btf_asm_array, btf_asm_varent, btf_asm_sou_member)
(btf_asm_func_arg, btf_asm_func_type): Adapt btf_asm_type_ref call.
This patch fixes an issue with mangled BTF that could occur when
a struct type contains a bitfield member which cannot be represented
in BTF. It is undefined what should happen in such cases, but we can
at least do something reasonable.
Commit
936dd627cd "btf: do not skip members of data type with type id
BTF_VOID_TYPEID"
made a similar change for un-representable non-bitfield members, but
had an unintended side-effect of mangling BTF for un-representable
bitfields: the struct (or union) would account for the offending
bitfield in its member count but the bitfield member itself was
not emitted, making the member count incorrect.
This change ensures that non-representable bitfield members of struct
and union types are always emitted with BTF_VOID_TYPEID. This avoids
corrupting the BTF information for the entire struct or union type.
gcc/
* btfout.cc (btf_asm_sou_member): Always emit non-representable
bitfield members as having 'void' type. Refactor slightly.
gcc/testsuite/
* gcc.dg/debug/btf/btf-bitfields-4.c: Add two new checks.
contrib/check-params-in-docs.py is a script that checks that all
options reported with ./gcc/xgcc -Bgcc --help=param are in
gcc/doc/invoke.texi and vice versa.
gcn-preferred-vectorization-factor is in the manual but normally not
reported by --help, probably because I do not have gcn offload
configured. This patch makes the script silently about this particular
fact.
contrib/ChangeLog:
2024-04-11 Martin Jambor <mjambor@suse.cz>
* check-params-in-docs.py (ignored): Add
gcn-preferred-vectorization-factor.
This patch fixes some testisms introduced by:
commit 5aa3fec38c
Author: Andre Vieira <andre.simoesdiasvieira@arm.com>
Date: Wed Apr 10 16:29:46 2024 +0100
aarch64: Add support for _BitInt
The testcases were relying on an unnecessary sign-extend that is no longer
generated.
The tested version was just slightly behind top of trunk when the patch
was committed, and the codegen had changed, for the better, by then.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/bitfield-bitint-abi-align16.c (g1, g8, g16, g1p, g8p,
g16p): Remove unnecessary sbfx.
* gcc.target/aarch64/bitfield-bitint-abi-align8.c (g1, g8, g16, g1p, g8p,
g16p): Likewise.
When we are already touching this topic, here is a patch like r13-5126
which documents the upcoming release symbol versions in the documentation.
2024-04-11 Jakub Jelinek <jakub@redhat.com>
* doc/xml/manual/abi.xml: Add latest library versions.
* doc/html/manual/abi.html: Regenerate.
While the previous patch was regeneration from 13.2 release (with hand
edits for arches I don't have libraries for but which are still well
maintained), thius one is regeneration from the trunk (this time for
hand edits everywhere for the PR114692
https://gcc.gnu.org/pipermail/libstdc++/2024-April/058570.html
patch; plus again hand edits for arches I don't have libraries for).
2024-04-11 Jakub Jelinek <jakub@redhat.com>
* config/abi/post/x86_64-linux-gnu/baseline_symbols.txt: Update.
* config/abi/post/x86_64-linux-gnu/32/baseline_symbols.txt: Update.
* config/abi/post/i486-linux-gnu/baseline_symbols.txt: Update.
* config/abi/post/m68k-linux-gnu/baseline_symbols.txt: Update.
* config/abi/post/aarch64-linux-gnu/baseline_symbols.txt: Update.
* config/abi/post/s390x-linux-gnu/baseline_symbols.txt: Update.
* config/abi/post/riscv64-linux-gnu/baseline_symbols.txt: Update.
* config/abi/post/powerpc64le-linux-gnu/baseline_symbols.txt: Update.
* config/abi/post/powerpc64-linux-gnu/baseline_symbols.txt: Update.
Some architecture features have been combined under a single command
line flag, but have been assigned multiple FMV feature names with the
command line flag name enabling only a subset of these features in
the FMV specification. I've proposed reallocating names in the FMV
specification to match the command line flags [1], but for GCC 14 we'll
just remove them from the FMV feature list.
[1] https://github.com/ARM-software/acle/pull/315
gcc/ChangeLog:
* config/aarch64/aarch64-option-extensions.def:
Remove "memtag", "memtag2", "ssbs", "ssbs2", "ls64", "ls64_v"
and "ls64_accdata" FMV features.
It currently isn't possible to support function multiversioning features
properly in GCC without also enabling the extension in the command line
options (with the exception of features such as "rpres" that do not
require assembler support). We therefore remove unsupported features
from GCC's list of FMV features.
Some of these features ("fcma", "jscvt", "frintts", "flagm2", "wfxt",
"rcpc2", and perhaps "dpb" and "dpb2") will be added back in the future
once support for the command line option has been added.
The rest of the removed features I have proposed removing from the ACLE
specification as well, since it doesn't seem worthwhile to include support
for them; see the ACLE pull request for more detailed justification:
https://github.com/ARM-software/acle/pull/315
gcc/ChangeLog:
* config/aarch64/aarch64-option-extensions.def:
Remove "flagm2", "sha1", "pmull", "dit", "dpb", "dpb2", "jscvt",
"fcma", "rcpc2", "frintts", "dgh", "ebf16", "sve-bf16",
"sve-ebf16", "sve-i8mm", "sve2-pmull128", "memtag3", "bti" and
"wfxt" entries.
There was an assumption in some places that the aarch64_fmv_feature_data
array contained FEAT_MAX elements. While this assumption held up till
now, it is safer and more flexible to use the array size directly.
Also fix the lower bound in compare_feature_masks to use ">=0" instead
of ">0", and add a test using the features at index 0 and 1. However,
the test already passed, because the earlier popcount check makes it
impossible to reach the loop if the masks differ in exactly one
location.
gcc/ChangeLog:
* config/aarch64/aarch64.cc (compare_feature_masks):
Use ARRAY_SIZE and >=0 for iteration bounds.
(aarch64_mangle_decl_assembler_name): Use ARRAY_SIZE.
gcc/testsuite/ChangeLog:
* g++.target/aarch64/mv-1.C: New test.
Some higher priority FMV features were dependent subsets of lower
priority features. Fix this, using the new priorities specified in
https://github.com/ARM-software/acle/pull/279.
gcc/ChangeLog:
* config/aarch64/aarch64-option-extensions.def: Reorder FMV entries.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/cpunative/native_cpu_21.c: Reorder features.
* gcc.target/aarch64/cpunative/native_cpu_22.c: Ditto.
I added this new symbol in the wrong version. GLIBCXX_3.4.32 was
already used for the GCC 13.2.0 release, so the new symbol should have
been in a new GLIBCXX_3.4.33 version.
Additionally, the pattern doesn't need to use [cw] because we only ever
use __basic_file<char>, even for std::basic_filebuf<wchar_t>.
libstdc++-v3/ChangeLog:
PR libstdc++/114692
* config/abi/pre/gnu.ver (GLIBCXX_3.4.32): Move new exports for
__basic_file::native_handle to ...
(GLIBCXX_3.4.33): ... here. Adjust to not match wchar_t
specialization, which isn't used.
* testsuite/util/testsuite_abi.cc: Add GLIBCXX_3.4.33 and update
latest version check.
r13-6452-g341e6cd8d603a3 made build_extra_args walk evaluated contexts
first so that we prefer processing a local specialization in an evaluated
context even if its first use is in an unevaluated context. But this
means we need to avoid walking a tree that already has extra args/specs
saved because the list of saved specs appears to be an evaluated
context which we'll now walk first. It seems then that we should be
calculating the saved specs from scratch each time, rather than
potentially walking the saved specs list from an earlier partial
instantiation when calling build_extra_args a second time around.
PR c++/114303
gcc/cp/ChangeLog:
* constraint.cc (tsubst_requires_expr): Clear
REQUIRES_EXPR_EXTRA_ARGS before calling build_extra_args.
* pt.cc (tree_extra_args): Define.
(extract_locals_r): Assert *_EXTRA_ARGS is empty.
(tsubst_stmt) <case IF_STMT>: Clear IF_SCOPE on the new
IF_STMT. Call build_extra_args on the new IF_STMT instead
of t which might already have IF_STMT_EXTRA_ARGS.
gcc/testsuite/ChangeLog:
* g++.dg/cpp1z/constexpr-if-lambda6.C: New test.
Reviewed-by: Jason Merrill <jason@redhat.com>
This patch introduces a small modula-2 language section to the
Language Standards Supported by GCC node.
gcc/ChangeLog:
* doc/standards.texi (Language Standards Supported by GCC):
Add Modula-2 language section.
Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
The following patch regenerates the ABI files for 13 branch (I've only changed
the Linux files which were updated in r13-7289, all but m68k, riscv64 and
powerpc64 are from actual Fedora 39 gcc builds, the rest hand edited).
We've added one symbol very early in the 13.2 cycle, but then added 2
further ones very soon afterwards, quite a long time before 13.2 release
and haven't regenerated. The patch applies cleanly to trunk as well.
2024-04-11 Jakub Jelinek <jakub@redhat.com>
* config/abi/post/x86_64-linux-gnu/baseline_symbols.txt: Update.
* config/abi/post/x86_64-linux-gnu/32/baseline_symbols.txt: Update.
* config/abi/post/i486-linux-gnu/baseline_symbols.txt: Update.
* config/abi/post/m68k-linux-gnu/baseline_symbols.txt: Update.
* config/abi/post/aarch64-linux-gnu/baseline_symbols.txt: Update.
* config/abi/post/s390x-linux-gnu/baseline_symbols.txt: Update.
* config/abi/post/riscv64-linux-gnu/baseline_symbols.txt: Update.
* config/abi/post/powerpc64le-linux-gnu/baseline_symbols.txt: Update.
* config/abi/post/powerpc64-linux-gnu/baseline_symbols.txt: Update.
On Tue, Mar 26, 2024 at 02:08:02PM +0800, liuhongt wrote:
> > > So, try to add some other variable with larger size and smaller alignment
> > > to the frame (and make sure it isn't optimized away).
> > >
> > > alignb above is the alignment of the first partition's var, if
> > > align_frame_offset really needs to depend on the var alignment, it probably
> > > should be the maximum alignment of all the vars with alignment
> > > alignb * BITS_PER_UNIT <=3D MAX_SUPPORTED_STACK_ALIGNMENT
> > >
>
> In asan_emit_stack_protection, when it allocated fake stack, it assume
> bottom of stack is also aligned to alignb. And the place violated this
> is the first var partition. which is 32 bytes offsets, it should be
> BIGGEST_ALIGNMENT / BITS_PER_UNIT.
> So I think we need to use MAX (BIGGEST_ALIGNMENT /
> BITS_PER_UNIT, ASAN_RED_ZONE_SIZE) for the first var partition.
Your first patch aligned offsets[0] to maximum of alignb and
ASAN_RED_ZONE_SIZE. But as I wrote in the reply to that mail, alignb there
is the alignment of just a single variable which is the first one to appear
in the sorted list and is placed in the highest spot in the stack frame.
That is not necessarily the largest alignment, the sorting ensures that it
is a variable with the largest size in the frame (and only if several of
them have equal size, largest alignment from the same sized ones). Your
second patch used maximum of BIGGEST_ALIGNMENT / BITS_PER_UNIT and
ASAN_RED_ZONE_SIZE. That doesn't change anything at all when using
-mno-avx512f - offsets[0] is still just 32-byte aligned in that case
relative to top of frame, just changes the -mavx512f case to be 64-byte
aligned offsets[0] (aka offsets[0] is then either 0 or -64 instead of either
0 or -32). That will not help if any variable in the frame needs 128-byte,
256-byte, 512-byte ... 4096-byte alignment. If you want to fix the bug in
the spot you've touched, you'd need to walk all the
stack_vars[stack_vars_sorted[si2]] for si2 [si + 1, n - 1] and for those
where the loop would do anything (i.e.
stack_vars[i2].representative == i2
&& TREE_CODE (decl2) == SSA_NAME
? SA.partition_to_pseudo[var_to_partition (SA.map, decl2)] == NULL_RTX
: DECL_RTL (decl2) == pc_rtx
and the pred applies (but that means also walking the earlier ones!
because with -fstack-protector* the vars can be processed in several calls) and
alignb2 * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT
and compute maximum of those alignments.
That maximum is already computed,
data->asan_alignb = MAX (data->asan_alignb, alignb);
computes that, but you get the final result only after you do all the
expand_stack_vars calls. You'd need to compute it before.
Though, that change would be still in the wrong place.
The thing is, it would be a waste of the precious stack space when it isn't
needed at all (e.g. when asan will not at compile time do the use after
return checking, or if it won't do it at runtime, or even if it will do at
runtime it will waste the space on the stack).
The following patch fixes it solely for the __asan_stack_malloc_N
allocations, doesn't enlarge unnecessarily further the actual stack frame.
Because asan is only supported on FRAME_GROWS_DOWNWARD architectures
(mips, rs6000 and xtensa are conditional FRAME_GROWS_DOWNWARD arches, which
for -fsanitize=address or -fstack-protector* use FRAME_GROWS_DOWNWARD 1,
otherwise 0, others supporting asan always just use 1), the assumption for
the dynamic stack realignment is that the top of the stack frame (aka offset
0) is aligned to alignb passed to the function (which is the maximum of alignb
of all the vars in the frame). As checked by the assertion in the patch,
offsets[0] is 0 most of the time and so that assumption is correct, the only
case when it is not 0 is if -fstack-protector* is on together with
-fsanitize=address and cfgexpand.cc (create_stack_guard) created a stack
guard. That is the only variable which is allocated in the stack frame
right away, for all others with -fsanitize=address defer_stack_allocation
(or -fstack-protector*) returns true and so they aren't allocated
immediately but handled during the frame layout phases. So, the original
frame_offset of 0 is changed because of the stack guard to
-pointer_size_in_bytes and later at the
if (data->asan_vec.is_empty ())
{
align_frame_offset (ASAN_RED_ZONE_SIZE);
prev_offset = frame_offset.to_constant ();
}
to -ASAN_RED_ZONE_SIZE. The asan_emit_stack_protection code wasn't
taking this into account though, so essentially assumed in the
__asan_stack_malloc_N allocated memory it needs to align it such that
pointer corresponding to offsets[0] is alignb aligned. But that isn't
correct if alignb > ASAN_RED_ZONE_SIZE, in that case it needs to ensure that
pointer corresponding to frame offset 0 is alignb aligned.
The following patch fixes that. Unlike the previous case where
we knew that asan_frame_size + base_align_bias falls into the same bucket
as asan_frame_size, this isn't in some cases true anymore, so the patch
recomputes which bucket to use and if going to bucket 11 (because there is
no __asan_stack_malloc_11 function in the library) disables the after return
sanitization.
2024-04-11 Jakub Jelinek <jakub@redhat.com>
PR middle-end/110027
* asan.cc (asan_emit_stack_protection): Assert offsets[0] is
zero if there is no stack protect guard, otherwise
-ASAN_RED_ZONE_SIZE. If alignb > ASAN_RED_ZONE_SIZE and there is
stack pointer guard, take the ASAN_RED_ZONE_SIZE bytes allocated at
the top of the stack into account when computing base_align_bias.
Recompute use_after_return_class from asan_frame_size + base_align_bias
and set to -1 if that would overflow to 11.
* gcc.dg/asan/pr110027.c: New test.
The following fixes an omission in r14-162-gcda246f8b421ba causing
wrong-debug and a bunch of guality regressions.
PR tree-optimization/109596
* tree-ssa-loop-ch.cc (ch_base::copy_headers): Propagate
debug stmts to nonexit->dest rather than exit->dest.
When inlining a gcond it can map to multiple stmts, esp. with
non-call EH. The following makes sure to pick up the remapped
condition when dealing with condition coverage.
PR middle-end/114681
* tree-inline.cc (copy_bb): Key on the remapped stmt
to identify gconds to have condition coverage data remapped.
* gcc.misc-tests/gcov-pr114681.c: New testcase.
The following testcase ICEs starting with the r14-4229 PR111529
change which moved ANNOTATE_EXPR handling from tsubst_expr to
tsubst_copy_and_build.
ANNOTATE_EXPR is only allowed in the IL to wrap a loop condition,
and the loop condition of while/for loops can be a COMPOUND_EXPR
with DECL_EXPR in the first operand and the corresponding VAR_DECL
in the second, as created by finish_cond
else if (!empty_expr_stmt_p (cond))
expr = build2 (COMPOUND_EXPR, TREE_TYPE (expr), cond, expr);
Since then Patrick reworked the instantiation, so that we have now
tsubst_stmt and tsubst_expr and ANNOTATE_EXPR ended up in the latter,
while only tsubst_stmt can handle DECL_EXPR.
Now, the reason why the while/for loops with variable declaration
in the condition works in templates without the pragmas (i.e. without
ANNOTATE_EXPR) is that both the FOR_STMT and WHILE_STMT handling uses
RECUR aka tsubst_stmt in handling of the *_COND operand:
case FOR_STMT:
stmt = begin_for_stmt (NULL_TREE, NULL_TREE);
RECUR (FOR_INIT_STMT (t));
finish_init_stmt (stmt);
tmp = RECUR (FOR_COND (t));
finish_for_cond (tmp, stmt, false, 0, false);
and
case WHILE_STMT:
stmt = begin_while_stmt ();
tmp = RECUR (WHILE_COND (t));
finish_while_stmt_cond (tmp, stmt, false, 0, false);
Therefore, it will handle DECL_EXPR embedded in COMPOUND_EXPR of the
{WHILE,FOR}_COND just fine.
But if that COMPOUND_EXPR with DECL_EXPR is wrapped with one or more
ANNOTATE_EXPRs, because ANNOTATE_EXPR is now done solely in tsubst_expr
and uses RECUR there (i.e. tsubst_expr), it will ICE on DECL_EXPR in there.
This could be fixed by keeping ANNOTATE_EXPR handling in tsubst_expr but
using tsubst_stmt for the first operand, but this patch instead
moves ANNOTATE_EXPR handling to tsubst_stmt (and uses tsubst_expr for the
second/third operand).
2024-04-11 Jakub Jelinek <jakub@redhat.com>
PR c++/114409
* pt.cc (tsubst_expr) <case ANNOTATE_EXPR>: Move to ...
(tsubst_stmt) <case ANNOTATE_EXPR>: ... here. Use tsubst_expr
instead of RECUR for the last 2 arguments.
* g++.dg/ext/pr114409-2.C: New test.
This patch would like to fix a ICE in mode sw for below example code.
during RTL pass: mode_sw
test.c: In function ‘vbool16_t j(vuint64m4_t)’:
test.c:15:1: internal compiler error: in create_pre_exit, at
mode-switching.cc:451
15 | }
| ^
0x3978f12 create_pre_exit
__RISCV_BUILD__/../gcc/mode-switching.cc:451
0x3979e9e optimize_mode_switching
__RISCV_BUILD__/../gcc/mode-switching.cc:849
0x397b9bc execute
__RISCV_BUILD__/../gcc/mode-switching.cc:1324
extern size_t get_vl ();
vbool16_t
test (vuint64m4_t a)
{
unsigned long b;
return __riscv_vmsne_vx_u64m4_b16 (a, b, get_vl ());
}
The create_pre_exit would like to find a return value copy. If
not, there will be a reason in assert but not available for above
sample code when vector calling convension is enabled by default.
This patch would like to override the TARGET_FUNCTION_VALUE_REGNO_P
for vector register and then we will have hard_regno_nregs for copy_num,
aka there is a return value copy.
As a side-effect of allow vector in TARGET_FUNCTION_VALUE_REGNO_P, the
TARGET_GET_RAW_RESULT_MODE will have vector mode and which is sizeless
cannot be converted to fixed_size_mode. Thus override the hook
TARGET_GET_RAW_RESULT_MODE and return VOIDmode when the regno is-not-a
fixed_size_mode.
The below tests are passed for this patch.
* The fully riscv regression tests.
* The reproducing test in bugzilla PR114639.
PR target/114639
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_function_value_regno_p): New func
impl for hook TARGET_FUNCTION_VALUE_REGNO_P.
(riscv_get_raw_result_mode): New func imple for hook
TARGET_GET_RAW_RESULT_MODE.
(TARGET_FUNCTION_VALUE_REGNO_P): Impl the hook.
(TARGET_GET_RAW_RESULT_MODE): Ditto.
* config/riscv/riscv.h (V_RETURN): New macro for vector return.
(GP_RETURN_FIRST): New macro for the first GPR in return.
(GP_RETURN_LAST): New macro for the last GPR in return.
(FP_RETURN_FIRST): Diito but for FPR.
(FP_RETURN_LAST): Ditto.
(FUNCTION_VALUE_REGNO_P): Remove as deprecated and replace by
TARGET_FUNCTION_VALUE_REGNO_P.
gcc/testsuite/ChangeLog:
* g++.target/riscv/rvv/base/pr114639-1.C: New test.
* gcc.target/riscv/rvv/base/pr114639-1.c: New test.
Signed-off-by: Pan Li <pan2.li@intel.com>
The previous fix in gen_ctf_sou_type () exposes an issue in BTF
generation, however: BTF emission was currently decrementing the vlen
(indicating the number of members) to skip members of type CTF_K_UNKNOWN
altogether, but still emitting the BTF for the corresponding member (in
output_asm_btf_sou_fields ()).
One can see malformed BTF by executing the newly added CTF testcase
(gcc.dg/debug/ctf/ctf-bitfields-5.c) with -gbtf instead or even existing
btf-struct-2.c without this patch.
To fix the issue, it makes sense to rather _not_ skip members of data
type of type id BTF_VOID_TYPEID.
gcc/ChangeLog:
* btfout.cc (btf_asm_type): Do not skip emitting members of
unknown type.
gcc/testsuite/ChangeLog:
* gcc.dg/debug/btf/btf-bitfields-4.c: Update the vlen check.
* gcc.dg/debug/btf/btf-struct-2.c: Check that member named 'f'
with void data type is emitted.
PR debug/112878: ICE: in ctf_add_slice, at ctfc.cc:499 with
_BitInt > 255 in a struct and -gctf1
The CTF generation in GCC does not have a mechanism to roll-back an
already added type. In this testcase presented in the PR, we hit a
representation limit in CTF slices (for a member of a struct) and ICE,
after the type for struct (CTF_K_STRUCT) has already been added to the
container.
To exit gracefully instead, we now check for both the offset and size of
the bitfield to be explicitly <= 255. If the check fails, we emit the
member with type CTF_K_UNKNOWN. Note that, the value 255 stems from the
existing binutils libctf checks which were motivated to guard against
malformed inputs.
Although it is not accurate to say that this is a CTF representation
limit, mark the code with TBD_CTF_REPRESENTATION_LIMIT for now so that
this can be taken care of with the next format version bump, when
libctf's checks for the slice data can be lifted as well.
gcc/ChangeLog:
PR debug/112878
* dwarf2ctf.cc (gen_ctf_sou_type): Check for conditions before
call to ctf_add_slice. Use CTF_K_UNKNOWN type if fail.
gcc/testsuite/ChangeLog:
PR debug/112878
* gcc.dg/debug/ctf/ctf-bitfields-5.c: New test.