The addition of PLUS/MINUS/MULT/RDIV_EXPR frange handlers causes
miscompilation of some of the libm routines, resulting in lots of
glibc test failures. A part of them is purely PR107608 fold-overflow-1.c
etc. issues, say when the code does
return -0.5 / 0.0;
and expects division by zero to be emitted, but we propagate -Inf
and avoid the operation.
But there are also various tests where we end up with different computed
value from the expected ones. All those cases are like:
is: inf inf
should be: 1.18973149535723176502e+4932 0xf.fffffffffffffff0p+16380
is: inf inf
should be: 1.18973149535723176508575932662800701e+4932 0x1.ffffffffffffffffffffffffffffp+16383
is: inf inf
should be: 1.7976931348623157e+308 0x1.fffffffffffffp+1023
is: inf inf
should be: 3.40282346e+38 0x1.fffffep+127
and the corresponding source looks like:
static const double huge = 1.0e+300;
double whatever (...) {
...
return huge * huge;
...
}
which for rounding to nearest or +inf should and does return +inf, but
for rounding to -inf or 0 should instead return nextafter (inf, -inf);
The rules IEEE754 has are that operations on +-Inf operands are exact
and produce +-Inf (except for the invalid ones that produce NaN) regardless
of rounding mode, while overflows:
"a) roundTiesToEven and roundTiesToAway carry all overflows to ∞ with the
sign of the intermediate result.
b) roundTowardZero carries all overflows to the format’s largest finite
number with the sign of the intermediate result.
c) roundTowardNegative carries positive overflows to the format’s largest
finite number, and carries negative overflows to −∞.
d) roundTowardPositive carries negative overflows to the format’s most
negative finite number, and carries positive overflows to +∞."
The behavior around overflows to -Inf or nextafter (-inf, inf) was actually
handled correctly, we'd construct [-INF, -MAX] ranges in those cases
because !real_less (&value, &result) in that case - value is finite
but larger in magnitude than what the format can represent (but GCC
internal's format can), while result is -INF in that case.
But for the overflows to +Inf or nextafter (inf, -inf) was handled
incorrectly, it tested real_less (&result, &value) rather than
!real_less (&result, &value), the former test is true when already the
rounding value -> result rounded down and in that case we shouldn't
round again, we should round down when it didn't.
So, in theory this could be fixed just by adding one ! character,
- if ((mode_composite || (real_isneg (&inf) ? real_less (&result, &value)
+ if ((mode_composite || (real_isneg (&inf) ? !real_less (&result, &value)
: !real_less (&value, &result)))
but the following patch goes further. The distance between
nextafter (inf, -inf) and inf is large (infinite) and expressions like
1.0e+300 * 1.0e+300 always produce +inf in round to nearest mode by far,
so I think having low bound of nextafter (inf, -inf) in that case is
unnecessary. But if it isn't multiplication but say addition and we are
inexact and very close to the boundary between rounding to nearest
maximum representable vs. rounding to nearest +inf, still using [MAX, +INF]
etc. ranges seems safer because we don't know exactly what we lost in the
inexact computation.
The following patch implements that.
2022-12-08 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/107967
* range-op-float.cc (frange_arithmetic): Fix a thinko - if
inf is negative, use nextafter if !real_less (&result, &value)
rather than if real_less (&result, &value). If result is +-INF
while value is finite and -fno-rounding-math, don't do rounding
if !inexact or if result is significantly above max representable
value or below min representable value.
* gcc.dg/pr107967-1.c: New test.
* gcc.dg/pr107967-2.c: New test.
* gcc.dg/pr107967-3.c: New test.
The constraints on auto in C2x disallow use with other storage-class
specifiers unless the type is inferred from an initializer. That
includes constexpr; add the missing checks for this case (the
combination of auto, constexpr and a type specifier).
Bootstrapped with no regressions for x86_64-pc-linux-gnu.
gcc/c/
* c-decl.cc (declspecs_add_type, declspecs_add_scspec): Check for
auto, constexpr and a type used together.
gcc/testsuite/
* gcc.dg/c2x-constexpr-1.c: Do not use auto, constexpr and a type
together.
* gcc.dg/c2x-constexpr-3.c: Add tests of auto, constexpr and type
used together.
Also pass threaded=1 to __glibcxx_backtrace_create_state and remove some
of the namespace scope declarations in the header.
Co-authored-by: François Dumont <frs.dumont@gmail.com>
libstdc++-v3/ChangeLog:
* include/debug/formatter.h [_GLIBCXX_DEBUG_BACKTRACE]
(_Error_formatter::_Error_formatter): Pass error handler to
__glibcxx_backtrace_create_state. Pass 1 for threaded argument.
(_Error_formatter::_S_err): Define empty function.
* src/c++11/debug.cc (_Error_formatter::_M_error): Pass error
handler to __glibcxx_backtrace_full.
Add a test for the case of auto with implicit int in C90 mode, which
is incompatible with C2x semantics (I missed adding such a test when
implementing C2x auto).
Tested for x86_64-pc-linux-gnu.
* gcc.dg/c90-auto-1.c: New test.
C2x supports __VA_OPT__, so adjust libcpp not to pedwarn for uses of
it (or of not passing any variable arguments to a variable-arguments
macro) in standard C2x mode.
I didn't try to duplicate existing tests for the details of the
feature, just verified -pedantic-errors handling is as expected. And
there's a reasonable argument (bug 98859) that __VA_OPT__ shouldn't be
diagnosed in older standard modes at all (as opposed to not passing
any variable arguments to a variable-arguments macro, for which older
versions of the C standard require a diagnostic as a constraint
violation); that argument applies to C as much as to C++, but I
haven't made any changes in that regard.
Bootstrapped with no regressions for x86_64-pc-linux-gnu.
libcpp/
* init.cc (lang_defaults): Enable va_opt for STDC2X.
* lex.cc (maybe_va_opt_error): Adjust diagnostic message for C.
* macro.cc (_cpp_arguments_ok): Update comment.
gcc/testsuite/
* gcc.dg/cpp/c11-vararg-1.c, gcc.dg/cpp/c2x-va-opt-1.c: New tests.
Now that gcc provides __XCHAL_* definitions use them instead of XCHAL_*
definitions from the include/xtensa-config.h. That makes libgcc
dynamically configurable for the target xtensa core.
libgcc/
* config/xtensa/crti.S (xtensa-config.h): Replace #inlcude with
xtensa-config-builtin.h.
* config/xtensa/crtn.S: Likewise.
* config/xtensa/lib1funcs.S: Likewise.
* config/xtensa/lib2funcs.S: Likewise.
* config/xtensa/xtensa-config-builtin.h: New File.
Import include/xtensa-dynconfig.h that defines XCHAL_* macros as fields
of a structure returned from the xtensa_get_config_v<x> function call.
Define that structure and fill it with default parameter values
specified in the include/xtensa-config.h.
Define reusable function xtensa_load_config that tries to load
configuration and return an address of an exported object from it.
Define the function xtensa_get_config_v1 that uses xtensa_load_config
to get structure xtensa_config_v1, either dynamically configured or the
default.
Provide essential XCHAL_* configuration parameters as __XCHAL_* built-in
macros. This way it will be possible to use them in libgcc and libc
without need to patch libgcc or libc source for the specific xtensa core
configuration.
gcc/
* config.gcc (xtensa*-*-*): Add xtensa-dynconfig.o to extra_objs.
* config/xtensa/t-xtensa (TM_H): Add xtensa-dynconfig.h.
(xtensa-dynconfig.o): New rule.
* config/xtensa/xtensa-dynconfig.c: New file.
* config/xtensa/xtensa-protos.h (xtensa_get_config_strings): New
declaration.
* config/xtensa/xtensa.h (xtensa-config.h): Replace #include
with xtensa-dynconfig.h
(XCHAL_HAVE_MUL32_HIGH, XCHAL_HAVE_RELEASE_SYNC)
(XCHAL_HAVE_S32C1I, XCHAL_HAVE_THREADPTR)
(XCHAL_HAVE_FP_POSTINC): Drop definitions.
(TARGET_DIV32): Replace with __XCHAL_HAVE_DIV32.
(TARGET_CPU_CPP_BUILTINS): Add new 'builtin' variable and loop
through string array returned by the xtensa_get_config_strings
function call.
include/
* xtensa-dynconfig.h: New file.
Ensure we only pass SI/DImode which fixes the assert.
gcc/
PR target/108006
* config/aarch64/aarch64.cc (aarch64_expand_sve_const_vector):
Fix call to aarch64_move_imm to use SI/DI.
If we are building PIC/PIE host executables, and we are building dependent
libs (e.g. GMP) in-tree those libs need to be configured to generate PIC code.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
ChangeLog:
* Makefile.def: Pass host_libs_picflag to host dependent library
configures.
* Makefile.in: Regenerate.
* configure: Regenerate.
* configure.ac (host_libs_picflag): New configure variable set to
'--with-pic' when building 'host_shared'.
When a function is declared const (even though it technically
accesses memory), ipa-modref discovering pureness shouldn't end
up suggesting that attribute. The following thus exempts
'const' functions from ipa_make_function_pure handling.
PR ipa/105676
* ipa-pure-const.cc (ipa_make_function_pure): Skip also
for functions already being const.
* gcc.dg/pr105676.c: New testcase.
For Alderlake there is similar issue like PR 81616, enable
avoid_fma256_chain will also benefit on Intel latest platforms
Alderlake and Sapphire Rapids.
gcc/ChangeLog:
* config/i386/x86-tune.def (X86_TUNE_AVOID_256FMA_CHAINS): Add
m_SAPPHIRERAPIDS, m_ALDERLAKE and m_CORE_ATOM.
gcc/ChangeLog:
PR target/107920
* config/aarch64/aarch64-sve-builtins-base.cc: Use
gsi_replace_with_seq_vops to handle virtual operands, and gate
the transform on !flag_non_call_exceptions.
* gimple-fold.cc (gsi_replace_with_seq_vops): Make function non static.
* gimple-fold.h (gsi_replace_with_seq_vops): Declare.
gcc/testsuite/ChangeLog:
PR target/107920
* gcc.target/aarch64/sve/acle/general/pr107920.c: New test.
* g++.target/aarch64/sve/pr107920.C: Likewise.
PR analyzer/107882 reports an ICE, due to trying to get a compound svalue
for this binding:
cluster for: a:
key: {bytes 0-3}
value: {UNKNOWN()}
key: {empty}
value: {UNKNOWN()}
key: {bytes 4-7}
value: {UNKNOWN()}
where there's an binding to the unknown value of zero bits in size
"somewhere" within "a" (perhaps between bits 3 and 4?)
This makes no sense, so this patch adds an assertion that we never
attempt to create a binding key for an empty region, and adds early
rejection of attempts to get or set the values of such regions, fixing
the ICE.
gcc/analyzer/ChangeLog:
PR analyzer/107882
* region-model.cc (region_model::get_store_value): Return an
unknown value for empty regions.
(region_model::set_value): Bail on empty regions.
* region.cc (region::empty_p): New.
* region.h (region::empty_p): New decl.
* state-purge.cc (same_binding_p): Bail if either region is empty.
* store.cc (binding_key::make): Assert that a concrete binding's
bit_size must be > 0.
(binding_cluster::mark_region_as_unknown): Bail on empty regions.
(binding_cluster::get_binding): Likewise.
(binding_cluster::remove_overlapping_bindings): Likewise.
(binding_cluster::on_unknown_fncall): Don't conjure values for
empty regions.
(store::fill_region): Bail on empty regions.
* store.h (class concrete_binding): Update comment to reflect that
the range of bits must be non-empty.
(concrete_binding::concrete_binding): Assert that bit range is
non-empty.
gcc/testsuite/ChangeLog:
PR analyzer/107882
* gcc.dg/analyzer/memcpy-pr107882.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
This test was recently modified to check that the library doesn't use
__unused anywhere, because that's a macro in newlib. But it's also a
macro in old versions of glibc, so the test now fails for those targets.
Disable that check for those targets as well as for newlib.
libstdc++-v3/ChangeLog:
PR libstdc++/107979
* testsuite/17_intro/names.cc: Do not check __unused on old
Glibc versions.
This uses a single byte for the minutes and seconds members, and places
the bool member next to those single bytes. This means we do not need 40
bytes to store a time that can fit in a single 8-byte integer.
When there is no subsecond precision we can do away with the _M_ss
member altogether. If the subsecond precision is coarse enough, we can
use a smaller representation for _M_ss, e.g. hh_mm_ss<milliseconds> only
needs uint_least32_t for _M_ss, and hh_mm_ss<duration<long, ratio<1,10>>
and hh_mm_ss<duration<int8_t, nano>> only need a single byte. In the
latter case the type can only ever represent up to 255ns anyway, so we
don't need a larger representation type (in such cases, we could even
remove the _M_h, _M_m and _M_s members, but it's a very unlikely
scenario that isn't worth optimizing for).
Except for specializations with a floating-point rep or using higher
precision than nanoseconds, hh_mm_ss should now fit in 16 bytes, or even
12 bytes for x86-32 where alignof(long long) == 4.
libstdc++-v3/ChangeLog:
* include/std/chrono (chrono::hh_mm_ss): Do not use 64-bit
representations for all four duration members. Reorder members.
(hh_mm_ss::hh_mm_ss()): Define as defaulted.
(hh_mm_ss::hh_mm_ss(Duration)): Delegate to a new private
constructor, instead of calling chrono::abs repeatedly.
* testsuite/std/time/hh_mm_ss/1.cc: Check floating-point
representations. Check default constructor. Check sizes.
The PR shows a bogus warning where jump threading generates code for the
undefined case that the insertion point is a value-initialized iterator
but _M_finish and _M_end_of_storage are unequal (so at least one must be
non-null). Using __builtin_unreachable() removes the bogus warning. Also
add an assertion to diagnose undefined misuses of a null iterator here,
so we don't just silently optimize that undefined code to something
unsafe.
libstdc++-v3/ChangeLog:
PR c++/106434
* include/bits/vector.tcc (insert(const_iterator, const T&)):
Add assertion and optimization hint that the iterator for the
insertion point must be non-null.
Fix digit grouping for integers formatted with "{:#Lx}" which were
including the "0x" prefix in the grouped digits. This resulted in output
like "0,xff,fff" instead of "0xff,fff".
Also change std:::basic_format_parse_context to not throw for an arg-id
that is larger than the actual number of format arguments. I clarified
with Victor Zverovich that this is the intended behaviour for the
run-time format-string checks. An out-of-range arg-id should be
diagnosed at compile-time (as clarified by LWG 3825) but not run-time.
The formatting function will still throw at run-time when args.arg(id)
returns an empty basic_format_arg.
libstdc++-v3/ChangeLog:
* include/std/format (basic_format_parse_context::next_arg_id):
Only check arg-id is in range during constant evaluation.
* testsuite/std/format/functions/format.cc: Check "{:#Lx}".
* testsuite/std/format/parse_ctx.cc: Adjust expected results for
format-strings using an out-of-range arg-id.
Simplify, refactor and improve various move immediate functions.
Allow 32-bit MOVN/I as a valid 64-bit immediate which removes special
cases in aarch64_internal_mov_immediate. Add new constraint so the movdi
pattern only needs a single alternative for move immediate.
gcc/
* config/aarch64/aarch64.cc (aarch64_bitmask_imm): Use unsigned type.
(aarch64_is_mov_xn_imm): New function.
(aarch64_move_imm): Refactor, assert mode is SImode or DImode.
(aarch64_internal_mov_immediate): Assert mode is SImode or DImode.
Simplify special cases.
(aarch64_uimm12_shift): Simplify code.
(aarch64_clamp_to_uimm12_shift): Likewise.
(aarch64_movw_imm): Rename to aarch64_is_movz.
(aarch64_float_const_rtx_p): Pass either SImode or DImode to
aarch64_internal_mov_immediate.
(aarch64_rtx_costs): Likewise.
* config/aarch64/aarch64.md (movdi_aarch64): Merge 'N' and 'M'
constraints into single 'O'.
(mov<mode>_aarch64): Likewise.
* config/aarch64/aarch64-protos.h (aarch64_move_imm): Use unsigned.
(aarch64_bitmask_imm): Likewise.
(aarch64_uimm12_shift): Likewise.
(aarch64_is_mov_xn_imm): New prototype.
* config/aarch64/constraints.md: Add 'O' for 32/64-bit immediates,
limit 'N' to 64-bit only moves.
A. add the following to clarify the relationship between -Warray-bounds
and the LEVEL of -fstrict-flex-array:
By default, the trailing array of a structure will be treated as a
flexible array member by '-Warray-bounds' or '-Warray-bounds=N' if
it is declared as either a flexible array member per C99 standard
onwards ('[]'), a GCC zero-length array extension ('[0]'), or an
one-element array ('[1]'). As a result, out of bounds subscripts
or offsets into zero-length arrays or one-element arrays are not
warned by default.
You can add the option '-fstrict-flex-arrays' or
'-fstrict-flex-arrays=LEVEL' to control how this option treat
trailing array of a structure as a flexible array member.
when LEVEL<=1, no change to the default behavior.
when LEVEL=2, additional warnings will be issued for out of bounds
subscripts or offsets into one-element arrays;
when LEVEL=3, in addition to LEVEL=2, additional warnings will be
issued for out of bounds subscripts or offsets into zero-length
arrays.
B. change -Warray-bounds=2 to exclude its control on how to treat
trailing arrays as flexible array members:
'-Warray-bounds=2'
This warning level also warns about the intermediate results
of pointer arithmetic that may yield out of bounds values.
This warning level may give a larger number of false positives
and is deactivated by default.
gcc/ChangeLog:
* attribs.cc (strict_flex_array_level_of): New function.
* attribs.h (strict_flex_array_level_of): Prototype for new function.
* doc/invoke.texi: Update -Warray-bounds by specifying the impact from
-fstrict-flex-arrays. Also update -Warray-bounds=2 by eliminating its
impact on treating trailing arrays as flexible array members.
* gimple-array-bounds.cc (get_up_bounds_for_array_ref): New function.
(check_out_of_bounds_and_warn): New function.
(array_bounds_checker::check_array_ref): Update with call to the above
new functions.
* tree.cc (array_ref_flexible_size_p): Add one new argument.
(component_ref_sam_type): New function.
(component_ref_size): Control with level of strict-flex-array.
* tree.h (array_ref_flexible_size_p): Update prototype.
(enum struct special_array_member): Add two new enum values.
(component_ref_sam_type): New prototype.
gcc/c/ChangeLog:
* c-decl.cc (is_flexible_array_member_p): Call new function
strict_flex_array_level_of.
gcc/testsuite/ChangeLog:
* gcc.dg/Warray-bounds-11.c: Update warnings for -Warray-bounds=2.
* gcc.dg/Warray-bounds-flex-arrays-1.c: New test.
* gcc.dg/Warray-bounds-flex-arrays-2.c: New test.
* gcc.dg/Warray-bounds-flex-arrays-3.c: New test.
* gcc.dg/Warray-bounds-flex-arrays-4.c: New test.
* gcc.dg/Warray-bounds-flex-arrays-5.c: New test.
* gcc.dg/Warray-bounds-flex-arrays-6.c: New test.
PR analyzer/106325 reports false postives from
-Wanalyzer-null-dereference on code like this:
__attribute__((nonnull))
void foo_a (Foo *p)
{
foo_b (p);
switch (p->type)
{
/* ... */
}
}
where foo_b (p) has a:
g_return_if_fail (p);
that expands to:
if (!p)
{
return;
}
The analyzer "sees" the comparison against NULL in foo_b, and splits the
analysis into the NULL and not-NULL cases; later, back in foo_a, at
switch (p->type)
it complains that p is NULL.
Previously we were only using __attribute__((nonnull)) as something to
complain about when it was violated; we weren't using it as a source of
knowledge.
This patch fixes things by making the analyzer respect
__attribute__((nonnull)) at the top-level of the analysis: any such
params are now assumed to be non-NULL, so that the analyzer assumes the
g_return_if_fail inside foo_b doesn't fail when called from foo_a
Doing so fixes the false positives.
gcc/analyzer/ChangeLog:
PR analyzer/106325
* region-model-manager.cc
(region_model_manager::get_or_create_null_ptr): New.
* region-model-manager.h
(region_model_manager::get_or_create_null_ptr): New decl.
* region-model.cc (region_model::on_top_level_param): Add
"nonnull" param and make use of it.
(region_model::push_frame): When handling a top-level entrypoint
to the analysis, determine which params __attribute__((nonnull))
applies to, and pass to on_top_level_param.
* region-model.h (region_model::on_top_level_param): Add "nonnull"
param.
gcc/testsuite/ChangeLog:
PR analyzer/106325
* gcc.dg/analyzer/attr-nonnull-pr106325.c: New test.
* gcc.dg/analyzer/attribute-nonnull.c (test_6): New.
(test_7): New.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
gcc/ChangeLog:
* Makefile.in (ANALYZER_OBJS): Add analyzer/call-details.o,
analyzer/kf-analyzer.o, and kf-lang-cp.o.
gcc/analyzer/ChangeLog:
* analyzer.h (register_known_analyzer_functions): New decl.
(register_known_functions_lang_cp): New decl.
* call-details.cc: New file, split out from
region-model-impl-calls.cc.
* call-details.h: New file, split out from region-model.h.
* call-info.cc: Include "analyzer/call-details.h".
* call-summary.h: Likewise.
* kf-analyzer.cc: New file, split out from
region-model-impl-calls.cc.
* kf-lang-cp.cc: Likewise.
* known-function-manager.cc: Include "analyzer/call-details.h".
* region-model-impl-calls.cc: Move definitions of call_details's
member functions to call-details.cc. Move class kf_analyzer_* to
kf-analyzer.cc. Move kf_operator_new and kf_operator_delete to
kf-lang-cp.cc. Refresh #includes accordingly.
(register_known_functions): Replace registration of __analyzer_*
functions with a call to register_known_analyzer_functions.
Replace registration of C++ support functions with a call to
register_known_functions_lang_cp.
* region-model.h (class call_details): Move to new call-details.h.
* sm-fd.cc: Include "analyzer/call-details.h".
* sm-file.cc: Likewise.
* sm-malloc.cc: Likewise.
* varargs.cc: Likewise.
gcc/testsuite/ChangeLog:
* gcc.dg/plugin/analyzer_kernel_plugin.c: Include
"analyzer/call-details.h".
* gcc.dg/plugin/analyzer_known_fns_plugin.c: Likewise.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
This patch adds support for omp_get_max_teams, omp_set_num_teams, and
omp_{gs}et_teams_thread_limit on offload devices. That includes the usage of
device-specific ICV values (specified as environment variables or changed on a
device). In order to reuse device-specific ICV values, a copy back mechanism is
implemented that copies ICV values back from device to the host.
Additionally, a limitation of the number of teams on gcn offload devices is
implemented. The number of teams is limited by twice the number of compute
units (one team is executed on one compute unit). This avoids queueing
unnessecary many teams and a corresponding allocation of large amounts of
memory. Without that limitation the memory allocation for a large number of
user-specified teams can result in an "memory access fault".
A limitation of the number of teams is already also implemented for nvptx
devices (see nvptx_adjust_launch_bounds in libgomp/plugin/plugin-nvptx.c).
gcc/ChangeLog:
* gimplify.cc (optimize_target_teams): Set initial num_teams_upper
to "-2" instead of "1" for non-existing num_teams clause in order to
disambiguate from the case of an existing num_teams clause with value 1.
libgomp/ChangeLog:
* config/gcn/icv-device.c (omp_get_teams_thread_limit): Added to
allow processing of device-specific values.
(omp_set_teams_thread_limit): Likewise.
(ialias): Likewise.
* config/nvptx/icv-device.c (omp_get_teams_thread_limit): Likewise.
(omp_set_teams_thread_limit): Likewise.
(ialias): Likewise.
* icv-device.c (omp_get_teams_thread_limit): Likewise.
(ialias): Likewise.
(omp_set_teams_thread_limit): Likewise.
* icv.c (omp_set_teams_thread_limit): Removed.
(omp_get_teams_thread_limit): Likewise.
(ialias): Likewise.
* libgomp.texi: Updated documentation for nvptx and gcn corresponding
to the limitation of the number of teams.
* plugin/plugin-gcn.c (limit_teams): New helper function that limits
the number of teams by twice the number of compute units.
(parse_target_attributes): Limit the number of teams on gcn offload
devices.
* target.c (get_gomp_offload_icvs): Added teams_thread_limit_var
handling.
(gomp_load_image_to_device): Added a size check for the ICVs struct
variable.
(gomp_copy_back_icvs): New function that is used in GOMP_target_ext to
copy back the ICV values from device to host.
(GOMP_target_ext): Update the number of teams and threads in the kernel
args also considering device-specific values.
* testsuite/libgomp.c-c++-common/icv-4.c: Fixed an error in the reading
of OMP_TEAMS_THREAD_LIMIT from the environment.
* testsuite/libgomp.c-c++-common/icv-5.c: Extended.
* testsuite/libgomp.c-c++-common/icv-6.c: Extended.
* testsuite/libgomp.c-c++-common/icv-7.c: Extended.
* testsuite/libgomp.c-c++-common/icv-9.c: New test.
* testsuite/libgomp.fortran/icv-5.f90: New test.
* testsuite/libgomp.fortran/icv-6.f90: New test.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/target-teams-1.c: Adapt expected values for
num_teams from "1" to "-2" in cases without num_teams clause.
* g++.dg/gomp/target-teams-1.C: Likewise.
* gfortran.dg/gomp/defaultmap-4.f90: Likewise.
* gfortran.dg/gomp/defaultmap-5.f90: Likewise.
* gfortran.dg/gomp/defaultmap-6.f90: Likewise.
SPARK RM now allow the property No_Caching on volatile types, to
indicate that they should be considered volatile for compilation, but
not by GNATprove's analysis.
gcc/ada/
* contracts.adb (Add_Contract_Item): Allow No_Caching on types.
(Check_Type_Or_Object_External_Properties): Check No_Caching.
Check that non-effectively volatile types does not contain an
effectively volatile component (instead of just a volatile
component).
(Analyze_Object_Contract): Remove shared checking of No_Caching.
* sem_prag.adb (Analyze_External_Property_In_Decl_Part): Adapt checking
of No_Caching for types.
(Analyze_Pragma): Allow No_Caching on types.
* sem_util.adb (Has_Effectively_Volatile_Component): New query function.
(Is_Effectively_Volatile): Type with Volatile and No_Caching is not
effectively volatile.
(No_Caching_Enabled): Remove assertion to apply to all entities.
* sem_util.ads: Same.
Like in Exp_Ch4, we do not want to give warnings in Sem_Warn on a membership
test with a mark for a subtype that is predicated.
gcc/ada/
* sem_warn.adb (Warn_On_Constant_Valid_Condition): Bail out for a
membership test with a mark for a subtype that is predicated.
The problem is that the computation of early call regions skips freeze nodes
but scenarios involving procedures declared as actions of these freeze nodes
are taken into account. As a consequence if a subprogram body, typically of
an expression function, is placed just after a freeze node, its early call
region depends on whether the construct just before the freeze node can be
preelaborated or not; in other words, the legality of calls made from the
actions of this freeze node to the subprogram depends on what happens ahead
of the freeze node, which may be totally unrelated to the situation.
This change disables the ABE diagnostics in this case, as is done in a few
other similar cases leading to bogus errors too.
gcc/ada/
* sem_elab.adb (Processing_In_State): Add Within_Freezing_Actions
component.
(Process_Conditional_ABE_Call): Compute its value.
(Process_Conditional_ABE_Call_SPARK): For a call and a target in
the main unit, do not emit any ABE diagnostics if the call occurs
in a freezing actions context.
This implements elision of the copy operation for extended return statements
in the case of nonlimited by-reference types (the copy operation is already
elided for limited types by the front-end and nonlimited non-by-reference
types by the code generator), which comprise controlled and tagged types.
The implementation partly reuses the machinery implemented for limited types
(the build-in-place machinery) to allocate the return object directly on the
primary or the secondary stack, depending on whether the result type of the
function is constrained or not.
This requires further special-casing for the allocators generated by this
machinery as well as an adjustment to the implementation of a specific case
of string concatenation.
gcc/ada/
* einfo.ads (Actual_Subtype): Document additional usage.
* exp_aggr.adb (Expand_Array_Aggregate): Replace test on
Is_Build_In_Place_Return_Object with Is_Special_Return_Object.
* exp_ch3.adb (Expand_N_Object_Declaration): Factor out parts of the
processing done for build-in-place return objects and reuse them to
implement a similar processing for specific return objects.
* exp_ch4.adb (Expand_Allocator_Expression): Do not generate a tag
assignment or an adjustment if the allocator was made for a special
return object.
(Expand_Concatenate): If the result is allocated on the secondary
stack, use an unconstrained allocation.
* exp_ch6.ads (Apply_CW_Accessibility_Check): New declaration.
(Is_By_Reference_Return_Object): Likewise.
(Is_Secondary_Stack_Return_Object): Likewise.
(Is_Special_Return_Object): Likewise.
* exp_ch6.adb (Expand_Ctrl_Function_Call): Do not bail out for the
expression in the declaration of a special return object.
(Expand_N_Extended_Return_Statement): Add missing guard and move
the class-wide accessibility check to Expand_N_Object_Declaration.
(Expand_Simple_Function_Return): Delete obsolete commentary.
Skip the special processing for types that require finalization or
are returned on the secondary stack if the return originally comes
from an extended return statement. Add missing Constant_Present.
(Is_By_Reference_Return_Object): New predicate.
(Is_Secondary_Stack_Return_Object): Likewise.
(Is_Special_Return_Object): Likewise.
* exp_util.adb (Is_Related_To_Func_Return): Also return true if the
parent of the expression is the renaming declaration generated for
the expansion of a return object.
* gen_il-fields.ads (Opt_Field_Enum): Replace Alloc_For_BIP_Return
with For_Special_Return_Object.
* gen_il-gen-gen_nodes.adb (N_Allocator): Likewise.
* gen_il-internals.adb (Image): Remove Alloc_For_BIP_Return.
* sem_ch3.adb (Check_Return_Subtype_Indication): New procedure
moved from sem_ch6.adb.
(Analyze_Object_Declaration): Call it on a return object.
* sem_ch4.adb: Add with and use clauses for Rtsfind.
(Analyze_Allocator): Test For_Special_Return_Object to skip checks
for allocators made for special return objects.
Do not report restriction violations for the return stack pool.
* sem_ch5.adb (Analyze_Assignment.Set_Assignment_Type): Return the
Actual_Subtype for return objects that live on the secondary stack.
* sem_ch6.adb (Check_Return_Subtype_Indication): Move procedure to
sem_ch3.adb.
(Analyze_Function_Return): Do not call above procedure.
* sem_res.adb (Resolve_Allocator): Replace Alloc_For_BIP_Return
with For_Special_Return_Object.
* sinfo.ads: Likewise.
* treepr.adb (Image): Remove Alloc_For_BIP_Return.
* gcc-interface/trans.cc (gnat_to_gnu): Do not convert to the result
type in the unconstrained array type case if the parent is a simple
return statement.
It's needed because, in GNAT, universal_integer does not cover all the
values of all the supported integer types.
gcc/ada/
* sem_res.adb (Resolve_Membership_Op): Adjust latest change.
When a membership test is applied to a nonstatic expression of a universal
type, for example an attribute whose type is universal_integer and whose
prefix is not static, the operation is performed using the tested type that
is determined by the choice list. In particular, a check that the value of
the expression lies in the range of the tested type may be generated before
the test is actually performed.
This goes against the spirit of membership tests, which are typically used
to guard a specific operation and ought not to fail a check in doing so.
Therefore the resolution of the operands of membership tests is changed in
this case to use the universal type instead of the tested type. The final
computation of the type used to actually perform the test is left to the
expander, which already has the appropriate circuitry.
This nevertheless requires fixing an irregularity in the expansion of the
subtype_mark form of membership tests, which was dependent on the presence
of predicates for the subtype; the confusing name of a routine used by this
expansion is also changed in the process.
gcc/ada/
* exp_ch4.adb (Expand_N_In) <Substitute_Valid_Check>: Rename to...
<Substitute_Valid_Test>: ...this.
Use Is_Entity_Name to test for the presence of entity references.
Do not warn or substitute a valid test for a test with a mark for
a subtype that is predicated.
Apply the same transformation for a test with a mark for a subtype
that is predicated as for a subtype that is not.
Remove useless return statement.
* sem_res.adb (Resolve_Membership_Op): Perform a special resolution
if the left operand is of a universal numeric type.
This patch performs a large reorganization of accessibility related sources,
and also corrects some latent issues with accessibility checks - namely the
calculation of accessibility levels for expanded iterators and type
conversions.
gcc/ada/
* accessibility.adb, accessibility.ads
(Accessibility_Message): Moved from sem_attr.
(Apply_Accessibility_Check): Moved from checks.
(Apply_Accessibility_Check_For_Allocator): Moved from exp_ch4 and
renamed
(Check_Return_Construct_Accessibility): Moved from sem_ch6.
(Innermost_Master_Scope_Depth): Moved from sem_util. Add condition
to detect expanded iterators.
(Prefix_With_Safe_Accessibility_Level): Moved from sem_attr.
(Static_Accessibility_Level): Moved from sem_util.
(Has_Unconstrained_Access_Discriminants): Likewise.
(Has_Anonymous_Access_Discriminant): Likewise.
(Is_Anonymous_Access_Actual): Likewise.
(Is_Special_Aliased_Formal_Access): Likewise.
(Needs_Result_Accessibility_Level): Likewise.
(Subprogram_Access_Level): Likewise.
(Type_Access_Level): Likewise.
(Deepest_Type_Access_Level): Likewise.
(Effective_Extra_Accessibility): Likewise.
(Get_Dynamic_Accessibility): Likewise.
(Has_Access_Values): Likewise.
(Accessibility_Level): Likewise.
* exp_attr.adb (Access_Cases): Obtain the proper enclosing object
which applies to a given 'Access by looking through type
conversions.
* exp_ch4.adb (Apply_Accessibility_Check): Moved to accessibility.
* exp_ch5.adb: Likewise.
* exp_ch6.adb: Likewise.
* exp_ch9.adb: Likewise.
* exp_disp.adb: Likewise.
* gen_il-fields.ads: Add new flag Comes_From_Iterator.
* gen_il-gen-gen_nodes.adb: Add new flag Comes_From_Iterator for
N_Object_Renaming_Declaration.
* sem_ch5.adb (Analyze_Iterator_Specification): Mark object
renamings resulting from iterator expansion with the new flag
Comes_From_Iterator.
* sem_aggr.adb (Resolve_Container_Aggregate): Refine test.
* sem_ch13.adb: Add dependence on the accessibility package.
* sem_ch3.adb: Likewise.
* sem_ch4.adb: Likewise.
* sem_ch9.adb: Likewise.
* sem_res.adb: Likewise.
* sem_warn.adb: Likewise.
* exp_ch3.adb: Likewise.
* sem_attr.adb (Accessibility_Message): Moved to accessibility.
(Prefix_With_Safe_Accessibility_Level): Likewise.
* checks.adb, checks.ads (Apply_Accessibility_Check): Likewise.
* sem_ch6.adb (Check_Return_Construct_Accessibility): Likewise.
* sem_util.adb, sem_util.ads
(Accessibility_Level): Likewise.
(Deepest_Type_Access_Level): Likewise.
(Effective_Extra_Accessibility): Likewise.
(Get_Dynamic_Accessibility): Likewise.
(Has_Access_Values): Likewise.
(Has_Anonymous_Access_Discriminant): Likewise.
(Static_Accessibility_Level): Likewise.
(Has_Unconstrained_Access_Discriminants): Likewise.
(Is_Anonymous_Access_Actual): Likewise.
(Is_Special_Aliased_Formal_Access): Likewise.
(Needs_Result_Accessibility_Level): Likewise.
(Subprogram_Access_Level): Likewise.
(Type_Access_Level): Likewise.
* sinfo.ads: Document new flag Comes_From_Iterator.
* gcc-interface/Make-lang.in: Add entry for new Accessibility package.
This patch simplify the TO_C code to have a single branch for
raising exception. Furthermore, adding pragma annotate for codepeer
to ignore uninitialized value since this is caused because we have
input check before the initialization.
gcc/ada/
* libgnat/i-c.adb (To_C): Simplify code for having a single
exception raise. Add pragma annotate about uninitialized value
which happen only on exception raising.
This patch surrounds the scalar operand of the MVE vcmp patterns with a
vec_duplicate to ensure both operands of the comparision operator have the same
(vector) mode.
gcc/ChangeLog:
PR target/107987
* config/arm/mve.md (mve_vcmp<mve_cmp_op>q_n_<mode>,
@mve_vcmp<mve_cmp_op>q_n_f<mode>): Apply vec_duplicate to scalar
operand.
gcc/testsuite/ChangeLog:
* gcc.target/arm/mve/pr107987.c: New test.
With -msoft-float we ICE on __bf16 comparisons, because the
insns we want to use under the hood (cbranchsf4 and cstoresf4)
after performing the fast extensions aren't available.
The following patch copies the conditions from the c*sf4 expanders
to the corresponding c*bf4 expanders.
2022-12-06 Jakub Jelinek <jakub@redhat.com>
PR target/107969
* config/i386/i386.md (cbranchbf4, cstorebf4): Guard expanders
with the same condition as cbranchsf4 or cstoresf4 expanders.
* gcc.target/i386/pr107969.c: New test.
add_options_for_ieee has:
if { [istarget alpha*-*-*]
|| [istarget sh*-*-*] } {
return "$flags -mieee"
}
if { [istarget rx-*-*] } {
return "$flags -mnofpu"
}
but ieee.exp doesn't use add_options_for_ieee, instead it has:
if { [istarget "alpha*-*-*"]
|| [istarget "sh*-*-*"] } then {
lappend additional_flags "-mieee"
}
among other things (plus -ffloat-store on some arches etc.).
The following patch adds the rx -mnofpu similarly in the hope
of fixing ieee.exp FAILs on rx.
Preapproved in the PR by Jeff, committed to trunk.
2022-12-06 Jakub Jelinek <jakub@redhat.com>
PR testsuite/107046
* gcc.c-torture/execute/ieee/ieee.exp: For rx-*-* append
-mnofpu.
When we end up isolating a nullptr path it happens we diagnose
accesses to offsetted nullptr objects. The current diagnostics
have no good indication that this happens so the following records
the fact that our heuristic detected a nullptr based access in
the access_ref structure and sets up diagnostics to inform
of that detail. The diagnostic itself could probably be
improved here but its API is twisted and the necessary object
isn't passed around.
Instead of just
...bits/atomic_base.h:655:34: warning: 'unsigned int __atomic_fetch_and_4(volatile void*, unsigned int, int)' writing 4 bytes into a region of size 0 overflows the destination [-Wstringop-overflow=]
we now add
In member function 'void QFutureInterfaceBase::setThrottled(bool)':
cc1plus: note: destination object is likely at address zero
PR tree-optimization/104475
* pointer-query.h (access_ref::ref_nullptr_p): New flag.
* pointer-query.cc (access_ref::access_ref): Initialize
ref_nullptr_p.
(compute_objsize_r): Set ref_nullptr_p if we treat it that way.
(access_ref::inform_access): If ref was treated as nullptr
based, indicate that.
On Mon, Dec 05, 2022 at 02:29:36PM +0100, Aldy Hernandez wrote:
> > So like this for multiplication op1/2_range if it passes bootstrap/regtest?
> > For division I'll need to go to a drawing board...
>
> Sure, looks good to me.
Ulrich just filed PR107972, so in the light of that PR the following patch
attempts to do that differently.
As for testcase, I've tried both attached testcases, but unfortunately it
seems that in neither of the cases we actually figure out that res range
is finite (or for last function non-zero ordered). So there is further
work needed on that.
2022-12-06 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/107972
* range-op-float.cc (frange_drop_infs): New function.
(float_binary_op_range_finish): Add DIV_OP2 argument. If DIV_OP2 is
false and lhs is finite or if DIV_OP2 is true and lhs is non-zero and
not NAN, r must be finite too.
(foperator_div::op2_range): Pass true to DIV_OP2 of
float_binary_op_range_finish.
According to https://gcc.gnu.org/pipermail/gcc-regression/2022-December/077258.html
my patch caused some ICEs, e.g. the following testcase ICEs.
The problem is that lower_bound and upper_bound methods on a france assert
that the range isn't VR_NAN or VR_UNDEFINED.
All the op1_range/op2_range methods already return early if lhs.undefined_p,
but the other cases (when lhs is VR_NAN or the other opN is VR_NAN or
VR_UNDEFINED) aren't. float_binary_op_range_finish will DTRT for those
cases already.
2022-12-06 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/107975
* range-op-float.cc (foperator_mult::op1_range,
foperator_div::op1_range, foperator_div::op2_range): Just
return float_binary_op_range_finish result if lhs is known
NAN, or the other operand is known NAN or UNDEFINED.
* gcc.dg/pr107975.c: New test.