* tree-loop-distribution.c (struct builtin_info): New struct.
(struct partition): Refactor fields into struct builtin_info.
(partition_free): Free struct builtin_info.
(build_size_arg_loc, build_addr_arg_loc): Delete.
(generate_memset_builtin, generate_memcpy_builtin): Get memory range
information from struct builtin_info.
(find_single_drs): New function refactored from classify_partition.
Also moved builtin validity checks to this function.
(compute_access_range, alloc_builtin): New functions.
(classify_builtin_st, classify_builtin_ldst): New functions.
(classify_partition): Refactor code into functions find_single_drs,
classify_builtin_st and classify_builtin_ldst.
(distribute_loop): Don't do runtime alias check when distributing
loop nest.
(find_seed_stmts_for_distribution): New function.
(pass_loop_distribution::execute): Refactor code finding seed
stmts into above function. Support distribution for the innermost
two-level loop nest. Adjust dump information.
gcc/testsuite
* gcc.dg/tree-ssa/ldist-28.c: New test.
* gcc.dg/tree-ssa/ldist-29.c: New test.
* gcc.dg/tree-ssa/ldist-30.c: New test.
* gcc.dg/tree-ssa/ldist-31.c: New test.
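
As an illustration of the kind of code this targets, the sketch below shows
a two-level loop nest that clears a contiguous array, next to the single
memset it is equivalent to. The names and sizes (ROWS, COLS,
clear_with_loops, clear_with_memset, clear_forms_agree) are hypothetical
and not taken from the new tests; whether the pass actually emits a builtin
for a given nest depends on its analysis and on the options used.

```c
#include <assert.h>
#include <string.h>

#define ROWS 8   /* hypothetical sizes, chosen only for illustration */
#define COLS 16

/* Innermost two-level loop nest that zeroes a contiguous 2-D array.  */
void clear_with_loops(int a[ROWS][COLS])
{
  for (int i = 0; i < ROWS; i++)
    for (int j = 0; j < COLS; j++)
      a[i][j] = 0;
}

/* The whole nest covers one contiguous memory range, so it is
   equivalent to a single memset over that range.  */
void clear_with_memset(int a[ROWS][COLS])
{
  memset(a, 0, sizeof(int) * ROWS * COLS);
}

/* Returns 1 when both forms leave the array in the same state.  */
int clear_forms_agree(void)
{
  int x[ROWS][COLS], y[ROWS][COLS];

  /* Start from a non-zero pattern so the clearing is observable.  */
  memset(x, 0x5a, sizeof x);
  memset(y, 0x5a, sizeof y);
  clear_with_loops(x);
  clear_with_memset(y);

  int same = memcmp(x, y, sizeof x) == 0;
  assert(same);
  return same;
}
```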
From-SVN: r253680
* tree-loop-distribution.c: Adjust the general comment.
(NUM_PARTITION_THRESHOLD): New macro.
(ssa_name_has_uses_outside_loop_p): Support loop nest distribution.
(classify_partition): Skip builtin pattern of loop nest's inner loop.
(merge_dep_scc_partitions): New parameter ignore_alias_p and use it
in call to build_partition_graph.
(finalize_partitions): New parameter. Make loop distribution more
conservative by fusing more partitions.
(distribute_loop): Don't do runtime alias check in case of loop nest
distribution.
(find_seed_stmts_for_distribution): New function.
(prepare_perfect_loop_nest): New function.
(pass_loop_distribution::execute): Refactor code finding seed stmts
and loop nest into above functions. Support loop nest distribution.
Adjust dump information accordingly.
gcc/testsuite
* gcc.dg/tree-ssa/ldist-7.c: Adjust test string.
* gcc.dg/tree-ssa/ldist-16.c: Ditto.
* gcc.dg/tree-ssa/ldist-25.c: Ditto.
* gcc.dg/tree-ssa/ldist-33.c: New test.
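
For readers unfamiliar with the transformation, the following sketch shows
by hand what distributing a loop nest means: one nest containing two
independent statements is split into two nests. All names (src, a1, b1,
a2, b2, fused_nest, distributed_nests) are hypothetical. The arrays are
distinct file-scope objects, so no aliasing question arises, which matches
the case where no runtime alias check is needed.

```c
#include <assert.h>
#include <string.h>

#define N 4   /* hypothetical sizes */
#define M 5

static int src[N][M], a1[N][M], b1[N][M], a2[N][M], b2[N][M];

/* Original form: two independent statements in one loop nest.  */
static void fused_nest(void)
{
  for (int i = 0; i < N; i++)
    for (int j = 0; j < M; j++)
      {
        a1[i][j] = src[i][j] + 1;
        b1[i][j] = src[i][j] * 2;
      }
}

/* Distributed form: each statement gets its own copy of the nest.  */
static void distributed_nests(void)
{
  for (int i = 0; i < N; i++)
    for (int j = 0; j < M; j++)
      a2[i][j] = src[i][j] + 1;

  for (int i = 0; i < N; i++)
    for (int j = 0; j < M; j++)
      b2[i][j] = src[i][j] * 2;
}

/* Returns 1 when both forms compute the same arrays.  */
int distribution_preserves_results(void)
{
  for (int i = 0; i < N; i++)
    for (int j = 0; j < M; j++)
      src[i][j] = i * M + j;

  fused_nest();
  distributed_nests();

  int same = memcmp(a1, a2, sizeof a1) == 0
             && memcmp(b1, b2, sizeof b1) == 0;
  assert(same);
  return same;
}
```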
From-SVN: r253679
TARGET_ISEL64 just means TARGET_ISEL && TARGET_POWERPC64. Since every
place that uses it already uses :GPR, we can just as well always use
TARGET_ISEL.
* config/rs6000/rs6000.h (TARGET_ISEL64): Delete.
* config/rs6000/rs6000.md (sel): Delete mode attribute.
(mov<mode>cc, isel_signed_<mode>, isel_unsigned_<mode>,
*isel_reversed_signed_<mode>, *isel_reversed_unsigned_<mode>): Use
TARGET_ISEL instead of TARGET_ISEL<sel>.
From-SVN: r253671
This removes output_isel. Instead, the define_insns now output the
isel instructions directly.
It also adds a reg_or_zero_operand predicate, because the
reg_or_cint_operand predicate is too lax here, and uses it in the
"reversed" variants of the instructions as well.
* config/rs6000/predicates.md (zero_constant, all_ones_constant):
Move up in file.
(reg_or_cint_operand): Fix comment.
(reg_or_zero_operand): New predicate.
* config/rs6000/rs6000-protos.h (output_isel): Delete.
* config/rs6000/rs6000.c (output_isel): Delete.
* config/rs6000/rs6000.md (isel_signed_<mode>): Use reg_or_zero_operand
instead of reg_or_cint_operand. Output instruction directly (not via
output_isel).
(isel_unsigned_<mode>): Ditto.
(*isel_reversed_signed_<mode>): Use reg_or_zero_operand instead of
gpc_reg_operand. Add an instruction alternative for this. Output
instruction directly.
(*isel_reversed_unsigned_<mode>): Ditto.
From-SVN: r253665
* profile-count.h (slow_safe_scale_64bit): New function.
(safe_scale_64bit): New inline.
(profile_count::max_safe_multiplier): Remove; use safe_scale_64bit.
* profile-count.c: Include wide-int.h.
(slow_safe_scale_64bit): New.
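
The idea of scaling a 64-bit value without losing the result to
intermediate overflow can be sketched as follows. This is not the code
added by the patch: scale_u64 is a hypothetical helper, and the rounding,
the saturation on failure, and the use of unsigned __int128 (a GCC/Clang
extension) instead of wide-int are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: compute a * b / c, rounded to nearest.
   Returns false and saturates *res when the result does not fit
   in 64 bits.  */
bool scale_u64(uint64_t a, uint64_t b, uint64_t c, uint64_t *res)
{
  assert(c != 0);

  uint64_t tmp;

  /* Fast path: the product and the rounding term fit in 64 bits.  */
  if (!__builtin_mul_overflow(a, b, &tmp)
      && !__builtin_add_overflow(tmp, c / 2, &tmp))
    {
      *res = tmp / c;
      return true;
    }

  /* Slow path: redo the computation with a 128-bit intermediate.
     (2^64 - 1)^2 + c / 2 always fits in 128 bits.  */
  unsigned __int128 wide = (unsigned __int128) a * b + c / 2;
  wide /= c;
  if (wide > UINT64_MAX)
    {
      *res = UINT64_MAX;
      return false;
    }

  *res = (uint64_t) wide;
  return true;
}
```

Keeping the common case in a small function and the wide arithmetic in a
separate path mirrors the split between an inline function and an
out-of-line slow function named in the entries above.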
From-SVN: r253652
* config.gcc (i386, x86_64): Add extra objects.
* i386/i386-protos.h (ix86_rip_relative_addr_p): Declare.
(ix86_min_insn_size): Declare.
(ix86_issue_rate): Declare.
(ix86_adjust_cost): Declare.
(ia32_multipass_dfa_lookahead): Declare.
(ix86_macro_fusion_p): Declare.
(ix86_macro_fusion_pair_p): Declare.
(ix86_bd_has_dispatch): Declare.
(ix86_bd_do_dispatch): Declare.
(ix86_core2i7_init_hooks): Declare.
(ix86_atom_sched_reorder): Declare.
* i386/i386.c: Move all CPU cost tables to x86-tune-costs.h.
(COSTS_N_BYTES): Move to x86-tune-costs.h.
(DUMMY_STRINGOP_ALGS): Move to x86-tune-costs.h.
(rip_relative_addr_p): Rename to ...
(ix86_rip_relative_addr_p): ... this one; export.
(memory_address_length): Update.
(ix86_issue_rate): Move to x86-tune-sched.c.
(ix86_flags_dependent): Move to x86-tune-sched.c.
(ix86_agi_dependent): Move to x86-tune-sched.c.
(exact_dependency_1): Move to x86-tune-sched.c.
(exact_store_load_dependency): Move to x86-tune-sched.c.
(ix86_adjust_cost): Move to x86-tune-sched.c.
(ia32_multipass_dfa_lookahead): Move to x86-tune-sched.c.
(ix86_macro_fusion_p): Move to x86-tune-sched.c.
(ix86_macro_fusion_pair_p): Move to x86-tune-sched.c.
(do_reorder_for_imul): Move to x86-tune-sched-atom.c.
(swap_top_of_ready_list): Move to x86-tune-sched-atom.c.
(ix86_sched_reorder): Move to x86-tune-sched-atom.c.
(core2i7_first_cycle_multipass_init): Move to x86-tune-sched-core.c.
(core2i7_dfa_post_advance_cycle): Move to x86-tune-sched-core.c.
(min_insn_size): Rename to ...
(ix86_min_insn_size): ... this one; export.
(core2i7_first_cycle_multipass_begin): Move to x86-tune-sched-core.c.
(core2i7_first_cycle_multipass_issue): Move to x86-tune-sched-core.c.
(core2i7_first_cycle_multipass_backtrack): Move to x86-tune-sched-core.c.
(core2i7_first_cycle_multipass_end): Move to x86-tune-sched-core.c.
(core2i7_first_cycle_multipass_fini): Move to x86-tune-sched-core.c.
(ix86_sched_init_global): Break up logic to ix86_core2i7_init_hooks.
(ix86_avoid_jump_mispredicts): Update.
(TARGET_SCHED_DISPATCH): Move to x86-tune-sched-bd.c.
(TARGET_SCHED_DISPATCH_DO): Move to x86-tune-sched-bd.c.
(TARGET_SCHED_REORDER): Move to x86-tune-sched-bd.c.
(DISPATCH_WINDOW_SIZE): Move to x86-tune-sched-bd.c.
(MAX_DISPATCH_WINDOWS): Move to x86-tune-sched-bd.c.
(MAX_INSN): Move to x86-tune-sched-bd.c.
(MAX_IMM): Move to x86-tune-sched-bd.c.
(MAX_IMM_SIZE): Move to x86-tune-sched-bd.c.
(MAX_IMM_32): Move to x86-tune-sched-bd.c.
(MAX_IMM_64): Move to x86-tune-sched-bd.c.
(MAX_LOAD): Move to x86-tune-sched-bd.c.
(MAX_STORE): Move to x86-tune-sched-bd.c.
(BIG): Move to x86-tune-sched-bd.c.
(enum dispatch_group): Move to x86-tune-sched-bd.c.
(enum insn_path): Move to x86-tune-sched-bd.c.
(get_mem_group): Move to x86-tune-sched-bd.c.
(is_cmp): Move to x86-tune-sched-bd.c.
(dispatch_violation): Move to x86-tune-sched-bd.c.
(is_branch): Move to x86-tune-sched-bd.c.
(is_prefetch): Move to x86-tune-sched-bd.c.
(init_window): Move to x86-tune-sched-bd.c.
(allocate_window): Move to x86-tune-sched-bd.c.
(init_dispatch_sched): Move to x86-tune-sched-bd.c.
(is_end_basic_block): Move to x86-tune-sched-bd.c.
(process_end_window): Move to x86-tune-sched-bd.c.
(allocate_next_window): Move to x86-tune-sched-bd.c.
(find_constant): Move to x86-tune-sched-bd.c.
(get_num_immediates): Move to x86-tune-sched-bd.c.
(has_immediate): Move to x86-tune-sched-bd.c.
(get_insn_path): Move to x86-tune-sched-bd.c.
(get_insn_group): Move to x86-tune-sched-bd.c.
(count_num_restricted): Move to x86-tune-sched-bd.c.
(fits_dispatch_window): Move to x86-tune-sched-bd.c.
(add_insn_window): Move to x86-tune-sched-bd.c.
(add_to_dispatch_window): Move to x86-tune-sched-bd.c.
(debug_dispatch_window_file): Move to x86-tune-sched-bd.c.
(debug_dispatch_window): Move to x86-tune-sched-bd.c.
(debug_insn_dispatch_info_file): Move to x86-tune-sched-bd.c.
(debug_ready_dispatch): Move to x86-tune-sched-bd.c.
(do_dispatch): Move to x86-tune-sched-bd.c.
(has_dispatch): Move to x86-tune-sched-bd.c.
* i386/t-i386: Add new object files.
* i386/x86-tune-costs.h: New file.
* i386/x86-tune-sched-atom.c: New file.
* i386/x86-tune-sched-bd.c: New file.
* i386/x86-tune-sched-core.c: New file.
* i386/x86-tune-sched.c: New file.
From-SVN: r253646
2017-10-11 Liu Hao <lh_mouse@126.com>
* pretty-print.c [_WIN32] (colorize_init): Remove. Use
the generic version below instead.
(should_colorize): Recognize Windows consoles as terminals
for MinGW targets.
* pretty-print.c [__MINGW32__] (write_all): New function.
[__MINGW32__] (find_esc_head): Likewise.
[__MINGW32__] (find_esc_terminator): Likewise.
[__MINGW32__] (eat_esc_sequence): Likewise.
[__MINGW32__] (mingw_ansi_fputs): New function that handles
ANSI escape codes.
(pp_write_text_to_stream): Use mingw_ansi_fputs instead of fputs
for MinGW targets.
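
The scanning step behind such a function can be sketched as below: find
the head of an escape sequence, find its terminating byte, and treat the
text in between as a control code rather than output. The helpers
find_csi, find_csi_final and strip_csi are hypothetical and are not the
functions added by this patch; the sketch only drops the sequences, whereas
a console-aware writer would translate them into console calls. It also
assumes only "ESC [" (CSI) sequences occur and does not validate the
parameter bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: pointer to the first "ESC [" in S, or NULL.  */
static const char *find_csi(const char *s)
{
  return strstr(s, "\033[");
}

/* Hypothetical helper: P points just past "ESC [".  Return a pointer
   to the final byte of the sequence (0x40..0x7e), or NULL when the
   sequence is unterminated.  */
static const char *find_csi_final(const char *p)
{
  for (; *p; p++)
    if (*p >= 0x40 && *p <= 0x7e)
      return p;
  return NULL;
}

/* Copy IN to OUT with every complete CSI sequence removed.  OUT must
   have room for strlen (IN) + 1 bytes.  Returns the number of
   sequences removed.  */
int strip_csi(const char *in, char *out)
{
  assert(in != NULL && out != NULL);

  int removed = 0;
  const char *head;

  while ((head = find_csi(in)) != NULL)
    {
      const char *fin = find_csi_final(head + 2);
      if (fin == NULL)
        break;                      /* Unterminated: copy the rest as is.  */

      memcpy(out, in, (size_t) (head - in));  /* Text before the escape.  */
      out += head - in;
      in = fin + 1;                 /* Skip the whole sequence.  */
      removed++;
    }

  strcpy(out, in);                  /* Remaining text and the terminator.  */
  return removed;
}
```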
From-SVN: r253645
2017-10-11 Richard Biener <rguenther@suse.de>
* tree-ssa-loop-niter.c (infer_loop_bounds_from_pointer_arith):
Properly call analyze_scalar_evolution with the loop of the stmt.
From-SVN: r253644
2017-10-11 Martin Liska <mliska@suse.cz>
* c-c++-common/ubsan/ptr-overflow-sanitization-1.c: Scan
optimized dump rather than assembly.
From-SVN: r253636
Similar to other architectures with IFUNC binutils/glibc support, this
patch enables the ifunc attribute for ARM GNU/Linux. Although it is not
required to build glibc master, the intention is to allow refactoring
glibc's assembly implementation into C.
Tested by compiling glibc (in conjunction with a glibc patch to
support using the attribute on ARM) with build-many-glibcs.py (with
a patch to add an armv7 variant which enables multiarch). I have
not run the GCC tests for ARM.
* config.gcc (default_gnu_indirect_function): Default to yes for
arm*-*-linux* with glibc.
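
For reference, a minimal use of the attribute looks like the sketch below.
The function names (sum, sum_generic, resolve_sum) are hypothetical. It
assumes an ELF target whose toolchain and C library support IFUNC. A real
resolver would pick an implementation from the hardware capabilities; this
one always returns the generic version.

```c
#include <assert.h>

/* Hypothetical generic implementation.  */
static int sum_generic(const int *v, int n)
{
  assert(n >= 0);

  int s = 0;
  for (int i = 0; i < n; i++)
    s += v[i];
  return s;
}

typedef int sum_fn(const int *, int);

/* Resolver: run by the dynamic loader when the symbol is resolved.
   It runs very early, so it should avoid calling library functions.
   A real resolver would inspect the CPU features here.  */
static sum_fn *resolve_sum(void)
{
  return sum_generic;
}

/* Calls to sum () are bound to whatever resolve_sum () returned.  */
int sum(const int *v, int n) __attribute__((ifunc("resolve_sum")));
```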
From-SVN: r253635
2017-10-11 Richard Biener <rguenther@suse.de>
* tree-scalar-evolution.c (get_scalar_evolution): Handle
default-defs and types we do not want to analyze.
(interpret_loop_phi): Replace unreachable code with an assert.
(compute_scalar_evolution_in_loop): Remove and inline ...
(analyze_scalar_evolution_1): ... here, replacing condition with
what makes the intent clearer. Remove handling of cases
get_scalar_evolution now handles.
From-SVN: r253629
gcc/
PR rtl-optimization/81434
* haifa-sched.c (prune_ready_list): Init min_cost_group to 0. Update
comment for main loop. In sched_group_found if, also add checks for
pass and min_cost_group.
From-SVN: r253628
This adds an implementation of the insn_cost hook to rs6000.
This implementation is very minimal (so far). It is mostly based on
how many machine instructions are generated by an RTL insn, and it also
looks at the instruction type. Floating point insns are costed as if
all machine instructions they generate are floating point; the other
insns are treated as if all but one are integer insns (and one is the
specified type). Load instructions are treated as costing twice as
much, and load locked and sync insns as three times as much (just like
the original costs), and integer div and mul are handled as well.
Each define_insn (etc.) can set a "cost" attribute to override this
general cost. With optimization for size, the cost is set equal to the
value of the "length" attribute.
With this, the majority of cost differences between old and new are
where the old was wrong. Also, benchmarks show a slight win (if
anything). Some refinements are obviously needed.
* config/rs6000/rs6000.c (TARGET_INSN_COST): New.
(rs6000_insn_cost): New function.
* config/rs6000/rs6000.md (cost): New attribute.
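
The costing rules described above can be sketched roughly as follows.
This is not the rs6000 code: the types, the UNIT constant, the fp_unit
parameter and the function name sketch_insn_cost are hypothetical, and
integer mul and div are left out because the text gives no factors for
them. The order in which the size and "cost" overrides are checked is also
an assumption.

```c
#include <assert.h>

/* Hypothetical subset of instruction types.  */
enum insn_type
{
  TYPE_INTEGER, TYPE_FP, TYPE_LOAD, TYPE_LOAD_LOCKED, TYPE_SYNC
};

#define UNIT 4  /* Hypothetical cost of one simple integer insn.  */

struct insn_info
{
  enum insn_type type;
  int n_machine_insns;  /* Machine instructions the RTL insn emits.  */
  int length;           /* Value of the "length" attribute.  */
  int cost_override;    /* Value of the "cost" attribute, 0 if unset.  */
};

/* FP_UNIT is the assumed cost of one floating point machine insn.  */
int sketch_insn_cost(const struct insn_info *insn, int fp_unit, int speed)
{
  assert(insn->n_machine_insns >= 1);

  if (!speed)
    return insn->length;          /* Optimizing for size.  */

  if (insn->cost_override > 0)
    return insn->cost_override;   /* Pattern-specific override.  */

  int n = insn->n_machine_insns;
  switch (insn->type)
    {
    case TYPE_FP:
      /* All machine insns are treated as floating point.  */
      return n * fp_unit;
    case TYPE_LOAD:
      /* All but one are integer insns; the load costs twice as much.  */
      return (n - 1) * UNIT + 2 * UNIT;
    case TYPE_LOAD_LOCKED:
    case TYPE_SYNC:
      /* Same, with the typed insn costing three times as much.  */
      return (n - 1) * UNIT + 3 * UNIT;
    default:
      return n * UNIT;
    }
}
```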
From-SVN: r253624