mirror/gcc - gcc - Collaboration & Inovation

mirror/gcc

mirror of git://gcc.gnu.org/git/gcc.git synced 2025-04-08 22:51:45 +08:00

Author	SHA1	Message	Date
Maciej W. Rozycki	2b39f5137a	VAX: Fix predicates for widening multiply and multiply-add insns It makes no sense for insn operand predicates, as long as they accept a register operand, to be more restrictive than the set of the associated constraints, because expand will choose the insn based on the relevant operand being a pseudo register then and reload will keep it happily as an immediate if a constraint permits it. So the restriction posed by such a predicate will be happily ignored, and moreover if a splitter is added, such as required for MODE_CC support, the new instructions will reject the original operands supplied, causing an ICE like below: .../gcc/testsuite/gfortran.dg/graphite/PR67518.f90:44:0: Error: could not split insn (insn 90 662 663 (set (reg:DI 10 %r10 [orig:97 _235 ] [97]) (mult:DI (sign_extend:DI (mem/c:SI (plus:SI (reg/f:SI 13 %fp) (const_int -800 [0xfffffffffffffce0])) [14 %sfp+-800 S4 A32])) (sign_extend:DI (const_int -51 [0xffffffffffffffcd])))) 299 {mulsidi3} (expr_list:REG_EQUAL (mult:DI (sign_extend:DI (subreg:SI (mem/c:DI (plus:SI (reg/f:SI 13 %fp) (const_int -800 [0xfffffffffffffce0])) [14 %sfp+-800 S8 A32]) 0)) (const_int -51 [0xffffffffffffffcd])) (nil))) during RTL pass: final .../gcc/testsuite/gfortran.dg/graphite/PR67518.f90:44:0: internal compiler error: in final_scan_insn_1, at final.c:3073 Please submit a full bug report, with preprocessed source if appropriate. See <https://gcc.gnu.org/bugs/> for instructions. Change the predicates used with the widening multiply and multiply-add insns to allow immediates then, just as the constraints and the machine instructions produced permit. Also give the insns names, for easier reference here and elsewhere. gcc/ * config/vax/vax.md (mulsidi3): Fix the multiplicand predicates. (maddsidi4, maddsidi4_const): Likewise. Name insns.	2020-12-05 18:26:26 +00:00
Maciej W. Rozycki	d38f8441be	VAX: Fix predicates and constraints for bit-field comparison insns It makes no sense for insn operand predicates, as long as they accept a register operand, to be more restrictive than the set of the associated constraints, because expand will choose the insn based on the relevant operand being a pseudo register then and reload keep it happily as a memory reference if a constraint permits it. So the restriction posed by such a predicate will be happily ignored, and moreover if a splitter is added, such as required for MODE_CC support, the new instructions will reject the original operands supplied, causing an ICE. An actual example will be given with a subsequent change. Therefore, similarly to EXTV/EXTZV/INSV insns, remove inconsistencies with predicates and constraints of bit-field comparison insns, observing that a bit-field located in memory is byte-addressed by the respective machine instructions and therefore SImode may only be used with a register or an offsettable memory operand (i.e. not an indexed, pre-decremented, or post-incremented one). Also give the insns names, for easier reference here and elsewhere. gcc/ * config/vax/vax.md (cmpv_2): Name insn. (cmpv, cmpzv, cmpzv_2): Likewise. Fix location predicate and constraint.	2020-12-05 18:26:26 +00:00
Maciej W. Rozycki	0a9ea215f7	VAX: Make `extv' an expander matching the remaining bit-field operations We have matching insns defined for `sign_extract' and `zero_extract' expressions, so make the three named patterns for bit-field operations consistent and make `extv' an expander rather than an insn taking a SImode, a QImode, and a SImode general operand for the LOC, SIZE, and POS operands respectively, like with the `extzv' and `insv' patterns, matching the machine instructions and giving the middle end more choice as to which actual insn to choose in a given situation. Given this program: typedef struct { int f0:1; int f1:7; int f8:8; int f16:16; } bit_t; typedef struct { unsigned int f0:1; unsigned int f1:7; unsigned int f8:8; unsigned int f16:16; } ubit_t; typedef union { bit_t b; int i; } bit_u; typedef union { ubit_t b; unsigned int i; } ubit_u; int ins1 (bit_u x, int y) { asm volatile ("" : "+r" (x), "+r" (y)); x.b.f1 = y; return x.i; } int ext1 (bit_u x) { asm volatile ("" : "+r" (x)); return x.b.f1; } unsigned int extz1 (ubit_u x) { asm volatile ("" : "+r" (x)); return x.b.f1; } int ins8 (bit_u x, int y) { asm volatile ("" : "+r" (x), "+r" (y)); x.b.f8 = y; return x.i; } int ext8 (bit_u x) { asm volatile ("" : "+r" (x)); return x.b.f8; } unsigned int extz8 (ubit_u x) { asm volatile ("" : "+r" (x)); return x.b.f8; } int ins16 (bit_u x, int y) { asm volatile ("" : "+r" (x), "+r" (y)); x.b.f16 = y; return x.i; } int ext16 (bit_u x) { asm volatile ("" : "+r" (x)); return x.b.f16; } unsigned int extz16 (ubit_u x) { asm volatile ("" : "+r" (x)); return x.b.f16; } this results in the following code change: @@ -16,12 +16,12 @@ ins1: .globl ext1 .type ext1, @function ext1: - .word 0 # 19 [c=0] procedure_entry_mask - subl2 $4,%sp # 20 [c=32] addsi3 + .word 0 # 18 [c=0] procedure_entry_mask + subl2 $4,%sp # 19 [c=32] addsi3 movl 4(%ap),%r0 # 2 [c=16] movsi_2 - cvtbl %r0,%r0 # 7 [c=4] extendqisi2 - ashl $-1,%r0,%r0 # 14 [c=40] vax.md:624 - ret # 24 [c=0] return + extv $1,$7,%r0,%r0 # 7 [c=60] extv_non_const + cvtbl %r0,%r0 # 13 [c=4] extendqisi2 + ret # 23 [c=0] return .size ext1, .-ext1 .align 1 .globl extz1 @@ -49,12 +49,12 @@ ins8: .globl ext8 .type ext8, @function ext8: - .word 0 # 20 [c=0] procedure_entry_mask - subl2 $4,%sp # 21 [c=32] addsi3 + .word 0 # 18 [c=0] procedure_entry_mask + subl2 $4,%sp # 19 [c=32] addsi3 movl 4(%ap),%r0 # 2 [c=16] movsi_2 - cvtwl %r0,%r0 # 7 [c=4] extendhisi2 - ashl $-8,%r0,%r0 # 15 [c=40] vax.md:624 - ret # 25 [c=0] return + rotl $24,%r0,%r0 # 13 [c=60] extv_non_const + cvtbl %r0,%r0 + ret # 23 [c=0] return .size ext8, .-ext8 .align 1 .globl extz8 If there is a performance degradation with the replacement sequences, then it can and should be sorted within `extv_non_const'. gcc/ * config/vax/vax.md (extv): Rename insn to... (*extv): ... this. (extv): New expander.	2020-12-05 18:26:26 +00:00
Maciej W. Rozycki	b3f3bba3fa	VAX: Ensure PIC mode address is adjustable with aligned bit-field insns With the `insv_aligned', `extzv_aligned' and `extv_aligned' insns we are going to adjust the bit-field location if it is in memory, so only allow such location addresses that can be offset, excluding external symbol references in the PIC mode in particular. This fixes an ICE like: during RTL pass: final In file included from .../gcc/testsuite/gcc.dg/torture/vshuf-v16qi.c:11: .../gcc/testsuite/gcc.dg/torture/vshuf-main.inc: In function 'test_13': .../gcc/testsuite/gcc.dg/torture/vshuf-main.inc:27:1: internal compiler error: in change_address_1, at emit-rtl.c:2275 .../gcc/testsuite/gcc.dg/torture/vshuf-16.inc:16:1: note: in expansion of macro 'T' .../gcc/testsuite/gcc.dg/torture/vshuf-main.inc:28:1: note: in expansion of macro 'TESTS' 0x10a34b33 change_address_1 .../gcc/emit-rtl.c:2275 0x10a358af adjust_address_1(rtx_def, machine_mode, poly_int<1u, long>, int, int, int, poly_int<1u, long>) .../gcc/emit-rtl.c:2409 0x11d2505f output_97 .../gcc/config/vax/vax.md:806 0x10adec4b get_insn_template(int, rtx_insn) .../gcc/final.c:2070 0x10ae1c5b final_scan_insn_1 .../gcc/final.c:3039 0x10ae2257 final_scan_insn(rtx_insn, _IO_FILE, int, int, int) .../gcc/final.c:3152 0x10ade9a3 final_1 .../gcc/final.c:2020 0x10ae6157 rest_of_handle_final .../gcc/final.c:4658 0x10ae6697 execute .../gcc/final.c:4736 Please submit a full bug report, with preprocessed source if appropriate. Please include the complete backtrace with any bug report. See <https://gcc.gnu.org/bugs/> for instructions. compiler exited with status 1 FAIL: gcc.dg/torture/vshuf-v16qi.c -O2 (internal compiler error) triggered by an RTL instruction like: (insn 97 96 98 (set (reg:SI 5 %r5 [88]) (zero_extract:SI (mem/c:SI (symbol_ref:SI ("b") <var_decl 0x7ffff7f801b0 b>) [0 b+0 S4 A128]) (const_int 8 [0x8]) (const_int 24 [0x18]))) ".../gcc/testsuite/gcc.dg/torture/vshuf-main.inc":28:1 97 {extzv_aligned} (nil)) and removes these regressions: FAIL: gcc.dg/torture/vshuf-v16qi.c -O2 (internal compiler error) FAIL: gcc.dg/torture/vshuf-v16qi.c -O2 (test for excess errors) FAIL: gcc.dg/torture/vshuf-v4hi.c -O2 (internal compiler error) FAIL: gcc.dg/torture/vshuf-v4hi.c -O2 (test for excess errors) FAIL: gcc.dg/torture/vshuf-v8hi.c -O2 (internal compiler error) FAIL: gcc.dg/torture/vshuf-v8hi.c -O2 (test for excess errors) FAIL: gcc.dg/torture/vshuf-v8qi.c -O2 (internal compiler error) FAIL: gcc.dg/torture/vshuf-v8qi.c -O2 (test for excess errors) However expand typically presents pseudo-registers rather than memory references to these insns, so a further rework is required to make a better use of the code variant they are supposed to produce. This at least fixes the problem at hand. gcc/ config/vax/vax.md (insv_aligned, extzv_aligned) (*extv_aligned): Also make sure the memory address of a bit-field location can be adjusted in the PIC mode.	2020-12-05 18:26:26 +00:00
Maciej W. Rozycki	8a8de7507e	VAX: Remove EXTV/EXTZV/INSV instruction use from aligned case insns The INSV machine instruction is the only computational operation in the VAX ISA that keeps condition codes intact. In preparation to MODE_CC transition keep patterns apart then that make or do not make use of said instruction. For consistency update EXTV and EXTZV instruction uses accordingly. In expand SUBREGs will be presented as operands, so handle that possibility in the insn condition. This actually yields better code by avoiding EXTV/EXTZV instructions in pseudo-aligned register cases previously resorting to those instructions: @@ -42,7 +42,7 @@ ins8: subl2 $4,%sp # 21 [c=32] addsi3 movl 4(%ap),%r0 # 2 [c=16] movsi_2 movl 8(%ap),%r1 # 17 [c=16] movsi_2 - insv %r1,$8,$8,%r0 # 9 [c=4] insv_aligned + insv %r1,$8,$8,%r0 # 9 [c=4] insv_2 ret # 25 [c=0] return .size ins8, .-ins8 .align 1 @@ -60,12 +60,12 @@ ext8: .globl extz8 .type extz8, @function extz8: - .word 0 # 19 [c=0] procedure_entry_mask - subl2 $4,%sp # 20 [c=32] addsi3 + .word 0 # 18 [c=0] procedure_entry_mask + subl2 $4,%sp # 19 [c=32] addsi3 movl 4(%ap),%r0 # 2 [c=16] movsi_2 - extzv $8,$8,%r0,%r1 # 13 [c=60] extzv_aligned - movl %r1,%r0 # 18 [c=4] movsi_2 - ret # 24 [c=0] return + rotl $24,%r0,%r0 # 13 [c=60] extzv_non_const + movzbl %r0,%r0 + ret # 23 [c=0] return .size extz8, .-extz8 .align 1 .globl ins16 @@ -75,7 +75,7 @@ ins16: subl2 $4,%sp # 21 [c=32] addsi3 movl 4(%ap),%r0 # 2 [c=16] movsi_2 movl 8(%ap),%r1 # 17 [c=16] movsi_2 - insv %r1,$16,$16,%r0 # 9 [c=4] insv_aligned + insv %r1,$16,$16,%r0 # 9 [c=4] insv_2 ret # 25 [c=0] return .size ins16, .-ins16 .align 1 @@ -94,8 +94,9 @@ ext16: extz16: .word 0 # 18 [c=0] procedure_entry_mask subl2 $4,%sp # 19 [c=32] addsi3 - movl 4(%ap),%r1 # 2 [c=16] movsi_2 - extzv $16,$16,%r1,%r0 # 7 [c=60] extzv_aligned + movl 4(%ap),%r0 # 2 [c=16] movsi_2 + rotl $16,%r0,%r0 # 7 [c=60] extzv_non_const + movzwl %r0,%r0 movzwl %r0,%r0 # 13 [c=4] zero_extendhisi2 ret # 23 [c=0] return .size extz16, .-extz16 demonstrated with this program: typedef struct { int f0:1; int f1:7; int f8:8; int f16:16; } bit_t; typedef struct { unsigned int f0:1; unsigned int f1:7; unsigned int f8:8; unsigned int f16:16; } ubit_t; typedef union { bit_t b; int i; } bit_u; typedef union { ubit_t b; unsigned int i; } ubit_u; int ins1 (bit_u x, int y) { asm volatile ("" : "+r" (x), "+r" (y)); x.b.f1 = y; return x.i; } int ext1 (bit_u x) { asm volatile ("" : "+r" (x)); return x.b.f1; } unsigned int extz1 (ubit_u x) { asm volatile ("" : "+r" (x)); return x.b.f1; } int ins8 (bit_u x, int y) { asm volatile ("" : "+r" (x), "+r" (y)); x.b.f8 = y; return x.i; } int ext8 (bit_u x) { asm volatile ("" : "+r" (x)); return x.b.f8; } unsigned int extz8 (ubit_u x) { asm volatile ("" : "+r" (x)); return x.b.f8; } int ins16 (bit_u x, int y) { asm volatile ("" : "+r" (x), "+r" (y)); x.b.f16 = y; return x.i; } int ext16 (bit_u x) { asm volatile ("" : "+r" (x)); return x.b.f16; } unsigned int extz16 (ubit_u x) { asm volatile ("" : "+r" (x)); return x.b.f16; } It also papers over a regression: FAIL: gcc.dg/pr83623.c (internal compiler error) FAIL: gcc.dg/pr83623.c (test for excess errors) from an ICE like: during RTL pass: final .../gcc/testsuite/gcc.dg/pr83623.c: In function 'foo': .../gcc/testsuite/gcc.dg/pr83623.c:13:1: internal compiler error: in change_address_1, at emit-rtl.c:2275 0x10a056e3 change_address_1 .../gcc/emit-rtl.c:2275 0x10a0645f adjust_address_1(rtx_def, machine_mode, poly_int<1u, long>, int, int, int, poly_int<1u, long>) .../gcc/emit-rtl.c:2409 0x11cb588f output_97 .../gcc/config/vax/vax.md:808 0x10aafb2f get_insn_template(int, rtx_insn) .../gcc/final.c:2070 0x10ab2b3f final_scan_insn_1 .../gcc/final.c:3039 0x10ab313b final_scan_insn(rtx_insn, _IO_FILE, int, int, int) .../gcc/final.c:3152 0x10aaf887 final_1 .../gcc/final.c:2020 0x10ab703b rest_of_handle_final .../gcc/final.c:4658 0x10ab757b execute .../gcc/final.c:4736 Please submit a full bug report, with preprocessed source if appropriate. Please include the complete backtrace with any bug report. See <https://gcc.gnu.org/bugs/> for instructions. compiler exited with status 1 FAIL: gcc.dg/pr83623.c (internal compiler error) triggered by an RTL instruction like: (insn 17 14 145 (set (reg:SI 1 %r1) (zero_extract:SI (mem/c:SI (symbol_ref:SI ("x") <var_decl 0x7ffff7f80120 x>) [1 x+0 S4 A128]) (const_int 16 [0x10]) (const_int 16 [0x10]))) ".../gcc/testsuite/gcc.dg/pr83623.c":12:9 97 {extzv_aligned} (nil)) (where the address cannot be adjusted by 2 for PIC code as requested here as it would create an offset external symbol reference) otherwise caused by the patterns modified here, addressed next. This indicates a further rework is warranted here, but at least problems at hand have been fixed. gcc/ * config/vax/vax.md (insv_aligned, extzv_aligned) (*extv_aligned): Reject register bit-field locations that are not aligned to the least significant bit; update output statement accordingly.	2020-12-05 18:26:26 +00:00
Maciej W. Rozycki	4c293413ca	VAX: Fix predicates and constraints for EXTV/EXTZV/INSV insns It makes no sense for insn operand predicates, as long as they accept a register operand, to be more restrictive than the set of the associated constraints, because expand will choose the insn based on the relevant operand being a pseudo register then and reload keep it happily as a memory reference if a constraint permits it. So the restriction posed by such a predicate will be happily ignored, and moreover if a splitter is added, such as required for MODE_CC support, the new instructions will reject the original operands supplied, causing an ICE. An actual example will be given with a subsequent change. Remove such inconsistencies we have with the EXTV/EXTZV/INSV insns then, observing that a bit-field located in memory is byte-addressed by the respective machine instructions and therefore SImode may only be used with a register or an offsettable memory operand (i.e. not an indexed, pre-decremented, or post-incremented one), which has already been taken into account with the constraints currently used, except for `insv_2'. The QI machine mode may be used for the bit-field location with any kind of memory operand, but we got the constraint wrong, although harmlessly in reality, with `insv'. Fix that for consistency though. Also give the insns names, for easier reference here and elsewhere. gcc/ * config/vax/vax.md (insv_aligned, extzv_aligned) (extv_aligned, extv_non_const, extzv_non_const): Name insns. Fix location predicate. (extzv): Name insn. (insv): Likewise. Fix location constraint. (insv_2): Likewise, and the predicate.	2020-12-05 18:26:26 +00:00
Maciej W. Rozycki	e93fbce844	VAX: Add the `movmemhi' instruction The MOVC3 machine instruction has `memmove' semantics[1]: "The operation of the instruction is such that overlap of the source and destination strings does not affect the result." so use it to provide the `movmemhi' instruction as well. References: [1] DEC STD 032-0 "VAX Architecture Standard", Digital Equipment Corporation, A-DS-EL-00032-00-0 Rev J, December 15, 1989, Section 3.10 "Character-String Instructions", p. 3-162 gcc/ * config/vax/vax.md (cpymemhi1): Rename insn to... (movmemhi1): ... this. (cpymemhi): Update accordingly. Remove constraints. (movmemhi): New expander. gcc/testsuite/ * gcc.target/vax/movmem.c: New test.	2020-12-05 18:26:26 +00:00
Maciej W. Rozycki	2c45dc7928	VAX: Add a test for the `cpymemhi' instruction gcc/testsuite/ * gcc.target/vax/cpymem.c: New test.	2020-12-05 18:26:26 +00:00
Maciej W. Rozycki	b9240a4abc	VAX: Actually produce QImode and HImode `ctz' operations The middle end does not refer to `ctzqi2'/`ctzhi2' or `ffsqi2'/`ffshi2' patterns by name where `__builtin_ctz' or `__builtin_ffs' respectively is invoked for an argument of the QImode or HImode type, and instead it extends the data type before passing it to `ctzsi2' or `ffssi2'. Avoid the redundant operation and use a peephole2 to convert it to the right RTL expression that will collapse the two operations into a single machine instruction instead unless we need the extended intermediate result for another purpose. gcc/ * config/vax/builtins.md: Add a peephole2 for QImode and HImode `ctz' operations. (any_extend): New code iterator. gcc/testsuite/ * gcc.target/vax/ctzhi.c: New test. * gcc.target/vax/ctzqi.c: New test. * gcc.target/vax/ffshi.c: New test. * gcc.target/vax/ffsqi.c: New test.	2020-12-05 18:26:25 +00:00
Maciej W. Rozycki	273ffa3a6f	VAX: Also provide QImode and HImode `ctz' and` ffs' operations The FFS machine instruction provides for arbitrary input bit-field widths so take advantage of this and convert `ffssi2' and `ctzsi2' to templates for all the three of QI, HI, SI machine modes. Test cases will be added separately. gcc/ * config/vax/builtins.md (width): New mode attribute. (ffssi2): Rework expander into... (ffs<mode>2): ... this. (ctzsi2): Rework insn into... (ctz<mode>2): ... this.	2020-12-05 18:26:25 +00:00
Maciej W. Rozycki	a17ab4b6ad	VAX: Provide the `ctz' operation Our `ffssi2_internal' pattern and the machine FFS instruction, which technically is a bit-field operation, match the `ctz' operation exactly, with the result produced for the bit-field source operand of zero equal to its width as specified with another machine instruction operand, not directly expressed in RTL and currently hardcoded in the assembly code produced. In our terms this is the bit size of the machine mode used, and although it's SImode now let's be flexible for an upcoming change. The operation also sets the Z condition code according to the value of the source operand. gcc/ * config/vax/builtins.md (ffssi2_internal): Rename insn to... (ctzsi2): ... this. Update the RTL operation. (ffssi2): Update accordingly. * config/vax/vax.c (vax_notice_update_cc): Handle CTZ. * config/vax/vax.h (CTZ_DEFINED_VALUE_AT_ZERO): New macro. gcc/testsuite/ * gcc.target/vax/ctzsi.c: New test.	2020-12-05 18:26:25 +00:00
Maciej W. Rozycki	da076a8b12	VAX: Add tests for `sync_lock_test_and_set' and` sync_lock_release' Based on gcc.dg/pr61756.c. gcc/testsuite/ * gcc.target/vax/bbcci.c: New test. * gcc.target/vax/bbssi.c: New test.	2020-12-05 18:26:25 +00:00
Maciej W. Rozycki	65eee57a8c	VAX: Add a test for the SImode `ffs' operation gcc/testsuite/ * gcc.target/vax/ffssi.c: New test.	2020-12-05 18:26:25 +00:00
Maciej W. Rozycki	ea84baeb19	VAX: Actually enable `builtins.md' now that it is fully functional Test cases will follow. gcc/ * config/vax/vax.md: Include `builtins.md'.	2020-12-05 18:26:25 +00:00
Maciej W. Rozycki	fbe575958c	VAX: Correct `sync_lock_test_and_set' and` sync_lock_release' builtins Remove an ICE like: during RTL pass: expand .../libatomic/tas_n.c: In function 'libat_test_and_set_1': .../libatomic/tas_n.c:39:1: internal compiler error: in patch_jump_insn, at cfgrtl.c:1298 39 \| } \| ^ 0x108a09ff patch_jump_insn .../gcc/cfgrtl.c:1298 0x108a0b07 redirect_branch_edge .../gcc/cfgrtl.c:1325 0x108a124b rtl_redirect_edge_and_branch .../gcc/cfgrtl.c:1458 0x1087f6d3 redirect_edge_and_branch(edge_def, basic_block_def) .../gcc/cfghooks.c:373 0x11d6264b try_forward_edges .../gcc/cfgcleanup.c:562 0x11d6b0eb try_optimize_cfg .../gcc/cfgcleanup.c:2960 0x11d6ba4f cleanup_cfg(int) .../gcc/cfgcleanup.c:3174 0x10870b3f execute .../gcc/cfgexpand.c:6763 triggered with an RTL pattern like: (jump_insn 8 7 20 2 (parallel [ (set (pc) (if_then_else (ne (zero_extract:SI (mem/v:QI (mem/f/c:SI (reg/f:SI 16 virtual-incoming-args) [1 mptr+0 S4 A32]) [-1 S1 A8]) (const_int 1 [0x1]) (const_int 0 [0])) (const_int 0 [0])) (label_ref 10) (pc))) (set (zero_extract:SI (mem/v:QI (mem/f/c:SI (reg/f:SI 16 virtual-incoming-args) [1 mptr+0 S4 A32]) [-1 S1 A8]) (const_int 1 [0x1]) (const_int 0 [0])) (const_int 1 [0x1])) ]) ".../libatomic/tas_n.c":38:12 -1 (nil) -> 10) caused by a volatile memory reference used that is not accepted by the `memory_operand' predicate of the `jbbssiqi' insn explicitly referred from the `sync_lock_test_and_setqi' expander. Also seen with: FAIL: gcc.dg/pr61756.c (internal compiler error) Define a new `any_memory_operand' predicate accepting both ordinary and volatile memory references and use it with the `jbb<ccss>i<mode>' insn, so as to address the ICE. Also remove useless operations from the `sync_lock_test_and_set<mode>' and `sync_lock_release<mode>' expanders as those always either complete or fail and therefore never fall through to using their template other than to match operands. Wrap `jbb<ccss>i<mode>' into `unspec_volatile' instead so that the jump does not get removed or reordered. Share one index to avoid a complication around the iterators since the index is nowhere referred to anyway and the pattern required pulled by its name. Test cases will be added separately. gcc/ * config/vax/predicates.md (volatile_mem_operand) (any_memory_operand): New predicates. * config/vax/builtins.md (VUNSPEC_UNLOCK): Remove constant. (sync_lock_test_and_set<mode>): Remove `set' and `unspec' operations, match operands only. Reformat. (sync_lock_release<mode>): Likewise. Remove cruft. (jbb<ccss>i<mode>): Wrap into `unspec_volatile', use `any_memory_operand' predicate.	2020-12-05 18:26:25 +00:00
Maciej W. Rozycki	2500add25b	VAX: Use an int iterator to produce individual interlocked branches With mode-specific interlocked branch insns already folded into iterated templates now fold the two templates into one too, observing that the only difference between them is the value of the bit branched on, which is of course reflected both in the RTL expression and the instruction produced. Use an int iterator to iterate over the bit value, making use of the newly-added wide integer support, and substituting patterns as necessary to produce equivalent individual insns. No functional change. gcc/ * config/vax/builtins.md (bit): New int iterator. (ccss): New int attribute. (jbbssi<mode>, jbbcci<mode>): Fold insns into... (jbb<ccss>i<mode>): ... this.	2020-12-05 18:26:25 +00:00
Maciej W. Rozycki	47d524a636	VAX: Use a mode iterator to produce individual interlocked branches Regardless of the machine mode all the interlocked branches of the same kind, one of the two provided by the ISA, use the same RTL patterns and machine instructions, except for the memory operand's constraint. Remove code duplication then and make use of a mode iterator combined with an attribute to expand the same insn patterns with the constraint suitably substituted from a single template. No functional change. gcc/ * config/vax/builtins.md (bb_mem): New mode attribute. (jbbssiqi, jbbssihi, jbbssisi): Fold insns into... (jbbssi<mode>): ... this. (jbbcciqi, jbbccihi, jbbccisi): Likewise... (jbbcci<mode>): ... this.	2020-12-05 18:26:25 +00:00
Maciej W. Rozycki	630c9a4d54	jump: Also handle jumps wrapped in UNSPEC or UNSPEC_VOLATILE VAX has interlocked branch instructions used for atomic operations and we want to have them wrapped in UNSPEC_VOLATILE so as not to have code carried across. This however breaks with jump optimization and leads to an ICE in the build of libbacktrace like: .../libbacktrace/mmap.c:190:1: internal compiler error: in fixup_reorder_chain, at cfgrtl.c:3934 190 \| } \| ^ 0x1087d46b fixup_reorder_chain .../gcc/cfgrtl.c:3934 0x1087f29f cfg_layout_finalize() .../gcc/cfgrtl.c:4447 0x1087c74f execute .../gcc/cfgrtl.c:3662 on RTL like: (jump_insn 18 17 150 4 (unspec_volatile [ (set (pc) (if_then_else (eq (zero_extract:SI (mem/v:SI (reg/f:SI 23 [ _2 ]) [-1 S4 A32]) (const_int 1 [0x1]) (const_int 0 [0])) (const_int 1 [0x1])) (label_ref 20) (pc))) (set (zero_extract:SI (mem/v:SI (reg/f:SI 23 [ _2 ]) [-1 S4 A32]) (const_int 1 [0x1]) (const_int 0 [0])) (const_int 1 [0x1])) ] 101) ".../libbacktrace/mmap.c":135:14 158 {jbbssisi} (nil) -> 20) when those branches are enabled with a follow-up change. Also showing with: FAIL: gcc.dg/pr61756.c (internal compiler error) Handle branches wrapped in UNSPEC_VOLATILE then and, for consistency, also in UNSPEC. The presence of UNSPEC_VOLATILE will prevent such branches from being removed as they won't be accepted by `onlyjump_p', we just need to let them through. gcc/ * jump.c (pc_set): Also accept a jump wrapped in UNSPEC or UNSPEC_VOLATILE. (any_uncondjump_p, any_condjump_p): Update comment accordingly.	2020-12-05 18:26:25 +00:00
Maciej W. Rozycki	4b70b2e07a	loop-doloop: Add missing call to `onlyjump_p' Keep any jump that has side effects as those must not be removed. gcc/ * loop-doloop.c (add_test): Only remove the jump if `onlyjump_p'.	2020-12-05 18:26:25 +00:00
Maciej W. Rozycki	64880a7c49	cfgrtl: Add missing call to `onlyjump_p' If any unconditional jumps within a block have side effects then the block cannot be considered empty. gcc/ * cfgrtl.c (rtl_block_empty_p): Return false if `!onlyjump_p' too.	2020-12-05 18:26:24 +00:00
Maciej W. Rozycki	4ec78ef483	sel-sched-ir: Add missing call to `onlyjump_p' Do not try to remove a conditional jump if it has side effects. gcc/ * sel-sched-ir.c (maybe_tidy_empty_bb): Only try to remove a conditional jump if `onlyjump_p'.	2020-12-05 18:26:24 +00:00
Maciej W. Rozycki	a2bd4e52cf	loop-iv: Add missing calls to `onlyjump_p' Ignore jumps that have side effects in loop processing as pasting the body of a loop multiple times within is semantically equivalent to jump deletion (between the iterations unrolled) even if we do not physically delete the jump RTL insn. gcc/ * loop-iv.c (simplify_using_initial_values): Only process jumps that match `onlyjump_p'. (check_simple_exit): Likewise.	2020-12-05 18:26:24 +00:00
Maciej W. Rozycki	94f336768e	ifcvt: Add missing call to `onlyjump_p' Do not convert a conditional jump into conditional execution (and remove the jump as a consequence) if the jump has side effects. gcc/ * ifcvt.c (dead_or_predicable) [!IFCVT_MODIFY_TESTS]: Bail out if `!onlyjump_p'.	2020-12-05 18:26:24 +00:00
Maciej W. Rozycki	da749b98cf	RTL: Also support HOST_WIDE_INT with int iterators Add wide integer aka 'w' rtx format support to int iterators so that machine description can iterate over `const_int' expressions. This is made by expanding standard integer aka 'i' format support, observing that any standard integer already present in any of our existing RTL code will also fit into HOST_WIDE_INT, so there is no need for a separate handler. Any truncation of the number parsed is made by the caller. An assumption is made however that no place relies on capping out of range values to INT_MAX. Now the 'p' format is handled explicitly rather than being implied by rtx being a SUBREG, so actually assert that it is, just to play safe. gcc/ * read-rtl.c: Add a page-feed separator at the start of iterator code. (struct iterator_group): Change the return type to HOST_WIDE_INT for the `find_builtin' member. Likewise the second parameter type for the `apply_iterator' member. (atoll) [!HAVE_ATOQ]: Reorder. (find_mode, find_code): Change the return type to HOST_WIDE_INT. (apply_mode_iterator, apply_code_iterator) (apply_subst_iterator): Change the second parameter type to HOST_WIDE_INT. (find_int): Handle input suitable for HOST_WIDE_INT output. (apply_int_iterator): Rewrite in terms of explicit format interpretation. (rtx_reader::read_rtx_operand) <'w'>: Fold into... <'i', 'n', 'p'>: ... this. * doc/md.texi (Int Iterators): Document 'w' rtx format support.	2020-12-05 18:26:24 +00:00
Maciej W. Rozycki	8c18e22afb	VAX: Correct fatal issues with the `ffs' builtin The `builtins.md' machine description fragment is not included anywhere and is therefore dead code, which has become bitrotten due to non-use. If actually enabled, it does not build due to the use of an unknown `t' constraint: .../gcc/config/vax/builtins.md:42:1: error: undefined machine-specific constraint at this point: "t" .../gcc/config/vax/builtins.md:42:1: note: in operand 1 which came from commit becb93d02cc1 ("builtins.md (ffssi2_internal): Correct constraint."), which was not applied as posted and reviewed; `T' was meant to be used instead. Once this has been fixed this code still fails building: .../gcc/config/vax/builtins.md: In function 'rtx_def* gen_ffssi2(rtx, rtx)': .../gcc/config/vax/builtins.md:35:19: error: 'gen_bne' was not declared in this scope; did you mean 'gen_use'? 35 \| emit_jump_insn (gen_bne (label)); \| ^~~~~~~ \| gen_use make[2]: *** [Makefile:1122: insn-emit.o] Error 1 Finally the FFS machine instruction sets the Z condition code according to the comparison of the value held in the source operand against zero rather than the value held in the target operand. If the source operand is found hold zero, then the target operand is set to the width of the source operand, 32 for SImode (FFS supports arbitrary widths). Correct the build issues then and update RTL to match the operation of the machine instruction. A test case will be added separately. gcc/ * config/vax/builtins.md (ffssi2): Make preparation statements actually buildable. (ffssi2_internal): Fix input constraints; make the RTL pattern match reality for `cc0'.	2020-12-05 18:26:24 +00:00
Maciej W. Rozycki	dfb21f37fd	VAX: Rationalize expression and address costs Expression costs are required to be given in terms of COSTS_N_INSNS (n), which is defined to stand for the count of single fast instructions, and actually returns `n * 4'. The VAX backend however instead operates on naked numbers, causing an anomaly for the integer const zero rtx, where the cost given is 4 as opposed to 1 for integers in the [1:63] range, as well as -1 for comparisons. This is because the value of 0 returned by `vax_rtx_costs' is converted to COSTS_N_INSNS (1) in `pattern_cost': return cost > 0 ? cost : COSTS_N_INSNS (1); Consequently, where feasible, 1 or -1 are preferred over 0 by the middle end causing code pessimization, e.g. rather than producing this: subl2 $4,%sp movl 4(%ap),%r0 jgtr .L2 addl2 $2,%r0 .L2: ret or this: subl2 $4,%sp addl3 4(%ap),8(%ap),%r0 jlss .L6 addl2 $2,%r0 .L6: ret code is produced like this: subl2 $4,%sp movl 4(%ap),%r0 cmpl %r0,$1 jgeq .L2 addl2 $2,%r0 .L2: ret or this: subl2 $4,%sp addl3 4(%ap),8(%ap),%r0 cmpl %r0,$-1 jleq .L6 addl2 $2,%r0 .L6: ret from this: int compare_mov (int x) { if (x > 0) return x; else return x + 2; } and this: int compare_add (int x, int y) { int z; z = x + y; if (z < 0) return z; else return z + 2; } respectively, which is slower and larger both at a time. Furthermore once the backend is converted to MODE_CC this anomaly makes it usually impossible to remove redundant comparisons in the comparison elimination pass, because most VAX instructions set the condition codes as per the relation of the instruction's result to 0 and not -1. The middle end has some other assumptions as to rtx costs being given in terms of COSTS_N_INSNS, so wrap all the VAX rtx costs then as they stand into COSTS_N_INSNS invocations, effectively scaling the costs by 4 while preserving their relative values, except for the integer const zero rtx given the value of `COSTS_N_INSNS (1) / 2', half of a fast instruction (this can be further halved if needed in the future). Adjust address costs likewise so that they remain proportional to the new absolute values of rtx costs. Code size stats are as follows, collected from 17639 executables built in `check-c' GCC testing: samples average median -------------------------------------- regressions 1420 0.400% 0.195% unchanged 13811 0.000% 0.000% progressions 2408 -0.504% -0.201% -------------------------------------- total 17639 -0.037% 0.000% with a small number of outliers only (over 5% size change): old new change %change filename ---------------------------------------------------- 4991 5249 258 5.1693 981001-1.exe 2637 2777 140 5.3090 interchange-6.exe 2187 2307 120 5.4869 sprintf.x7 3969 4197 228 5.7445 pr28982a.exe 8264 8816 552 6.6795 vector-compare-1.exe 5199 5575 376 7.2321 pr28982b.exe 2113 2411 298 14.1031 20030323-1.exe 2113 2411 298 14.1031 20030323-1.exe 2113 2411 298 14.1031 20030323-1.exe so it seems we are looking good, and we have complementing reductions to compensate: old new change %change filename ---------------------------------------------------- 2919 2631 -288 -9.8663 pr57521.exe 3427 3167 -260 -7.5868 sabd_1.exe 2985 2765 -220 -7.3701 ssad-run.exe 2985 2765 -220 -7.3701 ssad-run.exe 2985 2765 -220 -7.3701 usad-run.exe 2985 2765 -220 -7.3701 usad-run.exe 4509 4253 -256 -5.6775 vshuf-v2sf.exe 4541 4285 -256 -5.6375 vshuf-v2si.exe 4673 4417 -256 -5.4782 vshuf-v2df.exe 2993 2841 -152 -5.0785 abs-2.x4 2993 2841 -152 -5.0785 abs-3.x4 This actually causes `loop-8.c' to regress: FAIL: gcc.dg/loop-8.c scan-rtl-dump-times loop2_invariant "Decided" 1 FAIL: gcc.dg/loop-8.c scan-rtl-dump-not loop2_invariant "without introducing a new temporary register" but upon a closer inspection this is a red herring. Old code looks as follows: .file "loop-8.c" .text .align 1 .globl f .type f, @function f: .word 0 subl2 $4,%sp movl 4(%ap),%r2 movl 8(%ap),%r3 movl $42,(%r2) clrl %r0 movl $42,%r1 movl %r1,%r4 jbr .L2 .L5: movl %r4,%r1 .L2: movl %r1,(%r3)[%r0] incl %r0 cmpl %r0,$100 jeql .L6 movl $42,(%r2)[%r0] bicl3 $-2,%r0,%r1 jeql .L5 movl %r0,%r1 jbr .L2 .L6: ret .size f, .-f while new one is like below: .file "loop-8.c" .text .align 1 .globl f .type f, @function f: .word 0 subl2 $4,%sp movl 4(%ap),%r2 movl $42,(%r2)+ movl 8(%ap),%r1 clrl %r0 movl $42,%r3 movzbl $100,%r4 movl %r3,%r5 jbr .L2 .L5: movl %r5,%r3 .L2: movl %r3,(%r1)+ incl %r0 cmpl %r0,%r4 jeql .L6 movl $42,(%r2)+ bicl3 $-2,%r0,%r3 jeql .L5 movl %r0,%r3 jbr .L2 .L6: ret .size f, .-f and is clearly better: not only it is smaller, but it also uses the post-increment rather than indexed addressing mode in the loop, of which the former comes for free in terms of both performance and code size while the latter causes an extra byte per operand to be produced for the index register and also incurs an execution penalty for the extra address calculation. Exclude the case from VAX testing then, as already done for some other targets and discussed with commit d242fdaec186 ("gcc.dg/loop-8.c: Skip for mmix."). gcc/ * config/vax/vax.c (vax_address_cost): Express the cost in terms of COSTS_N_INSNS. (vax_rtx_costs): Likewise. gcc/testsuite/ * gcc.dg/loop-8.c: Exclude for `vax--'. * gcc.target/vax/compare-add-zero.c: New test. * gcc.target/vax/compare-mov-zero.c: New test.	2020-12-05 18:26:24 +00:00
Maciej W. Rozycki	7920fe3d81	VAX/testsuite: Run target testing over all the usual optimization levels It makes sense to use what other targets do and run all the VAX test cases over all the usual optimization levels, so make `vax.exp' use our `gcc-dg-runtest' rather than the generic `dg-runtest' test driver. This breaks `pr56875.c' however, which is optimized away at levels above `-O0' as a result of how it has been written for calculations to make no effect: FAIL: gcc.target/vax/pr56875.c -O1 scan-assembler ashq .,\\$0xffffffffffffffff, FAIL: gcc.target/vax/pr56875.c -O2 scan-assembler ashq .,\\$0xffffffffffffffff, FAIL: gcc.target/vax/pr56875.c -O3 -g scan-assembler ashq .,\\$0xffffffffffffffff, FAIL: gcc.target/vax/pr56875.c -Os scan-assembler ashq .,\\$0xffffffffffffffff, FAIL: gcc.target/vax/pr56875.c -O2 -flto -fno-use-linker-plugin -flto-partition=none scan-assembler ashq .,\\$0xffffffffffffffff, FAIL: gcc.target/vax/pr56875.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects scan-assembler ashq .,\\$0xffffffffffffffff, Rather than keeping it at `-O0' update the test case for its code to do make effect while retaining its sense. Also reformat it according to our requirements. gcc/testsuite/ * gcc.target/vax/vax.exp: Use `gcc-dg-runtest' rather than `dg-runtest'. * gcc.target/vax/pr56875.c (dg-options): Make empty. (a): Rewrite for calculations to make effect. Reformat.	2020-12-05 18:26:24 +00:00
Maciej W. Rozycki	85f5a7d6ac	VAX: Define LEGITIMATE_PIC_OPERAND_P The VAX ELF psABI does not permit the use of all hardware operand modes for PIC symbol references due to the need to use PC-relative addressing for symbols that end up local and the need to make references indirect symbols that end up global. Therefore symbols referred as immediates may only be used with the move and push address (MOVA and PUSHA) instructions and their PC-relative displacement address mode, as there is no genuine PC-relative immediate available that all the other instructions would have to use. Furthermore global symbol references must not have an offset applied, which has to be added with a separate instruction, because there is no support now for GOT entries for external `symbol+offset' references, so any indirect GOT references made by the static linker from the original direct symbol references must not have an addend applied. Consequently no addend is allowed even if a given external symbol turns out local, for whatever reason, at the static link time. Define the LEGITIMATE_PIC_OPERAND_P macro then, a corresponding function and predicate to exclude the relevant expressions as required, and then a constraint so that reloads are produced where needed, and use the new facilities in the machine description, folding corresponding duplicated patterns for local and external symbols together. Rewrite predicates to make use of the new function, rename them to match their sense and also remove ones no longer used. All this fixing an ICE like this: during RTL pass: postreload .../gcc/testsuite/gcc.c-torture/execute/20040709-2.c: In function 'testE': .../gcc/testsuite/gcc.c-torture/execute/20040709-2.c:89:1: internal compiler error: in reload_combine_note_use, at postreload.c:1559 .../gcc/testsuite/gcc.c-torture/execute/20040709-2.c:96:65: note: in expansion of macro 'T' 0x10fe84cb reload_combine_note_use .../gcc/postreload.c:1559 0x10fe8857 reload_combine_note_use .../gcc/postreload.c:1621 0x10fe8303 reload_combine_note_use .../gcc/postreload.c:1517 0x10fe7c7b reload_combine .../gcc/postreload.c:1408 0x10fe3417 reload_cse_regs .../gcc/postreload.c:67 0x10feaf9f execute .../gcc/postreload.c:2358 due to the presence of a pseudo register post-reload: (insn 435 228 229 13 (set (reg:SI 1 %r1) (mem/c:SI (reg/f:SI 341) [25 sE+12 S4 A8])) ".../gcc/testsuite/gcc.c-torture/execute/20040709-2.c":96:65 12 {movsi_2} (nil)) (due to the use of an offset `sE+12' symbol reference) and removing these regressions: FAIL: gcc.c-torture/execute/20040709-2.c -O2 (internal compiler error) FAIL: gcc.c-torture/execute/20040709-2.c -O2 (test for excess errors) FAIL: gcc.c-torture/execute/20040709-2.c -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (internal compiler error) FAIL: gcc.c-torture/execute/20040709-2.c -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for excess errors) FAIL: gcc.c-torture/execute/20040709-2.c -O3 -g (internal compiler error) FAIL: gcc.c-torture/execute/20040709-2.c -O3 -g (test for excess errors) FAIL: gcc.c-torture/execute/20040709-2.c -Os (internal compiler error) FAIL: gcc.c-torture/execute/20040709-2.c -Os (test for excess errors) FAIL: gcc.c-torture/execute/20040709-2.c -O2 -flto -fno-use-linker-plugin -flto-partition=none (internal compiler error) FAIL: gcc.c-torture/execute/20040709-2.c -O2 -flto -fno-use-linker-plugin -flto-partition=none (test for excess errors) FAIL: gcc.c-torture/execute/20040709-3.c -O2 (internal compiler error) FAIL: gcc.c-torture/execute/20040709-3.c -O2 (test for excess errors) FAIL: gcc.c-torture/execute/20040709-3.c -O3 -g (internal compiler error) FAIL: gcc.c-torture/execute/20040709-3.c -O3 -g (test for excess errors) FAIL: gcc.c-torture/execute/20040709-3.c -Os (internal compiler error) FAIL: gcc.c-torture/execute/20040709-3.c -Os (test for excess errors) FAIL: gcc.c-torture/execute/20040709-3.c -O2 -flto -fno-use-linker-plugin -flto-partition=none (internal compiler error) FAIL: gcc.c-torture/execute/20040709-3.c -O2 -flto -fno-use-linker-plugin -flto-partition=none (test for excess errors) FAIL: gcc.dg/torture/pr52028.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects (internal compiler error) FAIL: gcc.dg/torture/pr52028.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects (test for excess errors) gcc/ * config/vax/constraints.md (A): New constraint. * config/vax/predicates.md (external_symbolic_operand) (external_const_operand): Remove predicates. (local_symbolic_operand): Rename to... (pic_symbolic_operand): ... this, and rework. (external_memory_operand): Rename to... (non_pic_external_memory_operand): ... this, and rework. (illegal_blk_memory_operand, illegal_addsub_di_memory_operand): Update accordingly. * config/vax/vax-protos.h (vax_acceptable_pic_operand_p): New prototype. * config/vax/vax.c (vax_acceptable_pic_operand_p): New function. (vax_output_int_add): Update according to predicate rework. * config/vax/vax.h (LEGITIMATE_PIC_OPERAND_P): New macro. * config/vax/vax.md (pushlclsymreg, pushextsymreg): Fold together, and rename to... (pushsymreg): ... this. Use the `pic_symbolic_operand' predicate and the `A' constraint for the displacement operand. (movlclsymreg, movextsymreg): Fold together, and rename to... (movsymreg): ... this. Use the `pic_symbolic_operand' predicate and the `A' constraint for the displacement operand. (pushextsym, pushlclsym): Fold together, and rename to... (pushsym): ... this. Use the `pic_symbolic_operand' predicate and the `A' constraint for the displacement operand. (movextsym, movlclsym): Fold together, and rename to... (movsym): ... this. Use the `pic_symbolic_operand' predicate and the `A' constraint for the displacement operand.	2020-12-05 18:26:24 +00:00
Maciej W. Rozycki	91ae8fbc5a	VAX: Remove `c' operand format specifier overload The `c' operand format specifier is handled directly by the middle end in `output_asm_insn': %cN means require operand N to be a constant and print the constant expression with no punctuation. however it resorts to the target for constants that are not valid addresses: else if (letter == 'c') { if (CONSTANT_ADDRESS_P (operands[opnum])) output_addr_const (asm_out_file, operands[opnum]); else output_operand (operands[opnum], 'c'); } The VAX backend expects the fallback never to happen and overloads `c' with the branch condition code. This is confusing however and it is not like we are short of letters, so instead make the branch condition code use `k', and then for consistency make `K' the reverse branch condition code format specifier. This is safe to do as we provide no means to use a computed branch condition code in user `asm'. gcc/ * config/vax/vax.c (print_operand): Replace `c' and `C' with `k' and `K' respectively. * config/vax/vax.md (branch, branch_reversed): Update accordingly.	2020-12-05 18:26:24 +00:00
Matt Thomas	a27d5f9a73	PR target/58901: reload: Handle SUBREG of MEM with a mode-dependent address Fix an ICE with the handling of RTL expressions like: (subreg:QI (mem/c:SI (plus:SI (plus:SI (mult:SI (reg/v:SI 0 %r0 [orig:67 i ] [67]) (const_int 4 [0x4])) (reg/v/f:SI 7 %r7 [orig:59 doacross ] [59])) (const_int 40 [0x28])) [1 MEM[(unsigned int )doacross_63 + 40B + i_106 4]+0 S4 A32]) 0) that causes the compilation of libgomp to fail: during RTL pass: reload .../libgomp/ordered.c: In function 'GOMP_doacross_wait': .../libgomp/ordered.c:507:1: internal compiler error: in change_address_1, at emit-rtl.c:2275 507 \| } \| ^ 0x10a3462b change_address_1 .../gcc/emit-rtl.c:2275 0x10a353a7 adjust_address_1(rtx_def, machine_mode, poly_int<1u, long>, int, int, int, poly_int<1u, long>) .../gcc/emit-rtl.c:2409 0x10ae2993 alter_subreg(rtx_def, bool) .../gcc/final.c:3368 0x10ae25cf cleanup_subreg_operands(rtx_insn) .../gcc/final.c:3322 0x110922a3 reload(rtx_insn, int) .../gcc/reload1.c:1232 0x10de2bf7 do_reload .../gcc/ira.c:5812 0x10de3377 execute .../gcc/ira.c:5986 in a `vax-netbsdelf' build, where an attempt is made to change the mode of the contained memory reference to the mode of the containing SUBREG. Such RTL expressions are produced by the VAX shift and rotate patterns (`ashift', `ashiftrt', `rotate', `rotatert') where the count operand always has the QI mode regardless of the mode, either SI or DI, of the datum shifted or rotated. Such a mode change cannot work where the memory reference uses the indexed addressing mode, where a multiplier is implied that in the VAX ISA depends on the width of the memory access requested and therefore changing the machine mode would change the address calculation as well. Avoid the attempt then by forcing the reload of any SUBREGs containing a mode-dependent memory reference, also fixing these regressions: FAIL: gcc.c-torture/compile/pr46883.c -Os (internal compiler error) FAIL: gcc.c-torture/compile/pr46883.c -Os (test for excess errors) FAIL: gcc.c-torture/execute/20120808-1.c -O2 (internal compiler error) FAIL: gcc.c-torture/execute/20120808-1.c -O2 (test for excess errors) FAIL: gcc.c-torture/execute/20120808-1.c -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (internal compiler error) FAIL: gcc.c-torture/execute/20120808-1.c -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for excess errors) FAIL: gcc.c-torture/execute/20120808-1.c -O3 -g (internal compiler error) FAIL: gcc.c-torture/execute/20120808-1.c -O3 -g (test for excess errors) FAIL: gcc.c-torture/execute/20120808-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none (internal compiler error) FAIL: gcc.c-torture/execute/20120808-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none (test for excess errors) FAIL: gcc.c-torture/execute/20120808-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects (internal compiler error) FAIL: gcc.c-torture/execute/20120808-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects (test for excess errors) FAIL: gcc.dg/20050629-1.c (internal compiler error) FAIL: gcc.dg/20050629-1.c (test for excess errors) FAIL: c-c++-common/torture/pr53505.c -Os (internal compiler error) FAIL: c-c++-common/torture/pr53505.c -Os (test for excess errors) FAIL: gfortran.dg/coarray_failed_images_1.f08 -Os (internal compiler error) FAIL: gfortran.dg/coarray_stopped_images_1.f08 -Os (internal compiler error) With test case #0 included it causes a reload with: (insn 15 14 16 4 (set (reg:SI 31) (ashift:SI (const_int 1 [0x1]) (subreg:QI (reg:SI 30 [ MEM[(int )s_8(D) + 4B + _5 * 4] ]) 0))) "pr58901-0.c":15:12 94 {ashlsi3} (expr_list:REG_DEAD (reg:SI 30 [ MEM[(int )s_8(D) + 4B + _5 4] ]) (nil))) as follows: Reloads for insn # 15 Reload 0: reload_in (SI) = (reg:SI 30 [ MEM[(int )s_8(D) + 4B + _5 4] ]) ALL_REGS, RELOAD_FOR_INPUT (opnum = 2) reload_in_reg: (reg:SI 30 [ MEM[(int )s_8(D) + 4B + _5 4] ]) reload_reg_rtx: (reg:SI 5 %r5) resulting in: (insn 37 14 15 4 (set (reg:SI 5 %r5) (mem/c:SI (plus:SI (plus:SI (mult:SI (reg/v:SI 1 %r1 [orig:25 i ] [25]) (const_int 4 [0x4])) (reg/v/f:SI 4 %r4 [orig:29 s ] [29])) (const_int 4 [0x4])) [1 MEM[(int )s_8(D) + 4B + _5 4]+0 S4 A32])) "pr58901-0.c":15:12 12 {movsi_2} (nil)) (insn 15 37 16 4 (set (reg:SI 2 %r2 [31]) (ashift:SI (const_int 1 [0x1]) (reg:QI 5 %r5))) "pr58901-0.c":15:12 94 {ashlsi3} (nil)) and assembly like: .L3: movl 4(%r4)[%r1],%r5 ashl %r5,$1,%r2 xorl2 %r2,%r0 incl %r1 cmpl %r1,%r3 jneq .L3 produced for the loop, providing optimization has been enabled. Likewise with test case #1 the reload of: (insn 17 16 18 4 (set (reg:SI 34) (and:SI (subreg:SI (reg/v:DI 27 [ t ]) 4) (const_int 1 [0x1]))) "pr58901-1.c":18:20 77 {andsi_const_int} (expr_list:REG_DEAD (reg/v:DI 27 [ t ]) (nil))) is as follows: Reloads for insn # 17 Reload 0: reload_in (DI) = (reg/v:DI 27 [ t ]) reload_out (SI) = (reg:SI 2 %r2 [34]) ALL_REGS, RELOAD_OTHER (opnum = 0) reload_in_reg: (reg/v:DI 27 [ t ]) reload_out_reg: (reg:SI 2 %r2 [34]) reload_reg_rtx: (reg:DI 4 %r4) resulting in: (insn 40 16 17 4 (set (reg:DI 4 %r4) (mem/c:DI (plus:SI (mult:SI (reg/v:SI 1 %r1 [orig:26 i ] [26]) (const_int 8 [0x8])) (reg/v/f:SI 3 %r3 [orig:30 s ] [30])) [1 MEM[(const struct s )s_13(D) + _7 * 8]+0 S8 A32])) "pr58901-1.c":18:20 11 {movdi} (nil)) (insn 17 40 41 4 (set (reg:SI 4 %r4) (and:SI (reg:SI 5 %r5 [+4 ]) (const_int 1 [0x1]))) "pr58901-1.c":18:20 77 {andsi_const_int} (nil)) and assembly like: .L3: movq (%r3)[%r1],%r4 bicl3 $-2,%r5,%r4 addl2 %r4,%r0 jaoblss %r0,%r1,.L3 First posted at: <https://gcc.gnu.org/ml/gcc/2014-06/msg00060.html>. 2020-12-05 Matt Thomas <matt@3am-software.com> Maciej W. Rozycki <macro@linux-mips.org> gcc/ PR target/58901 reload.c (push_reload): Also reload the inner expression of a SUBREG for pseudos associated with a mode-dependent memory reference. (find_reloads): Force a reload likewise. 2020-12-05 Maciej W. Rozycki <macro@linux-mips.org> gcc/testsuite/ PR target/58901 * gcc.c-torture/compile/pr58901-0.c: New test. * gcc.c-torture/compile/pr58901-1.c: New test.	2020-12-05 18:26:23 +00:00
Roman Zhuykov	4eb8f93d02	modulo-sched: Carefully process loop counter initialization [PR97421] Do not allow direct adjustment of pre-header initialization instruction for count register if is read in some instruction below in that basic block. gcc/ChangeLog: PR rtl-optimization/97421 * modulo-sched.c (generate_prolog_epilog): Remove forward declaration, adjust last argument name and type. (const_iteration_count): Add bool pointer parameter to return whether count register is read in pre-header after its initialization. (sms_schedule): Fix count register initialization adjustment procedure according to what const_iteration_count said. gcc/testsuite/ChangeLog: PR rtl-optimization/97421 * gcc.c-torture/execute/pr97421-1.c: New test. * gcc.c-torture/execute/pr97421-2.c: New test. * gcc.c-torture/execute/pr97421-3.c: New test.	2020-12-05 18:45:27 +03:00
Paul Thomas	7ae210d5db	Fortran: flag formal argument before resolving an array spec [PR98016]. 2020-12-05 Paul Thomas <pault@gcc.gnu.org> gcc/fortran PR fortran/98016 * resolve.c (resolve_symbol): Set formal_arg_flag before resolving an array spec and restore value afterwards. gcc/testsuite/ PR fortran/98016 * gfortran.dg/pr98016.f90: New test.	2020-12-05 14:14:19 +00:00
Iain Sandoe	1352bc88a0	Darwin : Update libtool and dependencies for Darwin20 [PR97865] The change in major version (and the increment from Darwin19 to 20) caused libtool tests to fail which resulted in incorrect build settings for shared libraries. We take this opportunity to sort out the shared undefined symbols state rather than propagating the current unsound behaviour into a new rev. This change means that we default to the case that missing symbols are considered an error, and if one wants to allow this intentionally, the confiuration for that case should be set appropriately. Three existing cases need undefined dynamic lookup: libitm, where there is already a configuration mechanism to add the flags. libcc1, where we add simple configuration to add the flags for Darwin. libsanitizer, where we can add to the existing extra flags. libcc1/ChangeLog: PR target/97865 * Makefile.am: Add dynamic_lookup to LD flags for Darwin. * configure.ac: Test for Darwin host and set a flag. * Makefile.in: Regenerate. * configure: Regenerate. libitm/ChangeLog: PR target/97865 * configure.tgt: Add dynamic_lookup to XLDFLAGS for Darwin. * configure: Regenerate. libsanitizer/ChangeLog: PR target/97865 * configure.tgt: Add dynamic_lookup to EXTRA_CXXFLAGS for Darwin. * configure: Regenerate. ChangeLog: PR target/97865 * libtool.m4: Update handling of Darwin platform link flags for Darwin20. gcc/ChangeLog: PR target/97865 * configure: Regenerate. libatomic/ChangeLog: PR target/97865 * configure: Regenerate. libbacktrace/ChangeLog: PR target/97865 * configure: Regenerate. libffi/ChangeLog: PR target/97865 * configure: Regenerate. libgfortran/ChangeLog: PR target/97865 * configure: Regenerate. libgomp/ChangeLog: PR target/97865 * configure: Regenerate. libhsail-rt/ChangeLog: PR target/97865 * configure: Regenerate. libobjc/ChangeLog: PR target/97865 * configure: Regenerate. libphobos/ChangeLog: PR target/97865 * configure: Regenerate. libquadmath/ChangeLog: PR target/97865 * configure: Regenerate. libssp/ChangeLog: PR target/97865 * configure: Regenerate. libstdc++-v3/ChangeLog: PR target/97865 * configure: Regenerate. libvtv/ChangeLog: PR target/97865 * configure: Regenerate. zlib/ChangeLog: PR target/97865 * configure: Regenerate.	2020-12-05 08:43:20 +00:00
Venkataramanan Kumar	3e2ae3ee28	X86_64: Enable support for next generation AMD Zen3 CPU. 2020-12-03 Venkataramanan Kumar <Venkataramanan.Kumar@amd.com> Sharavan Kumar <Shravan.Kumar@amd.com> gcc/ChangeLog: * common/config/i386/cpuinfo.h (get_amd_cpu) recognize znver3. * common/config/i386/i386-common.c (processor_names): Add znver3. (processor_alias_table): Add znver3 and AMDFAM19H entry. * common/config/i386/i386-cpuinfo.h (processor_types): Add AMDFAM19H. (processor_subtypes): AMDFAM19H_ZNVER3. * config.gcc (i[34567]86--linux \| ...): Likewise. * config/i386/driver-i386.c: (host_detect_local_cpu): Let -march=native recognize znver3 processors. * config/i386/i386-c.c (ix86_target_macros_internal): Add znver3. * config/i386/i386-options.c (m_znver3): New definition. (m_ZNVER): Include m_znver3. (processor_cost_table): Add znver3. * config/i386/i386.c (ix86_reassociation_width): Likewise. * config/i386/i386.h (TARGET_znver3): New definition. (enum processor_type): Add PROCESSOR_ZNVER3. * config/i386/i386.md (define_attr "cpu"): Add znver3. * config/i386/x86-tune-sched.c: (ix86_issue_rate): Likewise. (ix86_adjust_cost): Likewise. * config/i386/x86-tune.def (X86_TUNE_AVOID_256FMA_CHAINS: Likewise. * config/i386/znver1.md: Add new reservations for znver3. * doc/extend.texi: Add details about znver3. * doc/invoke.texi: Likewise. gcc/testsuite/ChangeLog: * gcc.target/i386/funcspec-56.inc: Handle new march. * g++.target/i386/mv29.C: New file.	2020-12-05 11:19:35 +05:30
Jakub Jelinek	625e002396	i386: Combine splitters followup [PR96226] Here is the patch to simplify the newly added combine splitters, when we split into 2 insns anyway, no reason to split into the masking define_insn_and_split we'd be splitting shortly after. 2020-12-05 Jakub Jelinek <jakub@redhat.com> PR target/96226 * config/i386/i386.md (splitter after <rotate_insn><mode>3_mask, splitter after <rotate_insn><mode>3_mask_1): Drop the masking from the patterns to split into.	2020-12-05 01:31:08 +01:00
Jakub Jelinek	43e84ce7d6	c++: Fix constexpr access to union member through pointer-to-member [PR98122] We currently incorrectly reject the first testcase, because cxx_fold_indirect_ref_1 doesn't attempt to handle UNION_TYPEs. As the second testcase shows, it isn't that easy, because I believe we need to take into account the active member and prefer that active member over other members, because if we pick a non-active one, we might reject valid programs. 2020-12-05 Jakub Jelinek <jakub@redhat.com> PR c++/98122 * constexpr.c (cxx_union_active_member): New function. (cxx_fold_indirect_ref_1): Add ctx argument, pass it through to recursive call. Handle UNION_TYPE. (cxx_fold_indirect_ref): Add ctx argument, pass it to recursive calls and cxx_fold_indirect_ref_1. (cxx_eval_indirect_ref): Adjust cxx_fold_indirect_ref calls. * g++.dg/cpp1y/constexpr-98122.C: New test. * g++.dg/cpp2a/constexpr-98122.C: New test.	2020-12-05 01:30:08 +01:00
GCC Administrator	c5fd8a9157	Daily bump.	2020-12-05 00:16:39 +00:00
Ian Lance Taylor	918a5b84a2	runtime: update type descriptor name in fieldtrack C support code We were using the old name, but nothing noticed because it is a weak reference that is permitted to be nil, so that it works with code that does not use the field tracking library. Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/275449	2020-12-04 14:51:09 -08:00
Jason Merrill	a95753214b	c++: Fix deduction from auto template parameter [PR93083] The check in do_class_deduction to handle passing one class placeholder template parm as an argument for itself needed to be extended to also handle equivalent parms from other templates. gcc/cp/ChangeLog: PR c++/93083 * pt.c (convert_template_argument): Handle equivalent placeholders. (do_class_deduction): Look through EXPR_PACK_EXPANSION, too. gcc/testsuite/ChangeLog: PR c++/93083 * g++.dg/cpp2a/nontype-class40.C: New test.	2020-12-04 17:47:05 -05:00
Jason Merrill	df933e307b	vec: Simplify use with C++11 range-based 'for'. It looks cleaner if we can use a vec* directly as a range for the C++11 range-based 'for' loop, without needing to indirect from it, and also works with null pointers. The change in cp_parser_late_parsing_default_args is an example of how this can be used to simplify a simple loop over a vector. Reverse or subset iteration will require adding range adaptors. I deliberately didn't format the new overloads for etags since they are trivial. gcc/ChangeLog: * vec.h (begin, end): Add overloads for vec. tree.c (build_constructor_from_vec): Remove . gcc/cp/ChangeLog: decl2.c (clear_consteval_vfns): Remove . pt.c (do_auto_deduction): Remove . parser.c (cp_parser_late_parsing_default_args): Change loop to use range 'for'.	2020-12-04 14:45:25 -05:00
David Edelsohn	b96802994a	rs6000: fix PTR_SIZE in rs6000.c The recent change to rs6000.c for DWARF in AIX references the macro PTR_SIZE that only is defined in dwarf2out.c. This patch changes the reference to the equivalent POINTER_SIZE_UNITS defined in defaults.h. gcc/ChangeLog: * config/rs6000/rs6000.c (rs6000_option_override_internal): Change PTR_SIZE to POINTER_SIZE_UNITS.	2020-12-04 14:33:06 -05:00
Hans-Peter Nilsson	eb79f4db49	doc/implement-c.texi: About same-as-scalar-type volatile aggregate accesses, PR94600 We say very little about reads and writes to aggregate / compound objects, just scalar objects (i.e. assignments don't cause reads). Let's lets say something safe about aggregate objects, but only for those that are the same size as a scalar type. There's an equal-sounding section (Volatiles) in extend.texi, but this seems a more appropriate place, as specifying the behavior of a standard qualifier. gcc: 2020-12-04 Hans-Peter Nilsson <hp@axis.com> Martin Sebor <msebor@redhat.com> PR middle-end/94600 * doc/implement-c.texi (Qualifiers implementation): Add blurb about access to the whole of a volatile aggregate object, only for same-size as a scalar object.	2020-12-04 20:29:27 +01:00
Jakub Jelinek	78c4a9fece	gimple: Return fnspec only for replaceable new/delete operators called from new/delete [PR98130] As mentioned in the PR, we shouldn't treat non-replaceable operator new/delete (e.g. with the placement new) as replaceable ones. There is some pending discussion that perhaps operator delete called from delete if not replaceable should return some other fnspec, but can we handle that incrementally, fix this wrong-code and then deal with a missed optimization? I really don't know what exactly should be returned. 2020-12-04 Jakub Jelinek <jakub@redhat.com> PR c++/98130 * gimple.c (gimple_call_fnspec): Only return ".co " for replaceable operator delete or ".mC" for replaceable operator new called from new/delete. * g++.dg/opt/pr98130.C: New test.	2020-12-04 19:10:56 +01:00
Jakub Jelinek	ac2a6962b9	i386: Add combine splitters to allow combining multiple insns into reg1 = const; reg2 = rotate (reg1, reg3 & cst) [PR96226] As mentioned in the PR, we can combine ~(1 << x) into -2 r<< x, but we give up in the ~(1 << (x & 31)) cases, as <rotate_insn><mode>3_mask don't allow immediate operand 1 and find_split_point prefers to split (x & 31) instead of the constant. With these combine splitters we help combine decide how to split those insns. 2020-12-04 Jakub Jelinek <jakub@redhat.com> PR target/96226 * config/i386/i386.md (splitter after <rotate_insn><mode>3_mask, splitter after <rotate_insn><mode>3_mask_1): New combine splitters. * gcc.target/i386/pr96226.c: New test.	2020-12-04 18:44:31 +01:00
Jakub Jelinek	33be07be9e	fold-const: Don't use build_constructor for non-aggregate types in native_encode_initializer [PR93121] The following testcase is rejected, because when trying to encode a zeroing CONSTRUCTOR, the code was using build_constructor to build initializers for the elements but when recursing the function handles CONSTRUCTOR only for aggregate types. The following patch fixes that by using build_zero_cst instead for non-aggregates. Another option would be add handling CONSTRUCTOR for non-aggregates in native_encode_initializer. Or we can do both, I guess the middle-end generally doesn't like CONSTRUCTORs for scalar variables, but am not 100% sure if the FE doesn't produce those sometimes. 2020-12-04 Jakub Jelinek <jakub@redhat.com> PR libstdc++/93121 * fold-const.c (native_encode_initializer): Use build_zero_cst instead of build_constructor. * g++.dg/cpp2a/bit-cast6.C: New test.	2020-12-04 18:02:10 +01:00
Nathan Sidwell	5a26d4a204	c++: Revert dependent-array changes [PR 98116] The changes reverted here are exposing an existing problem with alias template comparisons. The typename_type changes are also incomplete, possibly for similar reasons. It seems safer to revert them, fix the underlying issue and then move forwards. The testcases is adjusted to more robustly check the specialization table, and ICEs with and without the c++ changes. Revert: 62fb1b9e0da c++: Fix array type dependency [PR 98107] 07589ca2b2c c++: typename_type structural comparison 29ae1d7751 c++: Extend build_array_type API PR c++/98116 gcc/cp/ * cp-tree.h (comparing_typenames): Delete. (cplus_build_array_type): Remove default parm. * pt.c (comparing_typenames): Delete. (spec_hasher::equal): Don't increment it. * tree.c (set_array_type_canon): Remove dep parm. (build_cplus_array_type): Remove dep parm changes. (cp_build_qualified_type_real): Remove dependent array type changes. (strip_typedefs): Likewise. * typeck.c (structural_comptypes): Revert comparing_typename changes. gcc/testsuite/ * g++.dg/template/pr98116.C: Enable robust checking.	2020-12-04 08:38:57 -08:00
Nathan Sidwell	97eaf8c92f	c++: Module API declarations This provides the inline predicates about module state, and declares the functions to be provided. gcc/cp/ * cp-tree.h: Add various inline module state predicates, and declare the API that will be provided by modules.cc	2020-12-04 05:23:22 -08:00
Jakub Jelinek	704ccefb57	debug: Fix another vector DECL_MODE ICE [PR98100] The PR88587 fix changes DECL_MODE of vars with vector type during inlining/cloning when the vars are copied, so that their DECL_MODE matches their TYPE_MODE in the new function. Unfortunately, the following testcase still ICEs, the var isn't really used in the new function and so it isn't copied, but becomes just a nonlocalized var. So we can't adjust its DECL_MODE because it appears in multiple functions and needs different modes in between them. The following patch changes the DEBUG_INSN creation to use TYPE_MODE instead of DECL_MODE for vars with vector types. 2020-12-04 Jakub Jelinek <jakub@redhat.com> PR target/98100 * cfgexpand.c (expand_gimple_basic_block): For vars with vector type, use TYPE_MODE rather than DECL_MODE. * gcc.target/i386/pr98100.c: New test.	2020-12-04 12:18:21 +01:00
Jakub Jelinek	65312dfc64	dwarf: Add -gdwarf{32,64} options The following patch makes the choice between 32-bit and 64-bit DWARF formats selectable by command line switch, rather than being hardcoded through DWARF_OFFSET_SIZE macro. The options themselves don't turn on debug info themselves, so one needs to use -g -gdwarf64 or similar. 2020-12-04 Jakub Jelinek <jakub@redhat.com> * common.opt (-gdwarf32, -gdwarf64): New options. * config/rs6000/rs6000.c (rs6000_option_override_internal): Default dwarf_offset_size to 8 if not overridden from the command line. * dwarf2out.c: Change all occurrences of DWARF_OFFSET_SIZE to dwarf_offset_size. * doc/invoke.texi (-gdwarf32, -gdwarf64): Document.	2020-12-04 10:54:45 +01:00
Martin Liska	485b40a527	testsuite: use param for if-to-switch tests gcc/testsuite/ChangeLog: PR testsuite/98123 * gcc.dg/tree-ssa/if-to-switch-4.c: Add param to make the test stable on all architectures. * gcc.dg/tree-ssa/if-to-switch-6.c: Likewise. * gcc.dg/tree-ssa/if-to-switch-8.c: Likewise.	2020-12-04 10:45:22 +01:00

1 2 3 4 5 ...

181942 Commits