mirror/gcc - gcc - Collaboration & Innovation

mirror/gcc

mirror of git://gcc.gnu.org/git/gcc.git synced 2025-02-26 10:35:50 +08:00

Author	SHA1	Message	Date
Georg-Johann Lay	c3db52bb47	AVR: target/84211 - Add a post reload register optimization pass. This introduces a new post reload pass that tracks known values held in registers and performs optimizations based on that knowledge. It runs between the two instances of the RTL peephole pass. The optimizations are activated by new option -mfuse-move=<0,23> which provides a 3:2:2:2 mixed radix value: Digit 0: Activates try_fuse: Tries to use a MOVW instead of two LDIs. Digit 1: Activates try_bin_arg1: Simplify the 2nd operand of a binary operation, for example X xor Y can be simplified to X when Y = 0. When Y is an expensive constant that's already held in some register R, then the expression can be replaced by X xor R. Digit 2: Activates try_split_any: Split multi-byte operations like shifts into 8-bit instructions. Digit 3: Activates try_split_ldi: Decompose LDI-like insns into a sequence of instructions with better performance. For example, R2[4] = 0x1ff may be performed as: CLR R5 CLR R4 MOVW R2, R4 INC R3 DEC R2 Digit 3 can have a value of 0, 1 or 2, where value=2 may come up with code that performs better than with value=1 at the expense of reduced traceability of the generated assembly code. 
Here are some examples: Without optimization \| With optimization ==================== \| ================= long long fn_zero (void) { return 0; } ldi r18, 0 ; movqi_insn \| ldi r18, 0 ; movqi_insn ldi r19, 0 ; movqi_insn \| ldi r19, 0 ; movqi_insn ldi r20, 0 ; movqi_insn \| movw r20, r18 ; movhi ldi r21, 0 ; movqi_insn \| ldi r22, 0 ; movqi_insn \| movw r22, r18 ; movhi ldi r23, 0 ; movqi_insn \| ldi r24, 0 ; movqi_insn \| movw r24, r18 ; movhi ldi r25, 0 ; movqi_insn \| ret \| ret int fn_eq0 (char c) { return c == 0; } mov r18, r24 ; movqi_insn \| mov r18, r24 ; movqi_insn ldi r24, 1 ; movhi \| ldi r24, 1 ; movhi ldi r25, 0 \| ldi r25, 0 cp r18, ZERO ; cmpqi3 \| cpse r18, ZERO ; peephole breq .+4 ; branch \| ldi r24, 0 ; movhi \| ldi r24, 0 ; movqi_insn ldi r25, 0 \| ret \| ret unsigned fn_crc (unsigned x, unsigned y) { for (char i = 8; i--; x <<= 1) y ^= (x ^ y) & 0x80 ? 79u : 0u; return y; } movw r18, r24 ; movhi \| movw r18, r24 ; movhi movw r24, r22 ; movhi \| movw r24, r22 ; movhi ldi r22, 8 ; movqi_insn \| ldi r22, 8 ; movqi_insn .L13: \| .L13: movw r30, r18 ; movhi \| movw r30, r18 ; movhi eor r30, r24 ; xorqi3 \| eor r30, r24 ; xorqi3 eor r31, r25 ; xorqi3 \| eor r31, r25 ; xorqi3 mov r20, r30 ; andhi3 \| mov r20, r30 ; andqi3 andi r20, 1<<7 \| andi r20, 1<<7 clr r21 \| sbrs r30, 7 ; sbrx_branchhi \| sbrc r30, 7 ; sbrx_branchhi rjmp .+4 \| ldi r20, 79 ; movqi_insn \| ldi r20, 79 ; movqi_insn ldi r21, 0 ; movqi_insn \| eor r24, r20 ; xorqi3 \| eor r24, r20 ; xorqi3 eor r25, r21 ; xorqi3 \| lsl r18 ; ashlhi3_const \| lsl r18 ; ashlhi3_const rol r19 \| rol r19 subi r22, 1 ; op8.for.cczn.p\| subi r22, 1 ; op8.for.cczn.plus brne .L13 ; branch_ZN \| brne .L13 ; branch_ZN ret \| ret #define SPDR (*(uint8_t volatile*) 0x2c) void fn_PR49807 (long big) { SPDR = big >> 24; SPDR = big >> 16; SPDR = big >> 8; SPDR = big; } movw r20, r22 ; movhi \| movw r20, r22 ; movhi movw r22, r24 ; movhi \| movw r22, r24 ; movhi mov r24, r23 ; ashrsi3_const \| clr r27 \| sbrc r24,7 
\| com r27 \| mov r25, r27 \| mov r26, r27 \| out 0xc, r24 ; movqi_insn \| out 0xc, r23 ; movqi_insn movw r24, r22 ; ashrsi3_const \| clr r27 \| sbrc r25, 7 \| com r27 \| mov r26, r27 \| out 0xc, r24 ; movqi_insn \| out 0xc, r24 ; movqi_insn clr r27 ; ashrsi3_const \| sbrc r23, 7 \| dec r27 \| mov r26, r23 \| mov r25, r22 \| mov r24, r21 \| out 0xc, r24 ; movqi_insn \| out 0xc, r21 ; movqi_insn out 0xc, r20 ; movqi_insn \| out 0xc, r20 ; movqi_insn ret \| ret PR target/84211 gcc/ * doc/invoke.texi (AVR Options) [-mfuse-move]: Document new option. * common/config/avr/avr-common.cc (avr_option_optimization_table): Set -mfuse-move= depending on optimization level. * config/avr/avr.opt (-mfuse-move, -mfuse-move=): New options. * config/avr/t-avr (avr-passes.o): Depend on avr-passes-fuse-move.h. * config/avr/avr-passes-fuse-move.h: New file, used by avr-passes.cc. * config/avr/avr-passes.def (avr_pass_fuse_move): Insert new pass. * config/avr/avr-passes.cc (INCLUDE_ARRAY): Define it. (insn-attr.h): Include it. (avr_pass_data_fuse_move): New const pass_data. (avr_pass_fuse_move): New public rtl_opt_pass class. (make_avr_pass_fuse_move): New function. (gprmask_t): New typedef. (next_nondebug_insn_bb, prev_nondebug_insn_bb) (single_set_with_scratch, size_to_mask, size_to_mode) (emit_valid_insn, emit_valid_move_clobbercc) (gpr_regno_p, regmask, has_bits_in) (find_arith, find_arith2, any_shift_p): New local functions. (AVRasm): New namespace. (FUSE_MOVE_MAX_MODESIZE): New define. (avr-passes-fuse-move.h): New include. (memento_t, absint_t, absins_byte_t, absint_val_t) (optimize_data_t, insn_optimizedata_t, find_plies_data_t) (insninfo_t, bbinfo_t, ply_t, plies_t): New structs / classes. * config/avr/avr-protos.h (avr_chunk, avr_byte, avr_word, avr_int8) (avr_uint8, avr_int16, avr_uint16) (avr_out_set_some, avr_set_some_operation) (output_reload_in_const, make_avr_pass_fuse_move): New protos. (avr_dump): Depend macro definition on GCC_DUMPFILE_H. 
* config/avr/avr.cc (avr_option_override): Insert after pass "avr-fuse-move" instead of after "peephole2". (avr_chunk, avr_byte, avr_word, avr_int8, avr_uint8, avr_int16) (avr_uint16, output_reload_in_const): Functions are no longer static. (avr_out_set_some, avr_set_some_operation): New functions. (ashrqi3_out, ashlqi3_out) [offset=7]: Handle "r,r,C07" alternative. (avr_out_insert_notbit): Comment also allows QImode. (avr_adjust_insn_length) [ADJUST_LEN_SET_SOME]: Handle case. * config/avr/avr.md (adjust_len) <set_some>: New attribute value. (set_some): New insn. (andqi3, *andqi3): Add "r,r,Cb1" alternative. (ashrqi3, *ashrqi3 ashlqi3, *ashlqi3): Add a "r,r,C07" alternative. (gen_move_clobbercc_scratch): New emit helper. * config/avr/constraints.md (Cb1): New constraint. * config/avr/predicates.md (dreg_or_0_operand, set_some_operation): New. * config/avr/avr-log.cc (avr_forward_to_printf): New static func. (avr_log_vadump): Use it to recognize more formats. gcc/testsuite/ * gcc.target/avr/torture/test-gprs.h: New file. * gcc.target/avr/torture/pr84211-fuse-move-1.c: New test. * gcc.target/avr/torture/pr84211-fuse-move-2.c: New test.	2024-11-18 19:14:57 +01:00
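The 3:2:2:2 mixed-radix encoding of `-mfuse-move=<0,23>` described in commit c3db52bb47 can be sketched as below. This is only an illustration, not code from the patch: the struct and function names are hypothetical, and it assumes that digit 0 is the least significant digit, so that the value equals d0 + 2*d1 + 4*d2 + 8*d3.

```c
#include <assert.h>

/* Hypothetical container: one field per digit of the -mfuse-move value. */
struct fuse_move_digits {
    unsigned try_fuse;      /* digit 0, radix 2: MOVW instead of two LDIs */
    unsigned try_bin_arg1;  /* digit 1, radix 2: simplify 2nd binary operand */
    unsigned try_split_any; /* digit 2, radix 2: split multi-byte operations */
    unsigned try_split_ldi; /* digit 3, radix 3: 0, 1 or 2 */
};

/* Assumption: digit 0 is the least significant digit of the value. */
static struct fuse_move_digits decode_fuse_move(unsigned value)
{
    struct fuse_move_digits d;

    assert(value <= 23);              /* 2 * 2 * 2 * 3 = 24 combinations */
    d.try_fuse      = value % 2; value /= 2;
    d.try_bin_arg1  = value % 2; value /= 2;
    d.try_split_any = value % 2; value /= 2;
    d.try_split_ldi = value;          /* what remains is in 0..2 */
    return d;
}
```

Under that assumption, `decode_fuse_move(23)` enables every optimization with digit 3 at its maximum of 2, and `decode_fuse_move(0)` disables all of them.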
Harald Anlauf	386f6d98ba	Fortran: add bounds-checking for ALLOCATE of CHARACTER with type-spec [PR53357] Fix a rejects-(potentially)-valid code for ALLOCATE of CHARACTER with type-spec, and implement a string-length check for -fcheck=bounds. Implement more detailed errors or warnings when character function declarations and references do not match. PR fortran/53357 gcc/fortran/ChangeLog: * dependency.cc (gfc_dep_compare_expr): Return correct result if relationship of expressions could not be determined. * interface.cc (gfc_check_result_characteristics): Implement error messages if character function declarations and references do not agree, else emit warning in cases where a mismatch is suspected. * trans-stmt.cc (gfc_trans_allocate): Implement a string length check for -fcheck=bounds. gcc/testsuite/ChangeLog: * gfortran.dg/auto_char_len_4.f90: Adjust patterns. * gfortran.dg/typebound_override_1.f90: Likewise. * gfortran.dg/bounds_check_strlen_10.f90: New test.	2024-11-18 19:04:15 +01:00
Richard Biener	c108785c42	tree-optimization/117594 - fix live op vectorization for length masked case The code was passing factor == 0 to vect_get_loop_len which always returns an unmodified length, even if the number of scalar elements doesn't agree. It also failed to insert the eventually generated code. PR tree-optimization/117594 * tree-vect-loop.cc (vectorizable_live_operation_1): Pass factor == 1 to vect_get_loop_len, insert generated stmts. * gcc.dg/vect/pr117594.c: New testcase.	2024-11-18 18:57:21 +01:00
Jeff Law	f5ceca9627	[committed][RISC-V][PR target/117595] Fix bogus use of simplify_gen_subreg And stage3 begins... Zdenek's fuzzer caught this one. Essentially using simplify_gen_subreg directly with an offset of 0 when we just needed a lowpart. The offset of 0 works for little endian, but for big endian it's simply wrong. simplify_gen_subreg will return NULL_RTX because the case isn't representable. We then embed that NULL_RTX into an insn that's later scanned during mark_jump_label. Scanning the port I see a couple more instances of this incorrect idiom. One is pretty obvious to fix. The others look a bit goofy and I'll probably need to sync with Patrick on them. Anyway tested on riscv64-elf and riscv32-elf with no regressions. Pushing to the trunk. PR target/117595 gcc/ * config/riscv/sync.md (atomic_compare_and_swap<mode>): Use gen_lowpart rather than simplify_gen_subreg. * config/riscv/riscv.cc (riscv_legitimize_move): Similarly. gcc/testsuite/ * gcc.target/riscv/pr117595.c: New test.	2024-11-18 10:55:09 -07:00
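Why a subreg offset of 0 only denotes the low part on little-endian targets (commit f5ceca9627) can be seen in a simplified byte-offset model. The function below is a hypothetical sketch, not GCC's `gen_lowpart` or `simplify_gen_subreg`; it assumes a single endianness flag and ignores targets where byte and word order differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Byte offset of the low `outer_size` bytes inside an `inner_size`-byte
   value.  Simplified model: one flag covers both byte and word order. */
static size_t lowpart_byte_offset(size_t inner_size, size_t outer_size,
                                  bool big_endian)
{
    assert(outer_size <= inner_size);
    /* Little endian stores the least significant byte first, so the low
       part starts at offset 0.  Big endian stores it last. */
    return big_endian ? inner_size - outer_size : 0;
}
```

For a 4-byte low part of an 8-byte value, this yields offset 0 on little endian and offset 4 on big endian, which is why hard-coding 0 is only correct for the former.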
Gaius Mulley	ab7abf1db0	PR modula2/117660: Errors referring to variables of type array could display full declaration This patch ensures that the tokens defining the full declaration of an ARRAY type are stored in the symbol table and used during production of error messages. gcc/m2/ChangeLog: PR modula2/117660 * gm2-compiler/P2Build.bnf (ArrayType): Update tok with the composite token produced during array type declaration. * gm2-compiler/P2SymBuild.mod (EndBuildArray): Create the combinedtok and store it into the symbol table. Also ensure combinedtok is pushed to the quad stack. (BuildFieldArray): Preserve typetok. * gm2-compiler/SymbolTable.def (PutArray): Rename parameters. * gm2-compiler/SymbolTable.mod (PutArray): Rename parameters. gcc/testsuite/ChangeLog: PR modula2/117660 * gm2/iso/fail/arraymismatch.mod: New test. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>	2024-11-18 17:51:37 +00:00
Georg-Johann Lay	bba27015f2	AVR: target/117659 - Fix wrong code for u24 << 16. gcc/ PR target/117659 * config/avr/avr.cc (avr_out_ashlpsi3) [case 16]: Use %A1 as input (instead of bogus %A0).	2024-11-18 18:16:35 +01:00
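The intended result of a 24-bit `u24 << 16` (commit bba27015f2) can be modelled in portable C. This is a hedged sketch, not the AVR back end's output code: it emulates the 24-bit value in a `uint32_t`, and the function name is hypothetical. Bytes are named A (low), B, C (high), following the `%A0` / `%A1` operand notation in the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Emulated 24-bit left shift by 16: the high result byte C is taken from
   the low byte A of the *input*, and the two lower result bytes are zero. */
static uint32_t shl16_u24(uint32_t x)
{
    uint8_t in_a, out_a, out_b, out_c;

    assert(x <= 0xFFFFFFu);          /* value must fit in 24 bits */
    in_a  = (uint8_t)(x & 0xFFu);    /* byte A of the input operand */
    out_c = in_a;                    /* reading the output's byte A here
                                        would be the bug the commit fixes */
    out_b = 0;
    out_a = 0;
    return ((uint32_t)out_c << 16) | ((uint32_t)out_b << 8) | out_a;
}
```

The result matches `(x << 16) & 0xFFFFFF` for every 24-bit input.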
Jeff Law	1100c0576b	Fix more c23 bool fallout While these haven't shown up in my tester (not configs I test) and I think we're likely going to be deprecating the nds32 target, we might as well go ahead and fix them. I'm going to include this under the pr117628 umbrella. PR target/117628 libgcc/ * config/arm/freebsd-atomic.c (bool): Remove unnecessary typedef. * config/arm/linux-atomic-64bit.c: Likewise. * config/arm/linux-atomic.c: Likewise. * config/nds32/linux-atomic.c: Likewise. * config/nios2/linux-atomic.c: Likewise.	2024-11-18 10:11:01 -07:00
Jeff Law	39a39d1f38	[RFA] Fix csky and c6x build failures csky fails to build libgcc after the c23 changes because it has a typedef for bool. AFAICT it's internal to the file, so removing the typedef isn't an ABI change. Similarly for c6x which includes unwind-arm-common.inc. I suspect most, if not all of the arm-v7 and older targets are failing to build right now. I've built and regression tested both csky-linux-gnu and c6x-elf with this change. OK for the trunk? PR target/117628 libgcc/ * config/csky/linux-atomic.c (bool): Remove unnecessary typedef. * unwind-arm-common.inc (bool): Similarly.	2024-11-18 10:01:32 -07:00
Gaius Mulley	e37641458e	PR modula2/117371: Add check for zero step in for loop This patch is a follow on from PR modula2/117371 which could include a check to enforce the ISO restriction on a zero for loop step. gcc/m2/ChangeLog: PR modula2/117371 * gm2-compiler/M2GenGCC.mod (PerformLastForIterator): Add check for zero step value and issue an error message. gcc/testsuite/ChangeLog: PR modula2/117371 * gm2/iso/fail/forloopbyzero.mod: New test. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>	2024-11-18 16:34:42 +00:00
Eric Botcazou	70999668a1	ada: Fix interaction of aspect Predicate and static case expressions The semantics of the GNAT-specific Predicate aspect should be equivalent to those of the Static_Predicate aspect when the predicate expression is static, but that is not correctly implemented for static case expressions. gcc/ada/ChangeLog: * exp_ch4.adb (Expand_N_Case_Expression): Remove the test on enclosing predicate function for the return optimization. Rewrite it in the general case to catch all nondynamic predicates. (Expand_N_If_Expression): Remove the test on enclosing predicate function for the return optimization.	2024-11-18 15:06:55 +01:00
Bob Duff	4e23ce5070	ada: Atomic_Synchronization is not a user-visible check Remove all user-level documentation of the check name "Atomic_Synchronization". The documentation was confusing because this check should never be used in source code, and because it raises the question of whether All_Checks applies to it (it does not). Change the name Atomic_Synchronization to be _Atomic_Synchronization (with a leading underscore) so that it cannot be used in source code. This "check" is not really a check at all; it is used only internally in the implementation of Disable/Enable_Atomic_Synchronization, because the placement and scope of these pragmas match pragma Suppress. gcc/ada/ChangeLog: * doc/gnat_rm/implementation_defined_characteristics.rst: Remove Atomic_Synchronization. * doc/gnat_ugn/building_executable_programs_with_gnat.rst: Likewise. * doc/gnat_rm/implementation_defined_pragmas.rst: DRY. Consolidate documentation of Disable/Enable_Atomic_Synchronization. * checks.adb: Comment fix. * exp_util.ads: Likewise. * targparm.ads: Likewise. * types.ads: Likewise. * gnat1drv.adb: Likewise. DRY. * sem_prag.adb (Process_Disable_Enable_Atomic_Sync): Change name of Atomic_Synchronization to start with underscore. (Process_Suppress_Unsuppress): No need to check Comes_From_Source for Atomic_Synchronization anymore; _Atomic_Synchronization can never come from source. (Anyway, it shouldn't be ignored; it should be an error.) * snames.ads-tmpl (Atomic_Synchronization): Change name to start with underscore. * switch-c.adb (Scan_Front_End_Switches): Minor cleanup: Use 'in'. * gnat_rm.texi: Regenerate. * gnat_ugn.texi: Regenerate.	2024-11-18 15:06:55 +01:00
Eric Botcazou	70faad1961	ada: Fix small oversight in removal of N_Unchecked_Expression node In addition to Resolve_Indexed_Component, Eval_Indexed_Component can also set the Do_Range_Check flag on the expressions of an N_Indexed_Component node through the call on Check_Non_Static_Context, so this also needs to be blocked by the Kill_Range_Check flag. gcc/ada/ChangeLog: * sem_eval.adb (Eval_Indexed_Component): Clear Do_Range_Check on the expressions if Kill_Range_Check is set on the node.	2024-11-18 15:06:55 +01:00
Eric Botcazou	b4fd15d8be	ada: Fix another minor fallout of previous changes to aggregate expansion This is another glitch associated with Initialization_Statements. gcc/ada/ChangeLog: * exp_util.adb (Remove_Init_Call): Rewrite a compound statement in the Initialization_Statements of the variable as a null statement instead of removing it. * freeze.adb (Explode_Initialization_Compound_Statement): Small comment tweaks.	2024-11-18 15:06:55 +01:00
Eric Botcazou	1b24e30cab	ada: Fix another minor fallout of previous changes to aggregate expansion The processing of static array aggregates in Exp_Aggr requires that their bounds be representable as Int(eger) values for practical purposes, and the previous changes have exposed another path where this is not checked. This introduces a UI_Are_In_Int_Range local predicate for convenience. gcc/ada/ChangeLog: * exp_aggr.adb (UI_Are_In_Int_Range): New predicate. (Aggr_Size_OK): Use it. (Flatten): Likewise. (Packed_Array_Aggregate_Handled): Likewise. (Static_Array_Aggregate): Likewise.	2024-11-18 15:06:55 +01:00
Eric Botcazou	3716d9887c	ada: Fix minor fallout of previous changes to aggregate expansion The problem occurs for an anonymous array object declared with an aspect and when pragma {Initialize,Normalize}_Scalars is in effect: in this case, the synthesized aggregate is attached to the Initialization_Statements field by Convert_Aggr_In_Object_Decl, but Explode_Initialization_Compound_Statement puts it back at the point of declaration instead of the freeze point, thus voiding the effects of the mechanism. This was previously hidden because of a bypass in Freeze_Entity which drops the freeze node on the floor in this case, so the change fixes the issue and removes the bypass in the process. gcc/ada/ChangeLog: * freeze.ads (Explode_Initialization_Compound_Statement): Adjust the description. * freeze.adb (Explode_Initialization_Compound_Statement): If the entity has its freezing delayed, append the initialization actions to its freeze actions. (Freeze_Object_Declaration): Remove commented out code. (Freeze_Entity): Remove bypass for object of anonymous array type.	2024-11-18 15:06:55 +01:00
Eric Botcazou	6a7849592d	ada: Small cleanup and refactoring in expansion of asynchronous select The exception handler that catches Abort_Signal does nothing nowadays. This refactors the code to use Build_Abort_Block more consistently and also makes it simpler by dropping the identifier on the abort block. No functional changes. gcc/ada/ChangeLog: * exp_sel.ads (Build_Abort_Block): Remove second parameter and rename the third. (Build_Abort_Block_Handler): Fix description. * exp_sel.adb (Build_Abort_Block): Remove second parameter, rename the third and adjust accordingly. * exp_ch9.adb (Expand_N_Asynchronous_Select): Fix the description of the exception handler throughout. Remove Abort_Block_Ent and Hdle local variables. Call Build_Abort_Block consistently to build the abort block and adjust existing calls.	2024-11-18 15:06:55 +01:00
Steve Baird	0019e8dc76	ada: Array aggregate with large static bounds causes compiler crash In some cases an array aggregate with statically known bounds and at least one bound outside of the range of a 32-bit signed integer causes a bugbox. gcc/ada/ChangeLog: * exp_aggr.adb (Convert_To_Positional.Flatten): Avoid raising Constraint_Error in UI_To_Int by testing UI_Is_In_Int_Range first.	2024-11-18 15:06:54 +01:00
Eric Botcazou	b2320a12df	ada: Cleanup in expansion of array aggregates in object declarations This mainly decouples the handling of the declaration case from that of the assignment case in Expand_Array_Aggregate, as well as moves the expansion in the case of an aggregate that can be processed by the back end to the Build_Array_Aggr_Code routine. gcc/ada/ChangeLog: * exp_aggr.adb (Build_Array_Aggr_Code): Build the simple assignment for the case of an aggregate that can be handled by the back end. (Expand_Array_Aggregate): Adjust description of the processing. Move handling of declaration case to STEP 4 and remove handling of the case of an aggregate that can be processed by the back end. (Late_Expansion): Likewise for the second part. * exp_ch3.adb (Expand_N_Object_Declaration): Deal with a delayed aggregate synthesized for the default initialization, if any. * sem_eval.adb (Eval_Indexed_Component): Bail out for the name of an assignment statement.	2024-11-18 15:06:54 +01:00
Eric Botcazou	7617b83242	ada: Further cleanup in expansion of array aggregates in allocators This mainly decouples the handling of the allocator case from that of the assignment case in Expand_Array_Aggregate and also makes Must_Slide a bit more forgiving. gcc/ada/ChangeLog: * exp_aggr.adb (In_Place_Assign_OK): Remove handling of allocators and call Must_Slide instead of implementing the check manually. (Convert_To_Assignments): Adjust outdated comment. (Expand_Array_Aggregate): Move handling of allocator case to STEP 3 and call Must_Slide directly for it. (Must_Slide): Replace tests based on Is_OK_Static_Expression with tests based on Compile_Time_Known_Value.	2024-11-18 15:06:54 +01:00
Eric Botcazou	856467a7e6	ada: Small cleanup in expansion of array aggregates in allocators Convert_Array_Aggr_In_Allocator does nothing that Late_Expansion cannot do, so this deletes the former and moves its support code for Storage_Model to the latter. No functional changes. gcc/ada/ChangeLog: * exp_aggr.adb (Convert_Array_Aggr_In_Allocator): Delete. (Convert_Aggr_In_Allocator): Do not call above procedure. (Late_Expansion): Deal with a target that is the dereference of a prefix with a Storage_Model. Remove a useless actual parameter in the call to Build_Array_Aggr_Code.	2024-11-18 15:06:54 +01:00
Javier Miranda	7eafe8e9e9	ada: Constraint error not raised in ACATS test c413007 Reverse the meaning of switch -gnatd_P; that is, enable by default the generating of a runtime check when the prefix of the call is an access-to-subprogram type with a null value. gcc/ada/ChangeLog: * sem_res.adb (Resolve_Actuals): Add by default a null-exclusion check on the prefix of the call when it is an access-type; it can be disabled using -gnatd_P. * debug.adb (gnatd_P): Update documentation.	2024-11-18 15:06:54 +01:00
squirek	1850d0dbd3	ada: Crash on 'Access for Stream_Element_Array object This patch fixes a crash in the compiler when the actual for an anonymous access type formal is an 'Access of a Stream_Element_Array object during the calculation of said actual's accessibility level. gcc/ada/ChangeLog: * accessibility.adb (Accessibility_Level): Handle the Input attribute case.	2024-11-18 15:06:54 +01:00
Ronan Desplanques	28a69cb3db	ada: Tweak test for predefined main unit This change is part of an effort to reduce usage of Is_Predefined_Filename. gcc/ada/ChangeLog: * frontend.adb (Frontend): tweak test for predefined main unit.	2024-11-18 15:06:54 +01:00
Tobias Burnus	884637b636	libgomp/plugin/plugin-gcn.c: async-queue init - fix function-return type and fail fatally libgomp/ChangeLog: * plugin/plugin-gcn.c (GOMP_OFFLOAD_openacc_async_construct): In case of an error, call GOMP_PLUGIN_fatal not ..._error; use NULL not false in return.	2024-11-18 14:58:21 +01:00
Jennifer Schmitz	944471eaee	testsuite: Move test pr117093.c into gcc.target/aarch64. The test file pr117093.c failed on platforms other than aarch64, because it uses arm_neon.h. We moved it into gcc.target/aarch64. The patch was bootstrapped and tested on aarch64-linux-gnu and x86_64-linux-gnu, no regression. Committed as obvious. Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com> gcc/testsuite/ PR tree-optimization/117093 * gcc.dg/tree-ssa/pr117093.c: Move to gcc.target/aarch64. * gcc.target/aarch64/pr117093.c: New test.	2024-11-18 13:17:19 +01:00
Robin Dapp	52a392b8b7	RISC-V: Add VLS modes to strided loads. This patch adds VLS modes to the strided load expanders. gcc/ChangeLog: * config/riscv/autovec.md: Add VLS modes. * config/riscv/vector-iterators.md: Ditto. * config/riscv/vector.md: Ditto.	2024-11-18 11:49:25 +01:00
Robin Dapp	b89273a049	RISC-V: Add else operand to masked loads [PR115336]. This patch adds else operands to masked loads. Currently the default else operand predicate just accepts "undefined" (i.e. SCRATCH) values. PR middle-end/115336 PR middle-end/116059 gcc/ChangeLog: * config/riscv/autovec.md: Add else operand. * config/riscv/predicates.md (maskload_else_operand): New predicate. * config/riscv/riscv-v.cc (get_else_operand): Remove static. (expand_load_store): Use get_else_operand and adjust index. (expand_gather_scatter): Ditto. (expand_lanes_load_store): Ditto. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/pr115336.c: New test. * gcc.target/riscv/rvv/autovec/pr116059.c: New test.	2024-11-18 11:48:42 +01:00
Robin Dapp	ebf3077241	i386: Add zero maskload else operand. gcc/ChangeLog: * config/i386/sse.md (maskload<mode><sseintvecmodelower>): Call maskload<mode>..._1. (maskload<mode><sseintvecmodelower>_1): Rename.	2024-11-18 11:48:42 +01:00
Robin Dapp	4a39addb49	gcn: Add else operand to masked loads. This patch adds an undefined else operand to the masked loads. gcc/ChangeLog: * config/gcn/predicates.md (maskload_else_operand): New predicate. * config/gcn/gcn-valu.md: Use new predicate.	2024-11-18 11:48:42 +01:00
Robin Dapp	a166a6ccdc	aarch64: Add masked-load else operands. This adds zero else operands to masked loads and their intrinsics. I needed to adjust more than initially thought because we rely on combine for several instructions and a change in a "base" pattern needs to propagate to all those. gcc/ChangeLog: * config/aarch64/aarch64-sve-builtins-base.cc: Add else handling. * config/aarch64/aarch64-sve-builtins.cc (function_expander::use_contiguous_load_insn): Ditto. * config/aarch64/aarch64-sve-builtins.h: Add else operand to contiguous load. * config/aarch64/aarch64-sve.md (@aarch64_load<SVE_PRED_LOAD:pred_load> _<ANY_EXTEND:optab><SVE_HSDI:mode><SVE_PARTIAL_I:mode>): Split and add else operand. (@aarch64_load_<ANY_EXTEND:optab><SVE_HSDI:mode><SVE_PARTIAL_I:mode>): Ditto. (*aarch64_load_<ANY_EXTEND:optab>_mov<SVE_HSDI:mode><SVE_PARTIAL_I:mode>): Ditto. * config/aarch64/aarch64-sve2.md: Ditto. * config/aarch64/iterators.md: Remove unused iterators. * config/aarch64/predicates.md (aarch64_maskload_else_operand): Add zero else operand.	2024-11-18 11:48:42 +01:00
Robin Dapp	634ae740f5	vect: Add maskload else value support. This patch adds an else operand to vectorized masked load calls. The current implementation adds else-value arguments to the respective target-querying functions that are used to supply the vectorizer with the proper else value. We query the target for its supported else operand and use that for the maskload call. If necessary, i.e. if the mode has padding bits and if the else operand is nonzero, a VEC_COND enforcing a zero else value is emitted. gcc/ChangeLog: * optabs-query.cc (supports_vec_convert_optab_p): Return icode. (get_supported_else_val): Return supported else value for optab's operand at index. (supports_vec_gather_load_p): Add else argument. (supports_vec_scatter_store_p): Ditto. * optabs-query.h (supports_vec_gather_load_p): Ditto. (get_supported_else_val): Ditto. * optabs-tree.cc (target_supports_mask_load_store_p): Ditto. (can_vec_mask_load_store_p): Ditto. (target_supports_len_load_store_p): Ditto. (get_len_load_store_mode): Ditto. * optabs-tree.h (target_supports_mask_load_store_p): Ditto. (can_vec_mask_load_store_p): Ditto. * tree-vect-data-refs.cc (vect_lanes_optab_supported_p): Ditto. (vect_gather_scatter_fn_p): Ditto. (vect_check_gather_scatter): Ditto. (vect_load_lanes_supported): Ditto. * tree-vect-patterns.cc (vect_recog_gather_scatter_pattern): Ditto. * tree-vect-slp.cc (vect_get_operand_map): Adjust indices for else operand. (vect_slp_analyze_node_operations): Skip undefined else operand. * tree-vect-stmts.cc (exist_non_indexing_operands_for_use_p): Add else operand handling. (vect_get_vec_defs_for_operand): Handle undefined else operand. (check_load_store_for_partial_vectors): Add else argument. (vect_truncate_gather_scatter_offset): Ditto. (vect_use_strided_gather_scatters_p): Ditto. (get_group_load_store_type): Ditto. (get_load_store_type): Ditto. (vect_get_mask_load_else): Ditto. (vect_get_else_val_from_tree): Ditto. (vect_build_one_gather_load_call): Add zero else operand. 
(vectorizable_load): Use else operand. * tree-vectorizer.h (vect_gather_scatter_fn_p): Add else argument. (vect_load_lanes_supported): Ditto. (vect_get_mask_load_else): Ditto. (vect_get_else_val_from_tree): Ditto.	2024-11-18 11:48:42 +01:00
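The else-operand semantics described in commit 634ae740f5 can be modelled with a scalar loop. This is a minimal sketch under stated assumptions, not vectorizer code: the function names are hypothetical, lanes are plain bytes, and the second function only imitates the effect of the VEC_COND that the commit says is emitted when a zero else value must be enforced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model of a masked load: active lanes read memory, inactive lanes
   receive the else value and never touch `src`. */
static void masked_load(uint8_t *dst, const uint8_t *src, const bool *mask,
                        size_t n, uint8_t else_val)
{
    assert(dst != NULL && src != NULL && mask != NULL);
    for (size_t i = 0; i < n; i++)
        dst[i] = mask[i] ? src[i] : else_val;
}

/* Model of the follow-up select: force every inactive lane to zero when
   the target's else value is not already zero. */
static void force_zero_else(uint8_t *vec, const bool *mask, size_t n)
{
    assert(vec != NULL && mask != NULL);
    for (size_t i = 0; i < n; i++)
        if (!mask[i])
            vec[i] = 0;
}
```

With an else value of `0xFF` (all bits set), inactive lanes first read back as `0xFF`; after `force_zero_else` they are zero, which is the value the if-conversion pass implicitly assumes.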
Robin Dapp	6b6bd53619	tree-ifcvt: Add zero maskload else value. When predicating a load we implicitly assume that the else value is zero. This matters in case the loaded value is padded (like e.g. a Bool) and we must ensure that the padding bytes are zero on targets that don't implicitly zero inactive elements. A former version of this patch still had this handling in ifcvt but the latest version defers it to the vectorizer. gcc/ChangeLog: * tree-if-conv.cc (predicate_load_or_store): Add zero else operand and comment.	2024-11-18 11:48:41 +01:00
Robin Dapp	8f68d9cb78	ifn: Add else-operand handling. This patch adds else-operand handling to the internal functions. gcc/ChangeLog: * internal-fn.cc (add_mask_and_len_args): Rename... (add_mask_else_and_len_args): ...to this and add else handling. (expand_partial_load_optab_fn): Use adjusted function. (expand_partial_store_optab_fn): Ditto. (expand_scatter_store_optab_fn): Ditto. (expand_gather_load_optab_fn): Ditto. (internal_fn_len_index): Add else handling. (internal_fn_else_index): Ditto. (internal_fn_mask_index): Ditto. (get_supported_else_vals): New function. (supported_else_val_p): New function. (internal_gather_scatter_fn_supported_p): Add else operand. * internal-fn.h (internal_gather_scatter_fn_supported_p): Define else constants. (MASK_LOAD_ELSE_ZERO): Ditto. (MASK_LOAD_ELSE_M1): Ditto. (MASK_LOAD_ELSE_UNDEFINED): Ditto. (get_supported_else_vals): Declare. (supported_else_val_p): Ditto.	2024-11-18 11:48:41 +01:00
Robin Dapp	5214ddb464	docs: Document maskload else operand and behavior. This patch amends the documentation for masked loads (maskload, vec_mask_load_lanes, and mask_gather_load as well as their len counterparts) with an else operand. gcc/ChangeLog: * doc/md.texi: Document masked load else operand.	2024-11-18 11:48:41 +01:00
Tobias Burnus	e7e3d1838f	libgomp/plugin/plugin-nvptx.c: Change false to NULL to fix C23 wrong-return-type error [PR117626] libgomp/ChangeLog: PR libgomp/117626 * plugin/plugin-nvptx.c (nvptx_open_device): Use 'CUDA_CALL_ERET' with 'NULL' as error return instead of 'CUDA_CALL' that returns false.	2024-11-18 11:06:58 +01:00
Andrew Pinski	45a3277149	match: Fix the `max<a,b>==0` pattern for pointers [PR117646] For pointers I forgot that BIT_IOR_EXPR is not valid so when I added the pattern to convert `max<a,b> != 0` (r15-5356), GCC would start ICEing, saying pointer types were not valid for BIT_IOR_EXPR. This fixes the problem by casting to the unsigned type of the inner type. There was another way of fixing this by handling it as `a == 0 & b == 0` but both match and reassociation (for pointers) will then convert it back into the form I am creating here so let's just use that form instead. Bootstrapped and tested on x86_64-linux-gnu. PR tree-optimization/117646 gcc/ChangeLog: * match.pd (`max<a,b>==0`): Add casts to `unsigned type`. gcc/testsuite/ChangeLog: * gcc.dg/torture/minmaxneeqptr-1.c: New test. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>	2024-11-18 00:50:26 -08:00
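The identity behind the `max<a,b>==0` pattern in commit 45a3277149 can be checked in plain C. This is an illustrative sketch with hypothetical function names, not the match.pd rule itself. The pointer variant assumes that a null pointer converts to 0 as a `uintptr_t`, which is implementation-defined but holds on common platforms.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Reference form: the maximum of two unsigned values is zero. */
static bool max_is_zero(uintptr_t a, uintptr_t b)
{
    uintptr_t m = a > b ? a : b;
    return m == 0;
}

/* Rewritten form: both values are zero exactly when their OR is zero. */
static bool or_is_zero(uintptr_t a, uintptr_t b)
{
    return (a | b) == 0;
}

/* Pointer variant: bitwise OR is not defined on pointers, so cast to an
   unsigned integer type first (assumes null converts to 0). */
static bool ptr_or_is_zero(const void *p, const void *q)
{
    return or_is_zero((uintptr_t)p, (uintptr_t)q);
}

/* Exhaustive check of the identity over a small range. */
static void check_small_range(void)
{
    for (uintptr_t a = 0; a < 16; a++)
        for (uintptr_t b = 0; b < 16; b++)
            assert(max_is_zero(a, b) == or_is_zero(a, b));
}
```

The cast in `ptr_or_is_zero` mirrors the commit's fix of casting to an unsigned type before applying the OR.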
Jonathan Wakely	dffc37dead	libstdc++: Fix invalid casts in unordered container merge functions François pointed out that static_cast<__node_ptr>(&_M_before_begin) is invalid, because _M_before_begin is only a node-base not a node. Refactor the new merge overloads to only cast when we know we have a valid node. He also pointed out some optimizations to allow reusing hash codes that might be cached in the node. The _M_src_hash_code function already has the right logic to decide when a cached hash code can be reused by a different _Hashtable object. libstdc++-v3/ChangeLog: * include/bits/hashtable.h (_Hashtable::_M_src_hash_code): Improve comments. (_Hashtable::_M_merge_unique(_Hashtable&)): Use pointer_traits to get before-begin pointer. Only use static_cast on valid nodes, not the before-begin pointer. Reuse a hash code cached in the node when possible. (_Hashtable::_M_merge_multi(_Hashtable&)): Likewise. Reviewed-by: François Dumont <fdumont@gcc.gnu.org>	2024-11-18 08:22:27 +00:00
Jason Merrill	7b8b96a327	libcpp: add .c++-header-unit target The dependency output for header unit modules is based on the absolute pathname of the header file, but that's not something that a makefile can portably refer to. This patch adds a .c++-header-unit target based on the header name relative to an element of the include path. libcpp/ChangeLog: * internal.h (_cpp_get_file_dir): Declare. * files.cc (_cpp_get_file_dir): New fn. * mkdeps.cc (make_write): Use it. gcc/testsuite/ChangeLog: * g++.dg/modules/dep-4.H: New test.	2024-11-18 09:18:17 +01:00
Andrew Pinski	0dc389f21b	testsuite: Fix pr101145inf.c testcases [PR117494] Instead of doing a dg-run with a specific target check for linux, use signal as the effective-target since this requires the use of ALARM signal to do the testing. Also use check_vect in the main and rename main to main1 to make sure we don't use the registers. Tested on x86_64-linux-gnu. PR testsuite/117494 gcc/testsuite/ChangeLog: * gcc.dg/vect/pr101145inf.c: Remove dg-do and replace with dg-require-effective-target of signal. * gcc.dg/vect/pr101145inf_1.c: Likewise. * gcc.dg/vect/pr101145inf.inc: Rename main to main1 and mark as noinline. Include tree-vect.h. Have main call check_vect and main1. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>	2024-11-17 23:44:41 -08:00
Gerald Pfeifer	83e86397b0	libstdc++: Update reference to Angelika Langer's article libstdc++-v3: * doc/xml/manual/allocator.xml: Update reference to Angelika Langer's article. * doc/html/manual/memory.html: Regenerate.	2024-11-18 08:33:49 +01:00
Jeff Law	beec291225	Improve ext-dce's ability to eliminate more extensions I was looking at a regression in ext-dce's behavior just before Cauldron. Essentially a bugfix in ext-dce ended up causing us to fail to eliminate some useless extensions. When we have a SUBREG object with SUBREG_PROMOTED_VAR* flags set, we generally have to be more conservative in how we process bit group liveness, making bits live that wouldn't obviously be live otherwise. That's not always necessary though. For example, if we're storing a promoted subreg into memory, we may not care about those extra live bits on this instance of the subreg object (remember subregs are not shared!). Essentially if the mode of the memory reference is not wider than the mode of the inner REG, then we can clear the promoted state which in turn may allow more extension elimination. So at the start of ext-dce we do a simple pass over the IL and remove promoted subreg state when it's obviously safe to do so (memory stores when the modes allow it). That prevents extra bits from being live and ultimately allows us to remove more useless extensions. The testcase is in theory generic, but many targets won't have an opportunity to optimize this case. So rather than build out a large inclusion/exclusion list, I've just made the test risc-v specific. Bootstrapped and regression tested on aarch64, riscv64, s390x, etc in my tester. gcc/ * ext-dce.cc (maybe_clear_subreg_promoted_p): New function. (ext_dce_execute): Call it. gcc/testsuite * gcc.target/riscv/ext-dce-1.c: New test.	2024-11-17 21:44:50 -07:00
Maciej W. Rozycki	4a8eb5c6d8	Alpha: Remove leftover `;;' for "unaligned_store<mode>" Remove stray `;;' from the middle of the introductory comment for the "unaligned_store<mode>" expander, clearly a leftover from a previous edition. gcc/ * config/alpha/alpha.md (unaligned_store<mode>): Remove stray `;;'.	2024-11-18 03:02:59 +00:00
John David Anglin	29c4f6637c	hppa: Update install documentation 2024-11-17 John David Anglin <danglin@gcc.gnu.org> gcc/ChangeLog: PR target/69374 * doc/install.texi (Specific) <hppa*-hp-hpux11>: Update anchor and heading to reflect removal of 32-bit hppa support on HP-UX. Trim 32-bit related text.	2024-11-17 20:37:53 -05:00
GCC Administrator	24da863403	Daily bump.	2024-11-18 00:17:28 +00:00
Jason Merrill	db348caef9	c++: regenerate opt urls This should have been part of r15-5367. One day I'll remember to do this before buildbot sends me hate mail. gcc/c-family/ChangeLog: * c.opt.urls: Regenerate.	2024-11-17 20:43:43 +01:00
John David Anglin	8f50a07940	hppa: Remove typedef for bool type In C23, bool is now a keyword. So, doing a typedef for it is invalid. 2024-11-17 John David Anglin <danglin@gcc.gnu.org> libgcc/ChangeLog: PR target/117627 * config/pa/linux-atomic.c: Remove typedef for bool type.	2024-11-17 14:42:39 -05:00
Florian Weimer	701d8e7e60	c: Implement -Wdeprecated-non-prototype This warning covers the C23 incompatibilities resulting from using () as parameter lists in function declarations. The warning name comes from Clang. The implementation is not perfect because GCC treats these two declarations as equivalent: void f (); void f (not_a_type); This is a bit confusing because they are clearly visually distinct. However, as of GCC 14, the second form is an error by default, so treating both the same as far as -Wdeprecated-non-prototype does not seem so bad from a user experience view. gcc/c-family/ PR c/95445 * c-opts.cc (c_common_post_options): Initialize warn_deprecated_non_prototype. * c.opt (Wdeprecated-non-prototype): New option. * c.opt.urls: Regenerate. gcc/c/ PR c/95445 * c-decl.cc (start_function): Warn about parameters after parameter-less declaration. * c-typeck.cc (build_function_call_vec): Pass fntype to convert_arguments. (convert_arguments): Change argument to fntype and compute typelist. Warn about parameter list mismatches on first parameter. gcc/ PR c/95445 * doc/invoke.texi: Document -Wdeprecated-non-prototype. gcc/testsuite/ PR c/95445 * gcc.dg/Wdeprecated-non-prototype-1.c: New test. * gcc.dg/Wdeprecated-non-prototype-2.c: New test. * gcc.dg/Wdeprecated-non-prototype-3.c: New test. * gcc.dg/Wdeprecated-non-prototype-4.c: New test.	2024-11-17 19:42:33 +01:00
Jason Merrill	3e89a4d513	c++: -M and modules again While experimenting with testing module std I noticed that gcc -M broke on it; it seems I need to set directives_only even sooner than I did in r15-4219. gcc/c-family/ChangeLog: * c-ppoutput.cc (preprocess_file): Don't set directives_only here. gcc/cp/ChangeLog: * module.cc (module_preprocess_options): Set directives_only here.	2024-11-17 16:23:21 +01:00
Jason Merrill	dbfbd3aa2c	c-family: add -fsearch-include-path The C++ modules code has a -fmodule-header (or -x c++-{user,system}-header) option to specify looking up headers to compile to header units on the usual include paths. I'd like to have the same functionality for full C++20 modules such as module std, which I proposed to live on the include path at bits/std.cc. But this behavior doesn't seem necessarily connected to modules, so I'm proposing a general C/C++ option to specify the behavior of looking in the include path for the input files specified on the command line. Other ideas for the name of the option are very welcome. The libcpp change is to allow -fsearch-include-path{,=user} to find files in the current working directory, like -include. This can be handy for a quick compile of both std.cc and a file that imports it, e.g. g++ -std=c++20 -fmodules -fsearch-include-path bits/std.cc importer.cc gcc/ChangeLog: * doc/cppopts.texi: Document -fsearch-include-path. * doc/invoke.texi: Mention it for modules. gcc/c-family/ChangeLog: * c.opt: Add -fsearch-include-path. * c-opts.cc (c_common_post_options): Handle it. gcc/cp/ChangeLog: * module.cc (module_preprocess_options): Don't override it. libcpp/ChangeLog: * internal.h (search_path_head): Declare. * files.cc (search_path_head): No longer static. * init.cc (cpp_read_main_file): Use it.	2024-11-17 16:23:21 +01:00
Jason Merrill	7db55c0ba1	libstdc++: add module std [PR106852] This patch introduces an installed source form of module std and std.compat. To help a build system find them, we install a libstdc++.modules.json file alongside libstdc++.so, which tells the build system where the files are and any special flags it should use when compiling them (none, in this case). The format is from a proposal in SG15. The build system can find this file with 'gcc -print-file-name=libstdc++.modules.json'. It seems preferable to use a relative path from this file to the sources so that moving the installation doesn't break the reference, but I didn't see any obvious way to compute that without relying on coreutils, perl, or python, so I wrote a POSIX shell script for it. The .. canonicalization bits aren't necessary since I discovered $(abspath), but I guess I might as well leave them in. Currently this installs the sources under $(gxx_include_dir)/bits/, i.e. /usr/include/c++/15/bits. So with my -fsearch-include-path change, std.cc can be compiled with g++ -fsearch-include-path bits/std.cc. Note that if someone actually tries to #include <bits/std.cc> it will fail with "error: module control-line cannot be in included file". Any ideas about a more user-friendly way to express "compile module std" are welcome. The sources currently have the extension .cc, like other source files. std.cc started with m.cencora's implementation in PR114600. I've made some adjustments, but more is probably desirable, e.g. of the <algorithm> handling of namespace ranges, and to remove exports of templates that are only specialized in a particular header. I've filled in a bunch of missing exports, and added some FIXMEs where I noticed bits that are not implemented yet. Since bits/stdc++.h also intends to include the whole standard library, I include it rather than duplicate it. 
But stdc++.h comments out <execution>, due to TBB issues; I include it separately and suppress TBB usage, so module std won't currently provide parallel execution. It seemed most convenient for the two files to be monolithic so we don't need to worry about include paths. So the C library names that module std.compat exports in both namespace std and :: are a block of code that is appended to both files, adjusted based on whether the macro STD_COMPAT is defined before the block. In this implementation std.compat imports std; it would also be valid for it to duplicate everything in std. I see the libc++ std.compat also imports std. As discussed in the PR, module std is supported in C++20 mode even though it was added in C++23. Changes to test module std will follow in a separate patch. In my testing I've noticed a few compiler bugs that break various testcases, so I don't expect to enable module std testing by default at first. PR libstdc++/106852 libstdc++-v3/ChangeLog: * include/bits/version.def: Add __cpp_lib_modules. * include/bits/version.h: Regenerate. * src/c++23/Makefile.am: Add modules std and std.compat. * src/c++23/Makefile.in: Regenerate. * src/c++23/std-clib.cc.in: New file. * src/c++23/std.cc.in: New file. * src/c++23/std.compat.cc.in: New file. * src/c++23/libstdc++.modules.json.in: New file. contrib/ChangeLog: * relpath.sh: New file.	2024-11-17 16:23:21 +01:00


215633 Commits