I've noticed a number of potential problems in hash tables, of three
kinds: insertion of entries that seem empty, dangling insertions, and
lookups during insertions.
These problems may all have the effect of replacing a deleted entry
with one that seems empty, which may disconnect double-hashing chains
involving that entry, and thus cause entries to go missing.
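To illustrate why an entry that merely looks empty is worse than one properly
marked deleted, here is a self-contained toy (it uses linear probing rather
than GCC's double hashing, but the failure mode is the same):

  // Toy open-addressing table, not GCC's hash-table.h: a lookup stops at an
  // EMPTY slot, so turning a DELETED slot into an EMPTY-looking one
  // disconnects the probe chain and makes later entries unfindable.
  #include <cassert>
  #include <cstddef>

  enum slot_state { EMPTY, DELETED, OCCUPIED };

  struct toy_table
  {
    static const std::size_t N = 8;
    int keys[N];
    slot_state state[N];

    toy_table () { for (std::size_t i = 0; i < N; i++) state[i] = EMPTY; }

    void insert (int key)
    {
      for (std::size_t i = 0; i < N; i++)
        {
          std::size_t s = (key + i) % N;
          if (state[s] != OCCUPIED)
            { keys[s] = key; state[s] = OCCUPIED; return; }
        }
    }

    bool lookup (int key) const
    {
      for (std::size_t i = 0; i < N; i++)
        {
          std::size_t s = (key + i) % N;
          if (state[s] == EMPTY)
            return false;                      // chain ends here
          if (state[s] == OCCUPIED && keys[s] == key)
            return true;
          // DELETED: keep probing past it.
        }
      return false;
    }
  };

  int main ()
  {
    toy_table t;
    t.insert (0);              // occupies slot 0
    t.insert (8);              // collides with 0, ends up in slot 1
    t.state[0] = DELETED;      // proper removal: 8 is still reachable
    assert (t.lookup (8));
    t.state[0] = EMPTY;        // entry that "seems empty": chain is cut
    assert (!t.lookup (8));    // 8 is still stored but can't be found
    return 0;
  }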
This patch detects such problems by recording a pending insertion and
checking that it's completed before other potentially-conflicting
operations. The additional field is only introduced when checking is
enabled.
for gcc/ChangeLog
* hash-table.h (check_complete_insertion, check_insert_slot):
New hash_table methods.
(m_inserting_slot): New hash_table field.
(begin, hash_table ctors, ~hash_table): Check previous insert.
(expand, empty_slow, clear_slot, find_with_hash): Likewise.
(remove_elt_with_hash, traverse_noresize): Likewise.
(gt_pch_nx): Likewise.
(find_slot_with_hash): Likewise. Record requested insert.
This adds tests from bugzilla for PR103770 and its duplicates.
for gcc/testsuite/ChangeLog
* gcc.dg/pr103770.c: New test.
* gcc.dg/pr103859.c: New test.
* gcc.dg/pr105065.c: New test.
In the M-Class Arm ARM (Architecture Reference Manual):
https://developer.arm.com/documentation/ddi0553/bu/?lang=en
these MVE instructions only have a '!' writeback variant and at:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107714
we found that the Um constraint would also allow through a
register offset writeback, resulting in an assembler error.
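As a hypothetical illustration (not the new test case), source along these
lines, where the pointer advances by a runtime stride rather than the fixed
data size, is the sort of code where an unsupported writeback address could
end up being formed:

  // Hypothetical sketch only: vst2 supports only '!' (POST_INC by the data
  // size) writeback, so a register-offset writeback address formed for the
  // variable stride below must not be matched by the memory constraint.
  #include <arm_mve.h>

  void
  store_pairs (int32_t *p, int stride, int32x4x2_t v, int n)
  {
    for (int i = 0; i < n; i++)
      {
        vst2q_s32 (p, v);
        p += stride;
      }
  }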
Here I have added a new constraint and predicate for these
instructions, which (uniquely, AFAICT) only support a `!` writeback
increment by the data size (inside the compiler this is a POST_INC).
No regressions in arm-none-eabi with MVE and MVE.FP.
gcc/ChangeLog:
PR target/107714
* config/arm/arm-protos.h (mve_struct_mem_operand): New prototype.
* config/arm/arm.cc (mve_struct_mem_operand): New function.
* config/arm/constraints.md (Ug): New constraint.
* config/arm/mve.md (mve_vst4q<mode>): Change constraint.
(mve_vst2q<mode>): Likewise.
(mve_vld4q<mode>): Likewise.
(mve_vld2q<mode>): Likewise.
* config/arm/predicates.md (mve_struct_operand): New predicate.
gcc/testsuite/ChangeLog:
PR target/107714
* gcc.target/arm/mve/intrinsics/vldst24q_reg_offset.c: New test.
Update test cases with error messages that changed as a result.
gcc/fortran/ChangeLog:
PR fortran/102595
* decl.cc (attr_decl1): Guard against NULL pointer.
* parse.cc (match_deferred_characteristics): Include BT_CLASS in check for
derived being undefined.
gcc/testsuite/ChangeLog:
PR fortran/102595
* gfortran.dg/class_result_4.f90: Update error message check.
* gfortran.dg/pr85779_3.f90: Update error message check.
Just like the recently-added checks for empty entries, add checks for
deleted entries as well. This didn't catch any problems, but it might
prevent future accidents. Suggested by David Malcolm.
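As a rough, self-contained sketch of the shape of the added check (the
sentinel values here are stand-ins, not GCC's real per-type hash traits):

  #include <cassert>

  // Stand-in sentinels; GCC's hash traits define these per entry type.
  const int EMPTY_KEY = 0;
  const int DELETED_KEY = -1;

  // The kind of assertion added after an insertion: the stored key must not
  // look like either sentinel, or later lookups would misread the slot.
  void
  check_inserted (int stored_key)
  {
    assert (stored_key != EMPTY_KEY);
    assert (stored_key != DELETED_KEY);
  }

  int main () { check_inserted (42); return 0; }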
for gcc/ChangeLog
* hash-map.h (put, get_or_insert): Check that added entry
doesn't look deleted either.
* hash-set.h (add): Likewise.
In take_address_of, we may refrain from completing a decl_address
INSERT if gsi is NULL, so don't even ask for an INSERT in this case.
for gcc/ChangeLog
* tree-parloops.cc (take_address_of): Skip INSERT if !gsi.
Check, after inserting entries, that they don't look empty.
for gcc/ChangeLog
* hash-map.h (put, get_or_insert): Check that entry does not
look empty after insertion.
Check, after adding a key to a hash set, that the entry does not look
empty.
for gcc/ChangeLog
* hash-set.h (add): Check that the inserted entry does not
look empty.
When decl is NULL, don't record its mapping in the
decl_to_instance_map.
for gcc/ada/ChangeLog
* gcc-interface/trans.cc (Sloc_to_locus): Don't map NULL decl.
When adding a catch-all partition, we map NULL to it. That mapping is
ineffective and unnecessary. Drop it.
for gcc/lto/ChangeLog
* lto-partition.cc (lto_1_to_1_map): Drop NULL partition
mapping.
cxx_eval_call_expression requests an INSERT even in cases when it
would later decide not to insert. This could break double-hashing
chains. Arrange for it to use NO_INSERT when the insertion would not
be completed.
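The pattern being adopted looks roughly like the following self-contained
sketch, with std::unordered_map standing in for GCC's hash_table and its
NO_INSERT/INSERT lookup modes:

  #include <string>
  #include <unordered_map>

  static std::unordered_map<std::string, int> cache;

  // Illustrative only: probe read-only first; only create an entry once we
  // know we will fill it, so no half-inserted slot is ever left behind.
  int
  evaluate (const std::string &key, bool cacheable, int computed)
  {
    auto it = cache.find (key);          // analogous to a NO_INSERT lookup
    if (it != cache.end ())
      return it->second;

    if (!cacheable)
      return computed;                   // would have abandoned the slot

    cache.emplace (key, computed);       // analogous to INSERT, completed
    return computed;                     // immediately with a real value
  }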
for gcc/cp/ChangeLog
* constexpr.cc (cxx_eval_call_expression): Do not request an
INSERT that would not be completed.
Insertion of a tm_restart_node in tm_restart failed to record the
newly-allocated node in the hash table.
for gcc/ChangeLog
* trans-mem.cc (split_bb_make_tm_edge): Record new node in
tm_restart.
lookup_expr_in_table is not used for insertions, but it mistakenly
used INSERT rather than NO_INSERT.
for gcc/ChangeLog
* postreload-gcse.cc (lookup_expr_in_table): Use NO_INSERT.
If a result doesn't have a default def, don't attempt to remap it.
for gcc/ChangeLog
* tree-inline.cc (declare_return_variable): Don't remap NULL
default def of result.
When a TREE_OPERAND is NULL, do not cache it.
for gcc/ChangeLog
* tree-ssa-loop-niter.cc (expand_simple_operations): Refrain
from caching NULL TREE_OPERANDs.
Use NO_INSERT to test whether inserting should be attempted.
for gcc/cp/ChangeLog
* constraint.cc (normalize_concept_check): Use NO_INSERT for
pre-insertion check.
Avoid adding NULL vnodes to referenced tables.
for gcc/ChangeLog
* varpool.cc (symbol_table::remove_unreferenced_decls): Do not
add NULL vnodes to referenced table.
Avoid hash table lookups between requesting an insert and storing the
inserted value in avail_exprs_stack. Lookups before the insert is
completed could fail to find double-hashed elements.
for gcc/ChangeLog
* tree-ssa-scopedtables.cc
(avail_exprs_stack::lookup_avail_expr): Finish hash table
insertion before further lookups.
This is perhaps not an actual problem, but the check is added for safety.
gcc/ChangeLog:
* config/xtensa/xtensa.cc (xtensa_expand_prologue): Fix to check
DF availability before use of DF_* macros.
The middle-end doesn't have a preferred canonical form for expressing
zero-extension: it sometimes uses an AND, sometimes pairs of SHIFTs,
and sometimes a zero_extend. Pending changes to RTL simplification
may alter some of these representations, so a few additional
patterns are required to recognize these alternate representations
and avoid any testsuite regressions.
As an example, *popcountsi2_zext is currently represented as:
  [(set (match_operand:DI 0 "register_operand" "=r")
        (and:DI
          (subreg:DI
            (popcount:SI
              (match_operand:SI 1 "nonimmediate_operand" "rm")) 0)
          (const_int 63)))
   (clobber (reg:CC FLAGS_REG))]
This patch adds an alternate/equivalent pattern that matches:
  [(set (match_operand:DI 0 "register_operand" "=r")
        (zero_extend:DI
          (popcount:SI (match_operand:SI 1 "nonimmediate_operand" "rm"))))
   (clobber (reg:CC FLAGS_REG))]
Another example is *popcounthi2 which is currently represented as:
  [(set (match_operand:SI 0 "register_operand")
        (popcount:SI
          (zero_extend:SI (match_operand:HI 1 "nonimmediate_operand"))))
   (clobber (reg:CC FLAGS_REG))]
This patch adds an alternate/equivalent pattern that matches:
  [(set (match_operand:SI 0 "register_operand")
        (zero_extend:SI
          (popcount:HI (match_operand:HI 1 "nonimmediate_operand"))))
   (clobber (reg:CC FLAGS_REG))]
The contents of the machine description definitions remain the same;
it's just that the expected RTL is slightly different but equivalent.
Providing both forms makes the backend more robust to middle-end
changes [and possibly catches some missed optimizations].
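For reference, a function of roughly the following shape (illustrative only,
not taken from the new tests; it needs a popcount-capable target, e.g. x86_64
with -mpopcnt) is the kind of source that produces an SImode popcount whose
result is zero-extended to DImode, and can reach the backend in either of the
forms above:

  // Illustrative example, not from the testsuite.
  unsigned long long
  popcount_zext (unsigned int x)
  {
    return (unsigned long long) __builtin_popcount (x);
  }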
2022-12-28 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* config/i386/i386.md (*clzsi2_lzcnt_zext_2): New define_insn_and_split
to match the ZERO_EXTEND form of *clzsi2_lzcnt_zext.
(*clzsi2_lzcnt_zext_2_falsedep): Likewise, new define_insn to match
ZERO_EXTEND form of *clzsi2_lzcnt_zext_falsedep.
(*bmi2_bzhi_zero_extendsidi_5): Likewise, new define_insn to match
ZERO_EXTEND form of *bmi2_bzhi_zero_extendsidi.
(*popcountsi2_zext_2): Likewise, new define_insn_and_split to match
ZERO_EXTEND form of *popcountsi2_zext.
(*popcountsi2_zext_2_falsedep): Likewise, new define_insn to match
ZERO_EXTEND form of *popcountsi2_zext_falsedep.
(*popcounthi2_2): Likewise, new define_insn_and_split to match
ZERO_EXTEND form of *popcounthi2.
(define_peephole2): New ZERO_EXTEND variant of the HImode
popcount&1-using-parity-flag peephole2.
This patch is a one line change, to call ix86_expand_clear instead of
emit_move_insn with const0_rtx in ix86_split_ashl, allowing the backend
to use an xor instruction to clear a register if appropriate.
The effect is demonstrated with the following function.
__int128 foo(__int128 x, unsigned long long b) {
return ((__int128)b << 72) + x;
}
Previously, with -O2, GCC would generate:
foo: movl $0, %eax
salq $8, %rdx
addq %rdi, %rax
adcq %rsi, %rdx
ret
With this patch, it now generates:
foo: xorl %eax, %eax
salq $8, %rdx
addq %rdi, %rax
adcq %rsi, %rdx
ret
2022-12-28 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* config/i386/i386-expand.cc (ix86_split_ashl): Call
ix86_expand_clear to generate an xor instruction.
gcc/testsuite/ChangeLog
* gcc.target/i386/ashlti3-1.c: New test case.
This patch restructures the loop over the GP registers
which saves/restores them as part of the prologue/epilogue.
No functional change is intended by this patch, but it
offers the possibility to use load-pair/store-pair instructions.
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_next_saved_reg): New function.
(riscv_is_eh_return_data_register): New function.
(riscv_for_each_saved_reg): Restructure loop.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
The comment above the enumeration of existing attributes got out of
order and a few entries were missing.
This patch synchronizes the comment with the code.
This commit does not include any functional change.
gcc/ChangeLog:
* config/riscv/riscv.md: Sync comments with code.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
Newer binutils uses all caps, where it was all lower case
previously, so make the configure test case-insensitive. Pushed as obvious.
This should resolve:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100383
gcc/
* configure.ac: Use grep -i for case-insensitive test.
* configure: Regenerate.
Signed-off-by: Jonathan Yong <10walls@gmail.com>
This improves the readability of RTL dumps. No functional changes.
gcc/
* config/xtensa/xtensa.md (unspec): Extract UNSPEC_* constants
into this enum.
(unspecv): Extract UNSPECV_* constants into this enum.
gcc/ChangeLog:
* config/xtensa/xtensa.cc (xtensa_expand_prologue): Modify to
exit the inspection loops as soon as the need for the stack
pointer is determined.
Like d0bbecb1c4, we add a wrapper to
prevent it from pulling in stdint.h from the standard C library.
gcc/testsuite:
* gcc.target/riscv/rvv/vsetvl/riscv_vector.h: New.
This patch fixes an issue of visiting a non-existing block of the CFG.
Since the block indices of a CFG in GCC are not always contiguous, we could
potentially visit a gap block which does not exist in the current CFG.
This patch avoids visiting such non-existing blocks.
I noticed this issue in my internal regression testing of the current
testsuite when I changed the X86 server machine. This patch fixes it:
17:27:15 job(build_and_test_rv32): Increased FAIL List:
17:27:15 job(build_and_test_rv32): FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-46.c
-O2 -flto -fno-use-linker-plugin -flto-partition=none (internal compiler error: Segmentation fault)
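The underlying pitfall can be sketched in a self-contained way (stand-in
types, not the riscv-vsetvl code):

  // Basic-block indices can have gaps, so walking every index in order can
  // dereference a hole; walking only the blocks that exist avoids that.
  #include <cstdio>
  #include <vector>

  struct block { int index; };

  int
  main ()
  {
    // Blocks 0, 1 and 3 exist; block 2 was removed, leaving a gap.
    block b0 = {0}, b1 = {1}, b3 = {3};
    std::vector<block *> by_index = { &b0, &b1, nullptr, &b3 };

    // Buggy pattern: assumes indices are contiguous; crashes at index 2.
    //   for (int i = 0; i <= 3; i++) use (by_index[i]);

    // Safe pattern: skip the holes (GCC's FOR_EACH_BB_FN walks only real
    // blocks).
    for (block *b : by_index)
      if (b)
        std::printf ("visiting block %d\n", b->index);
    return 0;
  }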
gcc/ChangeLog:
* config/riscv/riscv-vsetvl.cc
(pass_vsetvl::compute_global_backward_infos): Only visit blocks
that exist in the CFG.
(pass_vsetvl::prune_expressions): Ditto.
PR106680 shows that -m32 -mpowerpc64 behaves differently from
-mpowerpc64 -m32; this is determined by how we
handle option powerpc64 in rs6000_handle_option.
Segher pointed out that this difference should be taken as
a bug and that we should ensure option powerpc64 is
independent of -m32/-m64. So this patch removes the
handling in rs6000_handle_option and adds the necessary
support in rs6000_option_override_internal instead.
With this patch, if users specify -m{no-,}powerpc64, the
specified value is honoured; otherwise, for 64-bit it
always enables OPTION_MASK_POWERPC64, while for 32-bit
with TARGET_POWERPC64 and OS_MISSING_POWERPC64 it disables
OPTION_MASK_POWERPC64.
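The decision logic just described can be restated as the following
self-contained sketch; all names are stand-ins rather than the actual
rs6000 flags or functions:

  #include <cstdio>

  bool
  decide_powerpc64 (bool user_specified, bool user_value, bool is_64bit,
                    bool implicit_value, bool os_missing_powerpc64)
  {
    if (user_specified)
      return user_value;         // -m{no-,}powerpc64 is always honoured
    if (is_64bit)
      return true;               // 64-bit always enables it
    if (implicit_value && os_missing_powerpc64)
      return false;              // 32-bit, only implicitly enabled, and the
                                 // OS cannot support it: disable it again
    return implicit_value;       // otherwise keep the TARGET_DEFAULT or
                                 // cpu-mask value
  }

  int
  main ()
  {
    // -m64 without an explicit powerpc64 option: enabled.
    std::printf ("%d\n", decide_powerpc64 (false, false, true, false, false));
    // -m32 on an OS missing powerpc64, flag set only via the cpu mask:
    // disabled.
    std::printf ("%d\n", decide_powerpc64 (false, false, false, true, true));
    return 0;
  }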
By the way, following Segher's suggestion, I tried warning
when OPTION_MASK_POWERPC64 is set for OS_MISSING_POWERPC64.
If we warn for the case where powerpc64 is specified explicitly,
some test cases using -m32 -mpowerpc64 on ppc64-linux
need updates, and the artificial run
with "--target_board=unix'{-m32/-mpowerpc64}'" produces
noisy warnings on ppc64-linux. If we warn for the case where
it's specified implicitly, the flag can simply have been initialized
from TARGET_DEFAULT (like -m32 on ppc64-linux) or set from the
given cpu mask, so we would have to special-case those and not warn.
Following Segher's latest comment, I decided not to warn at all and
keep the behaviour consistent with before.
Bootstrapped and regress-tested on:
- powerpc64-linux-gnu P7 and P8 {-m64,-m32}
- powerpc64le-linux-gnu P9 and P10
- powerpc-ibm-aix7.2.0.0 {-maix64,-maix32}
- powerpc-darwin9 (with Iain's help)
PR target/106680
gcc/ChangeLog:
* common/config/rs6000/rs6000-common.cc (rs6000_handle_option): Remove
the adjustment for option powerpc64 in -m64 handling, and remove the
whole -m32 handling.
* config/rs6000/rs6000.cc (rs6000_option_override_internal): When no
explicit powerpc64 option is provided, enable it for -m64. For 32 bit
and OS_MISSING_POWERPC64, disable powerpc64 if it's enabled but not
specified explicitly.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/pr106680-1.c: New test.
* gcc.target/powerpc/pr106680-2.c: New test.
* gcc.target/powerpc/pr106680-3.c: New test.
* gcc.target/powerpc/pr106680-4.c: New test.
2022-12-27 Kewen Lin <linkw@linux.ibm.com>
Iain Sandoe <iain@sandoe.co.uk>
  if (mdaz-ftz)
    link crtfastmath.o
  else if ((Ofast || ffast-math || funsafe-math-optimizations)
           && !shared && !mno-daz-ftz)
    link crtfastmath.o
  else
    Don't link crtfastmath.o
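For context, the effect of linking crtfastmath.o on x86 is roughly the
following startup-time tweak (a sketch of the effect, not the actual
crtfastmath source):

  // Set the FTZ and DAZ bits in MXCSR so denormals are flushed to, and
  // treated as, zero.
  #include <xmmintrin.h>

  static void
  enable_daz_ftz (void)
  {
    _MM_SET_FLUSH_ZERO_MODE (_MM_FLUSH_ZERO_ON);   // FTZ (bit 15)
    _mm_setcsr (_mm_getcsr () | 0x0040);           // DAZ (bit 6)
  }

  int
  main ()
  {
    enable_daz_ftz ();
    return 0;
  }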
gcc/ChangeLog:
PR target/55522
PR target/36821
* config/i386/gnu-user-common.h (GNU_USER_TARGET_MATHFILE_SPEC):
Link crtfastmath.o whenever -mdaz-ftz is specified. Don't link
crtfastmath.o when -shared or -mno-daz-ftz is specified.
* config/i386/i386.opt (mdaz-ftz): New option.
* doc/invoke.texi (x86 options): Document -mdaz-ftz.
This patch tweaks the x86 backend to use the movss and movsd instructions
to perform some vector permutations on integer vectors (V4SI and V2DI) in
the same way they are used for floating point vectors (V4SF and V2DF).
As a motivating example, consider:
typedef unsigned int v4si __attribute__((vector_size(16)));
typedef float v4sf __attribute__((vector_size(16)));
v4si foo(v4si x,v4si y) { return (v4si){y[0],x[1],x[2],x[3]}; }
v4sf bar(v4sf x,v4sf y) { return (v4sf){y[0],x[1],x[2],x[3]}; }
which is currently compiled with -O2 to:
foo: movdqa %xmm0, %xmm2
shufps $80, %xmm0, %xmm1
movdqa %xmm1, %xmm0
shufps $232, %xmm2, %xmm0
ret
bar: movss %xmm1, %xmm0
ret
With this patch, both functions compile to the same form.
Likewise for the V2DI case:
typedef unsigned long v2di __attribute__((vector_size(16)));
typedef double v2df __attribute__((vector_size(16)));
v2di foo(v2di x,v2di y) { return (v2di){y[0],x[1]}; }
v2df bar(v2df x,v2df y) { return (v2df){y[0],x[1]}; }
which currently generates:
foo: shufpd $2, %xmm0, %xmm1
movdqa %xmm1, %xmm0
ret
bar: movsd %xmm1, %xmm0
ret
2022-12-25 Roger Sayle <roger@nextmovesoftware.com>
Uroš Bizjak <ubizjak@gmail.com>
gcc/ChangeLog
* config/i386/i386-builtin.def (__builtin_ia32_movss): Update
CODE_FOR_sse_movss to CODE_FOR_sse_movss_v4sf.
(__builtin_ia32_movsd): Likewise, update CODE_FOR_sse2_movsd to
CODE_FOR_sse2_movsd_v2df.
* config/i386/i386-expand.cc (split_convert_uns_si_sse): Update
gen_sse_movss call to gen_sse_movss_v4sf, and gen_sse2_movsd call
to gen_sse2_movsd_v2df.
(expand_vec_perm_movs): Also allow V4SImode with TARGET_SSE and
V2DImode with TARGET_SSE2.
* config/i386/sse.md
(avx512fp16_fcmaddcsh_v8hf_mask3<round_expand_name>): Update
gen_sse_movss call to gen_sse_movss_v4sf.
(avx512fp16_fmaddcsh_v8hf_mask3<round_expand_name>): Likewise.
(sse_movss_<mode>): Renamed from sse_movss using VI4F_128 mode
iterator to handle both V4SF and V4SI.
(sse2_movsd_<mode>): Likewise, renamed from sse2_movsd using
VI8F_128 mode iterator to handle both V2DF and V2DI.
gcc/testsuite/ChangeLog
* gcc.target/i386/sse-movss-4.c: New test case.
* gcc.target/i386/sse2-movsd-3.c: New test case.