In this patch, we will support AVX10.2 convert instructions. All
of them are new instruction forms.
Among all the instructions, vcvtbiasph2[b,h]f8[,s] needs extra care.
Since Operand 2 could indicate memory size, we do not need suffix
under ATTmode. However, we could not fold all three templates but only
XMM/YMM since the dst operand size are the same for them. Also, a new
iterator <cvt8> is added to reduce redundancy.
gas/
* testsuite/gas/i386/i386.exp: Add AVX10.2 tests.
* testsuite/gas/i386/x86-64.exp: Ditto.
* testsuite/gas/i386/avx10_2-256-cvt-intel.d: New.
* testsuite/gas/i386/avx10_2-256-cvt.d: Ditto.
* testsuite/gas/i386/avx10_2-256-cvt.s: Ditto.
* testsuite/gas/i386/avx10_2-512-cvt-intel.d: Ditto.
* testsuite/gas/i386/avx10_2-512-cvt.d: Ditto.
* testsuite/gas/i386/avx10_2-512-cvt.s: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-256-cvt-intel.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-256-cvt.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-256-cvt.s: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-512-cvt-intel.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-512-cvt.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-512-cvt.s: Ditto.
opcodes/
* i386-dis-evex-prefix.h: Add PREFIX_EVEX_0F3874,
PREFIX_EVEX_MAP5_18, PREFIX_EVEX_MAP5_1B,
PREFIX_EVEX_MAP5_1E and PREFIX_EVEX_MAP5_74.
* i386-dis-evex.h: Add table pass for AVX10.2
instructions.
* i386-dis.c (MOD_EVEX_0F38B1): New.
(PREFIX_EVEX_0F3874): Ditto.
(PREFIX_EVEX_MAP5_18): Ditto.
(PREFIX_EVEX_MAP5_1B): Ditto.
(PREFIX_EVEX_MAP5_1E): Ditto.
(PREFIX_EVEX_MAP5_74): Ditto.
* i386-opc.tbl: Add AVX10.2 instructions.
* i386-mnem.h: Regenerated.
* i386-tbl.h: Ditto.
Co-authored-by: Kong Lingling <lingling.kong@intel.com>
Co-authored-by: Haochen Jiang <haochen.jiang@intel.com>
The enum BFD_RELOC_[32/64] was mistakenly used in the macro instead
of the relocation in fixp. This can cause the second relocation
of a pair to be deleted when -mthin-add-sub is enabled. Apply the
correct macro to fix this.
Also sets the initial value of -mthin-add-sub.
While for executables properly aligning sections within the file can be
quite relevant, the same is of pretty little importance for relocatable
object files. Avoid passing "true" into
_bfd_elf_assign_file_position_for_section() when dealing with object
files, but compensate minimally by applying log_file_align in such
cases as a cap to the alignment put in place.
In disassembler part, for vnni instructions, we extended previous
VEX part using %XE in disassembler to promote them to EVEX by reusing
the original VEX table. For vmpsadbw, we will also use %XE. However,
it is hard to reuse the VEX table, so we are using new ones.
In assmbler part, we put the vnni table entries with previous vnni
instructions since they are just promotion from AVX-VNNI-INT{8,16}.
Since we will prefer VEX encoding, we need to use the different table
order in template <vnni>, which prefers EVEX due to earlier introduction
for AVX512_VNNI than AVX_VNNI. This means a new <vnni>. For vdpphps
and vmpsadbw, we put them at the end of the table, with future AVX10.2
instructions.
Nit: I will remove the arch requirement for avx_vnni_int{8,16} in
evex-promote testcases after AVX10.2 implies AVX-VNNI-INT{8,16}.
gas/Changelog:
* testsuite/gas/i386/i386.exp: Add AVX10.2 tests.
* testsuite/gas/i386/x86-64.exp: Ditto.
* testsuite/gas/i386/avx10_2-256-1-intel.d: New.
* testsuite/gas/i386/avx10_2-256-1.d: Ditto.
* testsuite/gas/i386/avx10_2-256-1.s: Ditto.
* testsuite/gas/i386/avx10_2-512-1-intel.d: Ditto.
* testsuite/gas/i386/avx10_2-512-1.d: Ditto.
* testsuite/gas/i386/avx10_2-512-1.s: Ditto.
* testsuite/gas/i386/avx10_2-promote.d: Ditto.
* testsuite/gas/i386/avx10_2-promote.s: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-256-1-intel.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-256-1.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-256-1.s: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-512-1-intel.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-512-1.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-512-1.s: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-promote.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-promote.s: Ditto.
opcodes/Changelog:
* i386-dis-evex-prefix.h: Adjust PREFIX_EVEX_0F3852.
Add PREFIX_EVEX_0F3A42_W_0.
* i386-dis-evex-w.h: Adjust EVEX_W_0F3A42.
* i386-dis-evex.h: Add table pass for AVX10.2
instructions.
* i386-dis.c: Adjust PREFIX_VEX_0F3850_W_0, PREFIX_VEX_0F3851_W_0,
PREFIX_VEX_0F38D2_W_0 and PREFIX_VEX_0F38D3_W_0.
* i386-opc.tbl: Add AVX10.2 instructions.
* i386-mnem.h: Regenerated.
* i386-tbl.h: Ditto.
Co-authored-by: Lili Cui <lili.cui@intel.com>
With the removal of emulations, OBJ_MAYBE_... can no longer be defined.
Tidy code wherever they're used, which also includes the dropping of
most IS_ELF and uses and checks of OUTPUT_FLAVOR.
Where touching such constructs anyway, also drop TE_PEP checks when used
together with TE_PE ones (the former implies the latter).
Both ELF and COFF have various sub-flavors, each of which would then
require its own emulation: Right now when configuring a COFF/PE
secondary target (with perhaps an ELF primary one), one gets plain COFF
emulation rather than COFF/PE one.
As such a multitude of emulations would be unwieldy (and likely fragile)
drop gas emulations altogether instead.
While originally this was in preparation of a subsequent change making
SUPPORT_FRAME_LINKONCE potentially dependent on a global variable, the
construct appears unlikely to have been correct in the first place: The
variable would have been passed reliably uninitialized when
SUPPORT_FRAME_LINKONCE is build-time true.
While there correct indentation of the parameters passed to
get_cfi_seg().
The individual struct emulation instances shouldn't be declared in a .c
file; it and the producers of the symbols want to both see the
declarations, so declarations and definitions don't go out of sync. Move
these declarations to emul.h.
While there also adjust the conditional around this_format: That symbol
is never #define-d anywhere, and it's needed only when USE_EMULATIONS is
defined. (Really, when obj-multi isn't in use, it also is effectively
only ever written to.)
With listings enabled, gas keeps a small cache of source lines. They
are stored in buffers of size LISTING_RHS_WIDTH, ie. 100. Given
listing-rhs-width larger than 100 it is of course possible to overflow
the buffer. Fix that by allocating as needed. We could allocate all
buffers on the first call to print_source using listing_rhs_width, but
I chose not to do that in case some future assembly directive allows
changes to listing_rhs_width similarly to the way paper_width can
change during assembly.
Commits 8015b1b0c1 ("x86-64: Never make R_X86_64_GOT64 section
relative"), d774bf9b36 ("x86: Add tls check in gas"), and
1b714c14e4 ("x86: Turn PLT32 to PC32 only for PC-relative
relocations") all should have adjusted the Solaris counterpart of the
reloc64 test as well.
Odd data padding has a $d label inserted at its beginning. When a $x...
label is removed instead, a replacement is inserted after the padding.
The same, however, needs to also happen when there's no $x to replace.
.insn or data emitted inside text sections can lead to positions not
being at insn granularity. In such situations using alignment
directives should reliably enforce the requested alignment.
Specifically requests to align back to insn granularity may not be
ignored (where, as a subcase thereof, the ordering of ".option norvc"
and e.g. ".p2align 2" should not matter; so far the alignment directive
needs to come first to have any effect). Similarly ahead of emitting
NOPs alignment first needs to be forced back to insn granularity.
The new testcases actually point out a corner case issue in the
disassembler as well, which is being corrected at the same time: We
don't want to print "0x" without any subsequent digits.
Using hard byte code is not a good idea in dump file. Add a label
for intel syntax test check to avoid that.
gas/ChangeLog:
* testsuite/gas/i386/avx10_2-rounding-intel.d: Use label for
test split.
* testsuite/gas/i386/avx10_2-rounding.s: Add label to avoid
hard coding in dump file.
Some configurations (eg. i386-bsd, i386-msdos) broke with the addition
of the TLS relocation checking. The "x86_elf_abi undeclared" error
has been fixed, but "gotrel defined but not used" remains. Fix that.
Also invert the preprocessor test around lex_got to make it positive
logic and remove the LEX_AT condition which is no longer necessary.
(The only x86 config files defining LEX_AT also define TE_PE.)
Since TLS relocation check is ELF specific, enable it only for ELF.
PR gas/32022
* config/tc-i386.c (x86_tls_error_type): Define only if
OBJ_MAYBE_ELF or OBJ_ELF is defined.
(x86_check_tls_relocation): Likewise.
(x86_report_tls_error): Likewise.
(i386_assemble): Check TLS relocations only if OBJ_MAYBE_ELF or
OBJ_ELF is defined.
(md_show_usage): Output -mtls-check= only if OBJ_MAYBE_ELF or
OBJ_ELF is defined.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
commit 292676c15a
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Thu Feb 13 13:44:17 2020 -0800
x86: Resolve PLT32 reloc aganst local symbol to section
resolved PLT32 relocation against local symbol to section and
commit 2585b7a5ce
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Sun Jul 19 06:51:19 2020 -0700
x86: Change PLT32 reloc against section to PC32
turned PLT32 relocation against section into PC32 relocation. But these
transformations are valid only for PC-relative relocations. Add fx_pcrel
check for PC-relative relocations when performing these transformations
to keep PLT32 relocation in `movq $foo@PLT, %rax`.
gas/
PR gas/32196
* config/tc-i386.c (tc_i386_fix_adjustable): Return fixP->fx_pcrel
for PLT32 relocations.
(i386_validate_fix): Turn PLT32 relocation into PC32 relocation
only if fixp->fx_pcrel is set.
* testsuite/gas/i386/reloc32.d: Updated.
* testsuite/gas/i386/reloc64.d: Likewise.
* testsuite/gas/i386/reloc32.s: Add PR gas/32196 test.
* testsuite/gas/i386/reloc64.s: Likewise.
ld/
PR gas/32196
* testsuite/ld-x86-64/plt3.s: New file.
* testsuite/ld-x86-64/x86-64.exp: Run plt3.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Assembler shouldn't accept invalid TLS instructions, TLS relocations
can only be used with specific instructions as specified in TLS psABI
and linker issues an error when TLS relocations are used with wrong
instructions or format. Since it is inconvenient for gcc to rely on
linker to report errors, adding TLS check in the assembler stage so
that gcc can know TLS errors earlier.
gas/ChangeLog:
PR gas/32022
* config.in: Regenerate.
* config/tc-i386.c
*(enum x86_tls_error_type): New.
*(struct _i386_insn): Added has_gotrel to indicate whether TLS
relocations need to be checked.
(x86_check_tls_relocation): Added a new function to check TLS
relocation.
(x86_report_tls_error): Created a new function to report TLS error.
(i386_assemble): Handle x86_check_tls_relocation.
(lex_got): Set i.has_gotrel.
(OPTION_MTLS_CHECK): Added a new option to contrl TLS check.
(struct option): Ditto.
(md_parse_option): Ditto.
(md_show_usage): Ditto.
* configure.ac: Added a new option to check TLS relocation by
default.
* configure: Regenerated.
* doc/c-i386.texi: Document -mtls-check=.
* testsuite/gas/i386/i386.exp: Added new tests.
* testsuite/gas/i386/ilp32/ilp32.exp: Ditto.
* testsuite/gas/i386/ilp32/reloc64.d: Disable TLS check for it.
* testsuite/gas/i386/ilp32/x32-tls.d: Ditto.
* testsuite/gas/i386/inval-tls.l: Added more test cases.
* testsuite/gas/i386/inval-tls.s: Ditto.
* testsuite/gas/i386/reloc32.d: Disable TLS check for it.
* testsuite/gas/i386/reloc64.d: Ditto.
* testsuite/gas/i386/x86-64-inval-tls.l: Added more test cases.
* testsuite/gas/i386/x86-64-inval-tls.s: Ditto.
* testsuite/gas/i386/x86-64.exp: Added new tests.
* testsuite/gas/i386/ilp32/x32-inval-tls.l: New test.
* testsuite/gas/i386/ilp32/x32-inval-tls.s: Ditto.
* testsuite/gas/i386/ilp32/x86-64-tls.d: Ditto.
* testsuite/gas/i386/tls.d: Ditto.
* testsuite/gas/i386/tls.s: Ditto.
* testsuite/gas/i386/x86-64-tls.d: Ditto.
* testsuite/gas/i386/x86-64-tls.s: Ditto.
ld/ChangeLog:
PR gas/32022
* testsuite/ld-i386/tlsgdesc1.d: Disable TLS check for it.
* testsuite/ld-i386/tlsgdesc2.d: Ditto.
* testsuite/ld-i386/tlsie2.d: Ditto.
* testsuite/ld-i386/tlsie3.d: Ditto.
* testsuite/ld-i386/tlsie4.d: Ditto.
* testsuite/ld-i386/tlsie5.d: Ditto.
* testsuite/ld-i386/tlsgdesc3.d: Ditto.
* testsuite/ld-x86-64/tlsdesc3.d: Ditto.
* testsuite/ld-x86-64/tlsdesc4.d: Ditto.
* testsuite/ld-x86-64/tlsie2.d: Ditto.
* testsuite/ld-x86-64/tlsie3.d: Ditto.
* testsuite/ld-x86-64/tlsie5.d: Ditto.
* testsuite/ld-x86-64/tlsdesc5.d: Ditto.
R_X86_64_GOT64 relocation should never be made section relative. Change
tc_i386_fix_adjustable to return 0 for BFD_RELOC_X86_64_GOT64.
gas/
PR gas/32189
* config/tc-i386.c (tc_i386_fix_adjustable): Return 0 for
BFD_RELOC_X86_64_GOT64.
* testsuite/gas/i386/reloc64.d: Updated.
* testsuite/gas/i386/reloc64.s: Add more tests for R_X86_64_GOT64
and R_X86_64_GOTOFF64.
ld/
PR gas/32189
* testsuite/ld-x86-64/x86-64.exp: Run PR gas/32189 test.
* testsuite/ld-x86-64/pr32189.s: New file.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
V{BROADCAST,EXTRACT,INSERT}{F,I}128 and VROUND{P,S}{S,D} aren't promoted
to support EGPR in APX spec. Don't promote them out of APX spec. This
commit effectively reverted:
ec3babb8c1 x86/APX: V{BROADCAST,EXTRACT,INSERT}{F,I}128 can also be expressed
5a635f1f59 x86/APX: VROUND{P,S}{S,D} encodings require AVX512{F,VL}
eea4357967 x86/APX: VROUND{P,S}{S,D} can generally be encoded
gas/
PR gas/32171
* testsuite/gas/i386/x86-64-apx-egpr-promote-inval.s: Add
V{BROADCAST,EXTRACT,INSERT}{F,I}128 tests with EGPR.
* testsuite/gas/i386/x86-64-apx-evex-promoted.s: Remove
V{BROADCAST,EXTRACT,INSERT}{F,I}128 and VROUND{P,S}{S,D} tests
with EGPR.
* testsuite/gas/i386/x86-64-apx-egpr-inval.l: Updated.
* testsuite/gas/i386/x86-64-apx-egpr-promote-inval.l: Likewise.
* testsuite/gas/i386/x86-64-apx-evex-promoted-intel.d: Likewise.
* testsuite/gas/i386/x86-64-apx-evex-promoted-wig.d: Likewise.
* testsuite/gas/i386/x86-64-apx-evex-promoted.d: Likewise.
opcodes/
PR gas/32171
* i386-opc.tbl: Remove V{BROADCAST,EXTRACT,INSERT}{F,I}128 and
VROUND{P,S}{S,D} entries with EGPR.
* i386-tbl.h: Regenerated.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
A sequence such as:
li at,-1
bne xx,at,0f
li at,1
dsll32 at,at,0x1f
is produced in the expansion of the DDIV and DREM assembly macros, where
a redundant `li at,1' instruction is used to load an intermediate value
of 1 into $at, which is then left-shifted by 63 with `dsll32 at,at,0x1f'
yielding 0x8000000000000000. However this value likewise results from
left-shifting the value of -1, already present in $at at this point.
Remove the extraneous instruction then, shortening the sequence emitted.
Adjust dumps in the testsuite accordingly.
Add `--show-raw-insn' to division tests so as to verify branch offsets
without the need to know actual offsets into the text section individual
instructions have been assembled at. Add `-z' where applicable to make
interlock NOP instructions appear in output so as to verify them without
the need to know the offsets too. Replace individual offsets to match
against with generic patterns so that a change in the expansion of an
assembly macro does not affect code that follows.
This leverages commit ("s390: Simplify (dis)assembly of insn operands
with const bits") to relax the operand constraints of the immediate
operand that contains the constant Z- or T-bit of the following extended
mnemonics:
risbgz, risbgnz, risbhgz, risblgz, rnsbgt, rosbgt, rxsbgt
Previously those instructions were the only ones where the assembler
on s390 restricted the specification of the subject I3/I4 operand values
exactly according to their specification to an unsigned 6- or 5-bit
unsigned integer. For any other instructions the assembler allows to
specify any operand value allowed by the instruction format, regardless
of whether the instruction specification is more restrictive.
Allow to specify the subject I3/I4 operand as unsigned 8-bit integer
with the constant operand bits being ORed during assembly.
Relax the instructions subject significant operand bit masks to only
consider the Z/T-bit as significant, so that the instructions get
disassembled as their *z or *t flavor regardless of whether any reserved
bits are set in addition to the Z/T-bit.
Adapt the rnsbg, rosbg, and rxsbg test cases not to inadvertently set
the T-bit in operand I3, as they otherwise get disassembled as their
rnsbgt, rosbgt, and rxsbgt counterpart.
This aligns GNU Assembler to LLVM Assembler.
opcodes/
* s390-opc.c (U6_18, U5_27, U6_26): Remove.
(INSTR_RIE_RRUUU2, INSTR_RIE_RRUUU3, INSTR_RIE_RRUUU4): Define
as INSTR_RIE_RRUUU while retaining insn fmt mask.
(MASK_RIE_RRUUU2, MASK_RIE_RRUUU3, MASK_RIE_RRUUU4): Treat only
Z/T-bit of I3/I4 operand as significant.
gas/testsuite/
* gas/s390/zarch-z10.s (rnsbg, rosbg, rxsbg): Do not set T-bit.
Reported-by: Dominik Steenken <dost@de.ibm.com>
Suggested-by: Ulrich Weigand <ulrich.weigand@de.ibm.com>
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
Simplify assembly and disassembly of extended mnemonics with operands
with constant ORed bits:
Their instruction template already contains the respective constant
operand bits, as they are significant to distinguish the extended from
their base mnemonic. Operands are ORed into the instruction template.
Therefore it is not necessary to OR the constant bits into the operand
value during assembly in s390_insert_operand.
Additionally the constant operand bits from the instruction template
can be used to mask them from the operand value during disassembly in
s390_print_insn_with_opcode. For now do so for non-length unsigned
integer operands only.
The separate instruction formats need to be retained, as their masks
differ, which is relevant during disassembly to distinguish the base
and extended mnemonics from each other.
This affects the following extended mnemonics:
- vfaebs, vfaehs, vfaefs
- vfaezb, vfaezh, vfaezf
- vfaezbs, vfaezhs, vfaezfs
- vstrcbs, vstrchs, vstrcfs
- vstrczb, vstrczh, vstrczf
- vstrczbs, vstrczhs, vstrczfs
- wcefb, wcdgb
- wcelfb, wcdlgb
- wcfeb, wcgdb
- wclfeb, wclgdb
- wfisb, wfidb, wfixb
- wledb, wflrd, wflrx
include/
* opcode/s390.h (S390_OPERAND_OR1, S390_OPERAND_OR2,
S390_OPERAND_OR8): Remove.
opcodes/
* s390-opc.c (U4_OR1_24, U4_OR2_24, U4_OR8_28): Remove.
(INSTR_VRR_VVV0U1, INSTR_VRR_VVV0U2, INSTR_VRR_VVV0U3): Define
as INSTR_VRR_VVV0U0 while retaining respective insn fmt mask.
(INSTR_VRR_VV0UU8): Define as INSTR_VRR_VV0UU while retaining
respective insn fmt mask.
(INSTR_VRR_VVVU0VB1, INSTR_VRR_VVVU0VB2, INSTR_VRR_VVVU0VB3):
Define as INSTR_VRR_VVVU0VB while retaining respective insn fmt
mask.
* s390-dis.c (s390_print_insn_with_opcode): Mask constant
operand bits set in insn template of non-length unsigned
integer operands.
gas/
* config/tc-s390.c (s390_insert_operand): Do not OR constant
operand value bits.
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
What's past an active .end directive (when that has its default purpose)
is supposed to be entirely ignored. That should be true not just for
regular processing, but also for "pre-processing" (aka scrubbing). A
complication is that such a directive may of course occur inside a
(false) conditional or a macro definition. To deal with that make sure
we can continue as usual if called another time.
Note however that .end inside a macro will still have the full macro
body expanded; dealing with that would require further (perhaps
intrusive) adjustments in sb_scrub_and_add_sb() and/or callers thereof.
However, at least some of the warnings issued by do_scrub_chars() are
unlikely to occur when expanding a macro. (If we needed to go that far,
presumably .exitm would also want recognizing.)
In that mode the comment char is ; while @ has no special meaning.
Engaging the special logic in that case results in comments not being
respected on .symver lines.
Error messages there would better not be followed by further "junk at
end of line" diagnostics. Arrange for this to be the case uniformly.
While there also replace a somewhat unhelpful open-coding of
restore_line_pointer().
Document the s390-specific assembler syntax introduced by commit
aacf780bca ("s390: Allow to explicitly omit base register operand in
assembly") to omit the base register operand B in D(X,B) and D(L,B) by
coding D(X,) and D(L,).
While at it document the alternative syntax to omit the index register
operand X in D(X,B) by coding D(,B) instead of D(B).
gas/
* doc/c-s390.texi (s390 Operands): Document syntax to omit base
register operand.
Fixes: aacf780bca ("s390: Allow to explicitly omit base register operand in assembly")
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
The precedence of the operators "+" and "-" in the current loongarch
instruction expression is higher than "<<" and ">>", which is different
from the explanation in the user guide.
We modified the precedence of "<<" and ">>" to be higher than "+" and "-".
LoongArch: Add macros to get opcode and register of instructions appropriately
Currently, we get opcode of an instruction by manipulate the binary with
it's mask, it's a bit of a pain. Now a macro is defined to do this and a
macro to get the RD and RJ registers which is applicable to most instructions
of LoongArch are added.