For a long time there hasn't been a need anymore to keep together all
templates with identical mnemonics. Move the MOVQ and MOVABS ones next
to their MOV counterparts. Move the string forms of CMPSD and MOVSD next
to their CMPS / MOVS counterparts. Re-arrange what so fgar was the SSE3
section.
This makes reasonably obvious that MONITOR/MWAIT aren't suitable to
cover by CpuSSE3, but adjusting this is left for another time.
In commit 79dec6b7ba ("x86-64: optimize certain commutative
VEX-encoded insns") I missed the fact that there being subtraction
involved here doesn't matter, as absolute differences get summed up.
This way not only the overall (source) table size shrinks by quite a
bit and the risk of related templates going out of sync with one another
gets lowered, but also (dis)similarities between neighboring templates
become easier to spot.
Note that for certain SSE2AVX templates this results in benign attribute
changes:
- LDMXCSR and STMXCSR: NoAVX gets set,
- MOVMSKPS, PMOVMSKB, PEXTR{B,W} (register destination), and PINSR{B,W}
(register source): IgnoreSize and NoRex64 get set,
- CVT{DQ,PS}2PD, CVTSD2SS, MOVMSKPD, MOVDDUP, PMOV{S,Z}X{BW,WD,DQ}, and
ROUNDSD: NoRex64 gets set,
- CVTSS2SD, INSERTPS, PEXTRW (memory destination), PINSRW (memory
source), and PMOV{S,Z}X{BD,WQ,BQ}: IgnoreSize gets set.
Similarly the "normal" (non-SSE2AVX)
- non-64-bit CVTS{I,S}2SD forms get NoRex64 set,
- CMP{EQ,ORD,NEQ,UNORD}{P,S}{S,D} forms get C set,
all again in a benign way.
The remaining differences in the generated table are due to re-ordering
of entries in the course of being folded into templates.
The table entries are more natural to read (and slightly shorter) when
the prefixes, like is the case for VEX/XOP/EVEX-encoded entries, are
specified as part of the opcode. This is particularly noticable for
side-by-side legacy and SSE2AVX entries.
An implication is that we now need to use "unsigned long long" for the
initially parsed opcode in i386-gen. I don't expect this to be an issue.
Now that all base opcodes are only at most 2 bytes in size, shrink its
template field to just as much. By also shrinking extension_opcode and
operands to just what they really need, we can free up an entire 32-bit
slot (plus 4 left bits past the bitfields themselves).
At present this alters sizeof(struct insn_template) only for 32-bit
builds. In 64-bit builds it instead leaves a padding hole that will
allow to buffer future growth of other fields (opcode_modifier,
cpu_flags, operand_types[]).
Just like is already done for VEX/XOP/EVEX encoded insns, record the
encoding space information in the respective opcode modifier field. Do
this again without changing the source table, but rather by deriving the
values from their existing source representation.
While all of MMX, SSE, and SSE2 are included in "generic64", they can be
individually disabled. There are two MOVQ forms lacking respective
attributes. While the MMX one would get refused anyway (due to MMX
registers not recognized with .nommx), the assembler did happily accept
the SSE2 form. Add respective CPU settings to both, paralleling what the
MOVD counterparts have.
The code was checking wrong bit for sign extension. It caused it
to zero-extend instead of sign-extend the immediate value.
2021-03-25 Abid Qadeer <abidh@codesourcery.com>
opcodes/
* nios2-dis.c (nios2_print_insn_arg): Fix sign extension of
immediate in br.n instruction.
gas/
* testsuite/gas/nios2/brn.s: New.
* testsuite/gas/nios2/brn.d: New.
For VEX-encoded ones, all three involved vector registers have to be
distinct. For EVEX-encoded ones an actual mask register has to be in use
and zeroing-masking cannot be used (violation of either will #UD).
Additionally both involved vector registers have to be distinct for
EVEX-encoded gathers.
For INVLPGB the operand count was wrong (besides %edx there's also %ecx
which is an input to the insn). In this case I see little sense in
retaining the bogus 2-operand template. Plus swapping of the operands
wasn't properly suppressed for Intel syntax.
For PVALIDATE, RMPADJUST, and RMPUPDATE bogus single operand templates
were specified. These get retained, as the address operand is the only
one really needed to expressed non-default address size, but only for
compatibility reasons. Proper multi-operand insn get introduced and the
testcases get adjusted / extended accordingly.
While at it also drop the redundant definition of __amd64__ - we already
have x86_64 defined (or not) to distinguish 64-bit and non-64-bit cases.
In the majority of cases we can easily determine the length from the
encoding, irrespective of whether a prefix is specified there as well.
We further don't even need to record the value in the table entries, as
it's easy enough to determine it (without any guesswork, unless an insn
with major opcode 00 appeared that requires a 2nd opcode byte to be
specified explicitly) when installing the chosen template for further
processing.
Should an encoding appear which
- has a major opcode byte of 66, F3, or F2,
- requires a 2nd opcode byte to be specified explicitly,
- doesn't have a mandatory prefix
we'd need to convert all templates presently encoding a mandatory prefix
this way to the Prefix_0X<nn> model to eliminate the respective guessing
i386-gen does.
Just like is already done for legacy encoded insns, record the mandatory
prefix information in the respective opcode modifier field. Do this
without changing the source table, but rather by deriving the values from
their existing source representation.
This is in preparation of opcode_length going away as a field in the
templates. Identify pseudo prefixes by a base opcode of zero instead:
No real prefix has an opcode of zero. This at the same time allows
dropping a curious special case from i386-gen.
Since most attributes are identical for all pseudo prefixes, take the
opportunity and also template them.
In preparation to use PREFIX_0X<nn> attributes also in VEX/XOP/EVEX
encoding templates, renumber the pseudo-enumerators such that their
values can then also be used directly in the respective prefix bit
fields.
Commit 8b65b8953a ("x86: Remove the prefix byte from non-VEX/EVEX
base_opcode") used the opcodeprefix field for two distinct purposes. In
preparation of having VEX/XOP/EVEX and non-VEX templates become similar
in the representatioon of both encoding space and opcode prefixes, split
the field to have a separate one holding an insn's opcode space.
Commit 6ff00b5e12 ("x86/Intel: correct permitted operand sizes for
AVX512 scatter/gather") brought the assembler side of AVX512 S/G insn
handling in line with AVX2's, but the disassembler side was forgotten.
This has the benefit of
- allowing to fold a number of table entries,
- rendering a few #define-s and enumerators unused.
Some of the enumerators have ended up misplaced under the general
current ordering scheme. Move them (and their table entries) around
accordingly. Add a couple of blank lines as separators when close to
code being touched anyway. Also drop the odd 0F from 0FXOP (there's no
"0f" involved there anywhere) infixes where the respective enum gets
played with anyway.
When the VEX.L=1 decode matches that of both EVEX.L'L=1 and EVEX.L'L=2
(typically when all three are invalid) the (smaller) VEX table entry can
be reused by EVEX, instead of duplicating data. (Note that XM and XMM as
well as EXxmm_md and EXd are equivalent at least for the purposes here.)
The order of decodes influences the overall number of table entries.
Reduce table size quite a bit by first decoding few-alternatives
attributes common to all valid leaves.
This also adds a PREFIX_DATA 7531c61332 ("x86: simplify decode of
opcodes valid with (embedded) 66 prefix only") missed to apply to
vbroadcastf64x4.
The order of decodes influences the overall number of table entries.
Reduce table size quite a bit by first decoding few-alternatives
attributes common to all valid leaves.
The order of decodes influences the overall number of table entries.
Reduce table size quite a bit by first decoding few-alternatives
attributes common to all valid leaves.
The order of decodes influences the overall number of table entries.
Reduce table size quite a bit by first decoding few-alternatives
attributes common to all valid leaves.
All encodings not used in this range are (reserved) NOPs. Hence their
decoding should be fully consistent. For this to work the PREFIX_IGNORED
logic needs slightly extending, such that the attribute will also
- have an effect when used inside prefix_table[], yet without always
falling back to using slot 0,
- cause prefixes marked as ignored while decoding through prefix_table[]
to no longer be considered decoded, when encountered in a subsequent
decoding step.
In adjacent code also drop meaningless PREFIX_OPCODE.
RepPrefixOk, HLEPrefixOk, and NoTrackPrefixOk can't be specified
together, so can share an enum-like field. IsLockable can be inferred
from HLE setting and hence only needs specifying when neither of them
is present.
Despite SYSEXIT being an Intel-only insn in long mode, its behavior
there is similar to SYSRET's: Depending on REX.W execution continues in
either 64-bit or compatibility mode. Hence distinguishing by suffix is
as necessary here as it is there.
Having this count explicitly in the table is redundant and (even if just
slightly) disturbs clarity. Infer the count from the number of operands
actually found.
Also convert the "no operands" indicator from "{ 0 }" to just "{}", as
that (now) ends up being easier to parse.
Make the opcode/riscv-opc.c and include/opcode/riscv.h tidy, move the
spec versions stuff to bfd/cpu-riscv.h. Also move the csr stuff and
ext_version_table to gas/config/tc-riscv.c for internal use. To avoid
too many repeated code, define general RISCV_GET_SPEC_NAME/SPEC_CLASS
macros. Therefore, assembler/dis-assembler/linker/gdb can get all spec
versions related stuff from cpu-riscv.h and cpu-riscv.c, since the stuff
are defined there uniformly.
bfd/
* Makefile.am: Added cpu-riscv.h.
* Makefile.in: Regenerated.
* po/SRC-POTFILES.in: Regenerated.
* cpu-riscv.h: Added to support spec versions controlling.
Also added extern arrays and functions for cpu-riscv.c.
(enum riscv_spec_class): Define all spec classes here uniformly.
(struct riscv_spec): Added for all specs.
(RISCV_GET_SPEC_CLASS): Added to reduce repeated code.
(RISCV_GET_SPEC_NAME): Likewise.
(RISCV_GET_ISA_SPEC_CLASS): Added to get ISA spec class.
(RISCV_GET_PRIV_SPEC_CLASS): Added to get privileged spec class.
(RISCV_GET_PRIV_SPEC_NAME): Added to get privileged spec name.
* cpu-riscv.c (struct priv_spec_t): Replaced with struct riscv_spec.
(riscv_get_priv_spec_class): Replaced with RISCV_GET_PRIV_SPEC_CLASS.
(riscv_get_priv_spec_name): Replaced with RISCV_GET_PRIV_SPEC_NAME.
(riscv_priv_specs): Moved below.
(riscv_get_priv_spec_class_from_numbers): Likewise, updated.
(riscv_isa_specs): Moved from include/opcode/riscv.h.
* elfnn-riscv.c: Included cpu-riscv.h.
(riscv_merge_attributes): Initialize in_priv_spec and out_priv_spec.
* elfxx-riscv.c: Included cpu-riscv.h and opcode/riscv.h.
(RISCV_UNKNOWN_VERSION): Moved from include/opcode/riscv.h.
* elfxx-riscv.h: Removed extern functions to cpu-riscv.h.
gas/
* config/tc-riscv.c: Included cpu-riscv.h.
(enum riscv_csr_clas): Moved from include/opcode/riscv.h.
(struct riscv_csr_extra): Likewise.
(struct riscv_ext_version): Likewise.
(ext_version_table): Moved from opcodes/riscv-opc.c.
(default_isa_spec): Updated type to riscv_spec_class.
(default_priv_spec): Likewise.
(riscv_set_default_isa_spec): Updated.
(init_ext_version_hash): Likewise.
(riscv_init_csr_hash): Likewise, also fixed indent.
include/
* opcode/riscv.h: Moved stuff and make the file tidy.
opcodes/
* riscv-dis.c: Included cpu-riscv.h, and removed elfxx-riscv.h.
(default_priv_spec): Updated type to riscv_spec_class.
(parse_riscv_dis_option): Updated.
* riscv-opc.c: Moved stuff and make the file tidy.
There is a tiny error left in dwarf.c:read_leb128 after Nick fixed the
signed overflow problem in code I wrote. It's to do with sleb128
values that have unnecessary excess bytes. For example, -1 is
represented as 0x7f, the most efficient encoding, but also as
0xff,0x7f or 0xff,0xff,0x7f and so on. None of these sequences
overflow any size signed value, but read_leb128 will report an
overflow given enough excess bytes. This patch fixes that problem,
and since the proper test for signed values with excess bytes can
easily be adapted to also test a sleb byte with just some bits that
overflow the result, I changed the code to not use signed right
shifts. (The C standard ISO/IEC 9899:1999 6.5.7 says signed right
shifts of negative values have an implementation defined value. A
long time ago I even used a C compiler for a certain microprocessor
that always did unsigned right shifts. Mind you, it is very unlikely
to be compiling binutils with such a compiler.)
bfd/
* wasm-module.c: Guard include of limits.h.
(CHAR_BIT): Provide backup define.
(wasm_read_leb128): Use CHAR_BIT to size "result" in bits.
Correct signed overflow checking.
opcodes/
* wasm32-dis.c: Include limits.h.
(CHAR_BIT): Provide backup define.
(wasm_read_leb128): Use CHAR_BIT to size "result" in bits.
Correct signed overflow checking.
binutils/
* dwarf.c: Include limits.h.
(CHAR_BIT): Provide backup define.
(read_leb128): Use CHAR_BIT to size "result" in bits. Correct
signed overflow checking.
* testsuite/binutils-all/pr26548.s,
* testsuite/binutils-all/pr26548.d,
* testsuite/binutils-all/pr26548e.d: New tests.
* testsuite/binutils-all/readelf.exp: Run them.
(readelf_test): Drop unused "xfails" parameter. Update all uses.
CVTPI2PD with a memory operand, unlike CVTPI2PS, doesn't engage MMX
logic. Therefore it
- has a proper AVX equivalent (CVTDQ2PD) and hence can be subject to
SSE2AVX translation and SSE checking,
- should not record MMX use in the respective ELF note.
opcodes/
* s390-mkopc.c (main): Accept arch14 as cpu string.
* s390-opc.txt: Add new arch14 instructions.
include/
* opcode/s390.h (enum s390_opcode_cpu_val): Add
S390_OPCODE_ARCH14.
gas/
* config/tc-s390.c (s390_parse_cpu): New entry for arch14.
* doc/c-s390.texi: Document arch14 march option.
* testsuite/gas/s390/s390.exp: Run the arch14 related tests.
* testsuite/gas/s390/zarch-arch14.d: New test.
* testsuite/gas/s390/zarch-arch14.s: New test.
Right now, these libraries hardwire -L../intl -lintl on a few fixed
platforms, which works fine on those platforms but on other platforms
leads to shared libraries that lack libintl_* symbols when configured
--with-included-gettext, and/or static libraries that contain libintl as
*another* static library. If we instead use the LIBINTL variable
defined in ../intl/config.intl, this gives us the right thing on all
three classes of platform (gettext in libc, gettext in system libintl,
gettext in ../intl/libintl.a).. This also means we can rip out some
Darwin-specific machinery from configure.ac and also simplify the Cygwin
side.
This also means that the libctf testsuite (and other places that include
libbfd, libopcodes or libctf) don't need to grow libintl dependencies
just on account of those libraries (though they still need such
dependencies if they themselves use gettext machinery).
bfd/ChangeLog
2021-02-03 Nick Alcock <nick.alcock@oracle.com>
* configure.ac (SHARED_LIBADD): Remove explicit -lintl population in
favour of LIBINTL.
* configure: Regenerated.
libctf/ChangeLog
2021-02-02 Nick Alcock <nick.alcock@oracle.com>
* configure.ac (CTF_LIBADD): Remove explicit -lintl population in
favour of LIBINTL.
* Makefile.am (libctf_nobfd_la_LIBADD): No longer explicitly
include $(LIBINTL).
(check-DEJAGNU): Pass down to tests as well.
* configure: Regenerated.
* Makefile.in: Likewise.
opcodes/ChangeLog
2021-02-04 Nick Alcock <nick.alcock@oracle.com>
* configure.ac (SHARED_LIBADD): Remove explicit -lintl population in
favour of LIBINTL.
* configure: Regenerated.
bfd/
* elfnn-riscv.c: Indent, labels and GNU coding standards tidy,
also aligned the code.
gas/
* config/tc-riscv.c: Indent and GNU coding standards tidy,
also aligned the code.
* config/tc-riscv.h: Likewise.
include/
* opcode/riscv.h: Indent and GNU coding standards tidy,
also aligned the code.
opcodes/
* riscv-opc.c (riscv_gpr_names_abi): Aligned the code.
(riscv_fpr_names_abi): Likewise.
(riscv_opcodes): Likewise.
(riscv_insn_types): Likewise.