binutils-gdb/gas
Andrew Burgess 202be274a4 opcodes/i386: remove trailing whitespace from insns with zero operands
While working on another patch[1] I had need to touch this code in
i386-dis.c:

  ins->obufp = ins->mnemonicendp;
  for (i = strlen (ins->obuf) + prefix_length; i < 6; i++)
    oappend (ins, " ");
  oappend (ins, " ");
  (*ins->info->fprintf_styled_func)
    (ins->info->stream, dis_style_mnemonic, "%s", ins->obuf);

What this code does is add whitespace after the instruction mnemonic
and before the instruction operands.

The problem I ran into when working on this code can be seen by
assembling this input file:

    .text
    nop
    retq

Now, when I disassemble, here's the output.  I've replaced trailing
whitespace with '_' so that the issue is clearer:

    Disassembly of section .text:

    0000000000000000 <.text>:
       0:	90                   	nop
       1:	c3                   	retq___

Notice that there's no trailing whitespace after 'nop', but there are
three spaces after 'retq'!

What happens is that instruction mnemonics are emitted into a buffer
instr_info::obuf, then instr_info::mnemonicendp is setup to point to
the '\0' character at the end of the mnemonic.

When we emit the whitespace, this is then added starting at the
mnemonicendp position.  Lets consider 'retq', first the buffer is
setup like this:

  'r' 'e' 't' 'q' '\0'

Then we add whitespace characters at the '\0', converting the buffer
to this:

  'r' 'e' 't' 'q' ' ' ' ' ' ' '\0'

However, 'nop' is actually an alias for 'xchg %rax,%rax', so,
initially, the buffer is setup like this:

  'x' 'c' 'h' 'g' '\0'

Then in NOP_Fixup we spot that we have an instruction that is an alias
for 'nop', and adjust the buffer to this:

  'n' 'o' 'p' '\0' '\0'

The second '\0' is left over from the original buffer contents.
However, when we rewrite the buffer, we don't afjust mnemonicendp,
which still points at the second '\0' character.

Now, when we insert whitespace we get:

  'n' 'o' 'p' '\0' ' ' ' ' ' ' ' ' '\0'

Notice the whitespace is inserted after the first '\0', so, when we
print the buffer, the whitespace is not printed.

The fix for this is pretty easy, I can change NOP_Fixup to adjust
mnemonicendp, but now a bunch of tests start failing, we now produce
whitespace after the 'nop', which the tests don't expect.

So, I could update the tests to expect the whitespace....

...except I'm not a fan of trailing whitespace, so I'd really rather
not.

Turns out, I can pretty easily update the whitespace emitting code to
spot instructions that have zero operands and just not emit any
whitespace in this case.  So this is what I've done.

I've left in the fix for NOP_Fixup, I think updating mnemonicendp is
probably a good thing, though this is not really required any more.

I've then updated all the tests that I saw failing to adjust the
expected patterns to account for the change in whitespace.

[1] https://sourceware.org/pipermail/binutils/2022-April/120610.html
2022-05-27 14:12:33 +01:00
..
config Remove use of bfd_uint64_t and similar 2022-05-27 22:08:59 +09:30
doc
po
testsuite opcodes/i386: remove trailing whitespace from insns with zero operands 2022-05-27 14:12:33 +01:00
.gitignore
acinclude.m4
aclocal.m4
app.c
as.c
as.h
asintl.h
atof-generic.c
bignum.h
bit_fix.h
cgen.c
cgen.h
ChangeLog oops - forgot changelog entry for the previous delta. 2022-05-18 16:26:21 +01:00
ChangeLog-0001
ChangeLog-0203
ChangeLog-2004
ChangeLog-2005
ChangeLog-2006
ChangeLog-2007
ChangeLog-2008
ChangeLog-2009
ChangeLog-2010
ChangeLog-2011
ChangeLog-2012
ChangeLog-2013
ChangeLog-2014
ChangeLog-2015
ChangeLog-2016
ChangeLog-2017
ChangeLog-2018
ChangeLog-2019
ChangeLog-2020
ChangeLog-9295
ChangeLog-9697
ChangeLog-9899
compress-debug.c
compress-debug.h
cond.c gas: don't ignore .linefile inside false conditionals 2022-05-18 09:37:00 +02:00
config.in
configure
configure.ac
configure.com
configure.tgt
CONTRIBUTORS
COPYING
debug.c
dep-in.sed
depend.c
dw2gencfi.c
dw2gencfi.h
dwarf2dbg.c
dwarf2dbg.h
ecoff.c
ecoff.h
ehopt.c
emul-target.h
emul.h
expr.c
expr.h
flonum-copy.c
flonum-konst.c
flonum-mult.c
flonum.h
frags.c
frags.h
gdbinit.in
hash.c
hash.h
input-file.c
input-file.h
input-scrub.c
itbl-lex-wrapper.c
itbl-lex.h
itbl-lex.l
itbl-ops.c
itbl-ops.h
itbl-parse.y
listing.c
listing.h
literal.c
macro.c gas: tweak .irp and alike file/line handling for M68K/MRI 2022-05-18 09:35:29 +02:00
macro.h
MAINTAINERS
Makefile.am
Makefile.in
makefile.vms
messages.c
NEWS
obj.h
output-file.c
output-file.h
read.c gas: avoid octal numbers being accepted when processing .linefile 2022-05-18 09:38:40 +02:00
read.h gas: fold do_repeat{,_with_expander}() 2022-05-18 09:37:34 +02:00
README
remap.c
sb.c
sb.h
stabs.c
subsegs.c
subsegs.h
symbols.c Replace bfd_hostptr_t with uintptr_t 2022-05-27 22:08:59 +09:30
symbols.h
tc.h
write.c Replace bfd_hostptr_t with uintptr_t 2022-05-27 22:08:59 +09:30
write.h ppc: extend opindex to 16 bits 2022-05-25 12:13:44 +09:30

This file contains invisible Unicode characters

This file contains invisible Unicode characters that are indistinguishable to humans but may be processed differently by a computer. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

		README for GAS

A number of things have changed since version 1 and the wonderful
world of gas looks very different.  There's still a lot of irrelevant
garbage lying around that will be cleaned up in time.  Documentation
is scarce, as are logs of the changes made since the last gas release.
My apologies, and I'll try to get something useful.

Unpacking and Installation - Summary
====================================

See ../binutils/README.

To build just the assembler, make the target all-gas.

Documentation
=============

The GAS release includes texinfo source for its manual, which can be processed
into `info' or `dvi' forms.

The DVI form is suitable for printing or displaying; the commands for doing
this vary from system to system.  On many systems, `lpr -d' will print a DVI
file.  On others, you may need to run a program such as `dvips' to convert the
DVI file into a form your system can print.

If you wish to build the DVI file, you will need to have TeX installed on your
system.  You can rebuild it by typing:

	cd gas/doc
	make as.dvi

The Info form is viewable with the GNU Emacs `info' subsystem, or the
stand-alone `info' program, available as part of the GNU Texinfo distribution.
To build the info files, you will need the `makeinfo' program.  Type:

	cd gas/doc
	make info

Specifying names for hosts and targets
======================================

   The specifications used for hosts and targets in the `configure'
script are based on a three-part naming scheme, but some short
predefined aliases are also supported.  The full naming scheme encodes
three pieces of information in the following pattern:

     ARCHITECTURE-VENDOR-OS

   For example, you can use the alias `sun4' as a HOST argument or in a
`--target=TARGET' option.  The equivalent full name is
`sparc-sun-sunos4'.

   The `configure' script accompanying GAS does not provide any query
facility to list all supported host and target names or aliases.
`configure' calls the Bourne shell script `config.sub' to map
abbreviations to full names; you can read the script, if you wish, or
you can use it to test your guesses on abbreviations--for example:

     % sh config.sub i386v
     i386-unknown-sysv
     % sh config.sub i786v
     Invalid configuration `i786v': machine `i786v' not recognized


`configure' options
===================

   Here is a summary of the `configure' options and arguments that are
most often useful for building GAS.  `configure' also has several other
options not listed here.

     configure [--help]
               [--prefix=DIR]
               [--srcdir=PATH]
               [--host=HOST]
               [--target=TARGET]
               [--with-OPTION]
               [--enable-OPTION]

You may introduce options with a single `-' rather than `--' if you
prefer; but you may abbreviate option names if you use `--'.

`--help'
     Print a summary of the options to `configure', and exit.

`-prefix=DIR'
     Configure the source to install programs and files under directory
     `DIR'.

`--srcdir=PATH'
     Look for the package's source code in directory DIR.  Usually
     `configure' can determine that directory automatically.

`--host=HOST'
     Configure GAS to run on the specified HOST.  Normally the
     configure script can figure this out automatically.

     There is no convenient way to generate a list of all available
     hosts.

`--target=TARGET'
     Configure GAS for cross-assembling programs for the specified
     TARGET.  Without this option, GAS is configured to assemble .o files
     that run on the same machine (HOST) as GAS itself.

     There is no convenient way to generate a list of all available
     targets.

`--enable-OPTION'
     These flags tell the program or library being configured to
     configure itself differently from the default for the specified
     host/target combination.  See below for a list of `--enable'
     options recognized in the gas distribution.

`configure' accepts other options, for compatibility with configuring
other GNU tools recursively; but these are the only options that affect
GAS or its supporting libraries.

The `--enable' options recognized by software in the gas distribution are:

`--enable-targets=...'
     This causes one or more specified configurations to be added to those for
     which BFD support is compiled.  Currently gas cannot use any format other
     than its compiled-in default, so this option is not very useful.

`--enable-bfd-assembler'
     This causes the assembler to use the new code being merged into it to use
     BFD data structures internally, and use BFD for writing object files.
     For most targets, this isn't supported yet.  For most targets where it has
     been done, it's already the default.  So generally you won't need to use
     this option.

Compiler Support Hacks
======================

On a few targets, the assembler has been modified to support a feature
that is potentially useful when assembling compiler output, but which
may confuse assembly language programmers.  If assembler encounters a
.word pseudo-op of the form symbol1-symbol2 (the difference of two
symbols), and the difference of those two symbols will not fit in 16
bits, the assembler will create a branch around a long jump to
symbol1, and insert this into the output directly before the next
label: The .word will (instead of containing garbage, or giving an
error message) contain (the address of the long jump)-symbol2.  This
allows the assembler to assemble jump tables that jump to locations
very far away into code that works properly.  If the next label is
more than 32K away from the .word, you lose (silently); RMS claims
this will never happen.  If the -K option is given, you will get a
warning message when this happens.


REPORTING BUGS IN GAS
=====================

Bugs in gas should be reported to:

   https://sourceware.org/bugzilla/

See ../binutils/README for what we need in a bug report.

Copyright (C) 2012-2022 Free Software Foundation, Inc.

Copying and distribution of this file, with or without modification,
are permitted in any medium without royalty provided the copyright
notice and this notice are preserved.