Mirror of https://sourceware.org/git/binutils-gdb.git, synced 2025-01-12 12:16:04 +08:00
2c3b9a9130
The i386 disassembler is pretty complex. Most disassembly is done indirectly; operands are built into buffers within a struct instr_info instance, before finally being printed later in the disassembly process.

Sometimes the operand buffers are built in a different order to the order in which they will eventually be printed.

Each operand can contain multiple components, e.g. multiple registers, immediates, and other textual elements (commas, brackets, etc).

When looking for how to apply styling, I guess the ideal solution would be to move away from the operands being a single string that is built up, and instead have each operand be a list of "parts", where each part is some text and a style. Then, when we eventually print the operand, we would loop over the parts and print each part with the correct style.

But it feels like a huge amount of work to move from where we are now to that potentially ideal solution. Plus, the above solution would be pretty complex.

So, instead, I propose a different solution here, one that works with the existing infrastructure.

As each operand is built up, piece by piece, we pass through style information. This style information is then encoded into the operand buffer (see below for details). After this the code can continue to operate as it does right now in order to manage the set of operand buffers.

Then, as each operand is printed, we can split the operand buffer into chunks at the style marker boundaries, with each chunk being printed with the correct style.

For encoding the style information I use a single character, currently \002, followed by the style encoded as a single hex digit, followed again by the \002 character.

This of course relies on there not being more than 16 styles, but that is currently true, and hopefully will remain true for the foreseeable future.

The other major concern that has arisen around this work is whether the escape character could ever be encountered in output naturally generated by the disassembler. If this did happen then the escape characters would be stripped from the output, and the wrong styling would be applied.

However, I don't believe that this is currently a problem. Disassembler content comes from a number of sources.

First, there's content that is copied directly from the i386-dis.c file: things like register names and other syntax elements (brackets, commas, etc). We can easily check that the i386-dis.c file doesn't contain our special character.

The next source of content is immediate operands. The text for these operands is generated by calls into libc. By selecting a non-printable character we can be confident that this is not something that libc will generate as part of an immediate representation.

The other output that appears to be from the disassembler is operands that contain addresses and (possibly) symbol names. It is quite possible that a symbol name might contain any special character we could imagine, so is this a problem?

I don't think it is. We don't actually print address and symbol operands through the disassembler; instead, the disassembler calls back to the user (objdump, gdb, etc) to print the address and symbol on its behalf. This content is printed directly to the output stream and does not pass through the i386 disassembler output buffers. As a result, we never check this particular output for styling escape characters.

In some (not very scientific) benchmarking on my machine, disassembling a reasonably large (142M) shared library, I'm not seeing any significant slowdown in disassembler speed with this change.

Most instructions are now being fully syntax highlighted when I disassemble using the --disassembler-color=extended-color option. I'm sure that there are probably still a few corner cases that need fixing up, but we can come back to them later, I think.

When disassembler syntax highlighting is not being used, there should be no user-visible changes after this commit.

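The marker scheme described in the commit message can be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not the code from i386-dis.c: the names `append_styled`, `for_each_chunk`, `collect_chunk`, `struct sink`, and the style numbers in `enum example_style` are all hypothetical and chosen for this example. The only details taken from the text are the \002 marker character and the single hex digit between two markers.

```c
#include <stdio.h>
#include <string.h>

/* Marker character named in the commit message.  */
#define STYLE_MARKER_CHAR '\002'

/* Hypothetical style numbers; each must fit in one hex digit (0..15).  */
enum example_style { STYLE_TEXT = 0, STYLE_REGISTER = 4, STYLE_IMMEDIATE = 5 };

/* Hypothetical callback type: receives one chunk of text and its style.  */
typedef void (*chunk_fn) (unsigned style, const char *text, size_t len,
                          void *data);

/* Hypothetical sink that records chunks as "[style:text]" for inspection.  */
struct sink { char text[128]; };

/* Append "\002<hex digit>\002" followed by TEXT to the string in BUF.  */
void
append_styled (char *buf, size_t size, unsigned style, const char *text)
{
  size_t used = strlen (buf);

  snprintf (buf + used, size - used, "%c%x%c%s",
            STYLE_MARKER_CHAR, style & 0xf, STYLE_MARKER_CHAR, text);
}

/* Return the value of hex digit C, or -1 if C is not a hex digit.  */
int
hex_value (char c)
{
  if (c >= '0' && c <= '9')
    return c - '0';
  if (c >= 'a' && c <= 'f')
    return c - 'a' + 10;
  return -1;
}

/* Split BUF at style markers and pass each non-empty chunk to FN.
   Text before the first marker is reported with style STYLE.  */
void
for_each_chunk (const char *buf, unsigned style, chunk_fn fn, void *data)
{
  const char *start = buf;
  const char *p = buf;

  while (*p != '\0')
    {
      if (p[0] == STYLE_MARKER_CHAR
          && hex_value (p[1]) >= 0
          && p[2] == STYLE_MARKER_CHAR)
        {
          if (p > start)
            fn (style, start, (size_t) (p - start), data);
          style = (unsigned) hex_value (p[1]);
          p += 3;                 /* Skip marker, digit, marker.  */
          start = p;
        }
      else
        p++;
    }
  if (p > start)
    fn (style, start, (size_t) (p - start), data);
}

/* Example callback: append "[style:text]" to the sink.  */
void
collect_chunk (unsigned style, const char *text, size_t len, void *data)
{
  struct sink *s = data;
  size_t used = strlen (s->text);

  snprintf (s->text + used, sizeof s->text - used, "[%x:%.*s]",
            style, (int) len, text);
}
```

Building an operand with `append_styled` for a register, a comma, and an immediate, and then walking it with `for_each_chunk`, yields one callback per styled piece. A real printer would call a styled output function at that point instead of recording text in a sink.
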
||
---|---|---|
po | ||
.gitignore | ||
aarch64-asm-2.c | ||
aarch64-asm.c | ||
aarch64-asm.h | ||
aarch64-dis-2.c | ||
aarch64-dis.c | ||
aarch64-dis.h | ||
aarch64-gen.c | ||
aarch64-opc-2.c | ||
aarch64-opc.c | ||
aarch64-opc.h | ||
aarch64-tbl.h | ||
aclocal.m4 | ||
alpha-dis.c | ||
alpha-opc.c | ||
arc-dis.c | ||
arc-dis.h | ||
arc-ext-tbl.h | ||
arc-ext.c | ||
arc-ext.h | ||
arc-fxi.h | ||
arc-nps400-tbl.h | ||
arc-opc.c | ||
arc-regs.h | ||
arc-tbl.h | ||
arm-dis.c | ||
avr-dis.c | ||
bfin-dis.c | ||
bpf-asm.c | ||
bpf-desc.c | ||
bpf-desc.h | ||
bpf-dis.c | ||
bpf-ibld.c | ||
bpf-opc.c | ||
bpf-opc.h | ||
cgen-asm.c | ||
cgen-asm.in | ||
cgen-bitset.c | ||
cgen-dis.c | ||
cgen-dis.in | ||
cgen-ibld.in | ||
cgen-opc.c | ||
cgen.sh | ||
ChangeLog | ||
ChangeLog-0001 | ||
ChangeLog-0203 | ||
ChangeLog-2004 | ||
ChangeLog-2005 | ||
ChangeLog-2006 | ||
ChangeLog-2007 | ||
ChangeLog-2008 | ||
ChangeLog-2009 | ||
ChangeLog-2010 | ||
ChangeLog-2011 | ||
ChangeLog-2012 | ||
ChangeLog-2013 | ||
ChangeLog-2014 | ||
ChangeLog-2015 | ||
ChangeLog-2016 | ||
ChangeLog-2017 | ||
ChangeLog-2018 | ||
ChangeLog-2019 | ||
ChangeLog-2020 | ||
ChangeLog-9297 | ||
ChangeLog-9899 | ||
config.in | ||
configure | ||
configure.ac | ||
configure.com | ||
cr16-dis.c | ||
cr16-opc.c | ||
cris-desc.c | ||
cris-desc.h | ||
cris-dis.c | ||
cris-opc.c | ||
cris-opc.h | ||
crx-dis.c | ||
crx-opc.c | ||
csky-dis.c | ||
csky-opc.h | ||
d10v-dis.c | ||
d10v-opc.c | ||
d30v-dis.c | ||
d30v-opc.c | ||
dep-in.sed | ||
dis-buf.c | ||
dis-init.c | ||
disassemble.c | ||
disassemble.h | ||
dlx-dis.c | ||
epiphany-asm.c | ||
epiphany-desc.c | ||
epiphany-desc.h | ||
epiphany-dis.c | ||
epiphany-ibld.c | ||
epiphany-opc.c | ||
epiphany-opc.h | ||
fr30-asm.c | ||
fr30-desc.c | ||
fr30-desc.h | ||
fr30-dis.c | ||
fr30-ibld.c | ||
fr30-opc.c | ||
fr30-opc.h | ||
frv-asm.c | ||
frv-desc.c | ||
frv-desc.h | ||
frv-dis.c | ||
frv-ibld.c | ||
frv-opc.c | ||
frv-opc.h | ||
ft32-dis.c | ||
ft32-opc.c | ||
h8300-dis.c | ||
hppa-dis.c | ||
i386-dis-evex-len.h | ||
i386-dis-evex-mod.h | ||
i386-dis-evex-prefix.h | ||
i386-dis-evex-reg.h | ||
i386-dis-evex-w.h | ||
i386-dis-evex.h | ||
i386-dis.c | ||
i386-gen.c | ||
i386-init.h | ||
i386-opc.c | ||
i386-opc.h | ||
i386-opc.tbl | ||
i386-reg.tbl | ||
i386-tbl.h | ||
ia64-asmtab.c | ||
ia64-asmtab.h | ||
ia64-dis.c | ||
ia64-gen.c | ||
ia64-ic.tbl | ||
ia64-opc-a.c | ||
ia64-opc-b.c | ||
ia64-opc-d.c | ||
ia64-opc-f.c | ||
ia64-opc-i.c | ||
ia64-opc-m.c | ||
ia64-opc-x.c | ||
ia64-opc.c | ||
ia64-opc.h | ||
ia64-raw.tbl | ||
ia64-war.tbl | ||
ia64-waw.tbl | ||
ip2k-asm.c | ||
ip2k-desc.c | ||
ip2k-desc.h | ||
ip2k-dis.c | ||
ip2k-ibld.c | ||
ip2k-opc.c | ||
ip2k-opc.h | ||
iq2000-asm.c | ||
iq2000-desc.c | ||
iq2000-desc.h | ||
iq2000-dis.c | ||
iq2000-ibld.c | ||
iq2000-opc.c | ||
iq2000-opc.h | ||
lm32-asm.c | ||
lm32-desc.c | ||
lm32-desc.h | ||
lm32-dis.c | ||
lm32-ibld.c | ||
lm32-opc.c | ||
lm32-opc.h | ||
lm32-opinst.c | ||
loongarch-coder.c | ||
loongarch-dis.c | ||
loongarch-opc.c | ||
m32c-asm.c | ||
m32c-desc.c | ||
m32c-desc.h | ||
m32c-dis.c | ||
m32c-ibld.c | ||
m32c-opc.c | ||
m32c-opc.h | ||
m32r-asm.c | ||
m32r-desc.c | ||
m32r-desc.h | ||
m32r-dis.c | ||
m32r-ibld.c | ||
m32r-opc.c | ||
m32r-opc.h | ||
m32r-opinst.c | ||
m68hc11-dis.c | ||
m68hc11-opc.c | ||
m68k-dis.c | ||
m68k-opc.c | ||
m10200-dis.c | ||
m10200-opc.c | ||
m10300-dis.c | ||
m10300-opc.c | ||
MAINTAINERS | ||
Makefile.am | ||
Makefile.in | ||
makefile.vms | ||
mcore-dis.c | ||
mcore-opc.h | ||
mep-asm.c | ||
mep-desc.c | ||
mep-desc.h | ||
mep-dis.c | ||
mep-ibld.c | ||
mep-opc.c | ||
mep-opc.h | ||
metag-dis.c | ||
microblaze-dis.c | ||
microblaze-dis.h | ||
microblaze-opc.h | ||
microblaze-opcm.h | ||
micromips-opc.c | ||
mips16-opc.c | ||
mips-dis.c | ||
mips-formats.h | ||
mips-opc.c | ||
mmix-dis.c | ||
mmix-opc.c | ||
moxie-dis.c | ||
moxie-opc.c | ||
msp430-decode.c | ||
msp430-decode.opc | ||
msp430-dis.c | ||
mt-asm.c | ||
mt-desc.c | ||
mt-desc.h | ||
mt-dis.c | ||
mt-ibld.c | ||
mt-opc.c | ||
mt-opc.h | ||
nds32-asm.c | ||
nds32-asm.h | ||
nds32-dis.c | ||
nds32-opc.h | ||
nfp-dis.c | ||
nios2-dis.c | ||
nios2-opc.c | ||
ns32k-dis.c | ||
opc2c.c | ||
opintl.h | ||
or1k-asm.c | ||
or1k-desc.c | ||
or1k-desc.h | ||
or1k-dis.c | ||
or1k-ibld.c | ||
or1k-opc.c | ||
or1k-opc.h | ||
or1k-opinst.c | ||
pdp11-dis.c | ||
pdp11-opc.c | ||
pj-dis.c | ||
pj-opc.c | ||
ppc-dis.c | ||
ppc-opc.c | ||
pru-dis.c | ||
pru-opc.c | ||
riscv-dis.c | ||
riscv-opc.c | ||
rl78-decode.c | ||
rl78-decode.opc | ||
rl78-dis.c | ||
rx-decode.c | ||
rx-decode.opc | ||
rx-dis.c | ||
s12z-dis.c | ||
s12z-opc.c | ||
s12z-opc.h | ||
s390-dis.c | ||
s390-mkopc.c | ||
s390-opc.c | ||
s390-opc.txt | ||
score7-dis.c | ||
score-dis.c | ||
score-opc.h | ||
sh-dis.c | ||
sh-opc.h | ||
sparc-dis.c | ||
sparc-opc.c | ||
spu-dis.c | ||
spu-opc.c | ||
sysdep.h | ||
tic4x-dis.c | ||
tic6x-dis.c | ||
tic30-dis.c | ||
tic54x-dis.c | ||
tic54x-opc.c | ||
tilegx-dis.c | ||
tilegx-opc.c | ||
tilepro-dis.c | ||
tilepro-opc.c | ||
v850-dis.c | ||
v850-opc.c | ||
vax-dis.c | ||
visium-dis.c | ||
visium-opc.c | ||
wasm32-dis.c | ||
xc16x-asm.c | ||
xc16x-desc.c | ||
xc16x-desc.h | ||
xc16x-dis.c | ||
xc16x-ibld.c | ||
xc16x-opc.c | ||
xc16x-opc.h | ||
xgate-dis.c | ||
xgate-opc.c | ||
xstormy16-asm.c | ||
xstormy16-desc.c | ||
xstormy16-desc.h | ||
xstormy16-dis.c | ||
xstormy16-ibld.c | ||
xstormy16-opc.c | ||
xstormy16-opc.h | ||
xtensa-dis.c | ||
z8k-dis.c | ||
z8k-opc.h | ||
z8kgen.c | ||
z80-dis.c |