The Netwide Disassembler, NDISASM
=================================

Introduction
============

The Netwide Disassembler is a small companion program to the Netwide
Assembler, NASM. It seemed a shame to have an x86 assembler,
complete with a full instruction table, and not make as much use of
it as possible, so here's a disassembler which shares the
instruction table (and some other bits of code) with NASM.

The Netwide Disassembler does nothing except produce disassemblies
of _binary_ source files. NDISASM does not have any understanding of
object file formats, as `objdump' does, and it will not understand
DOS .EXE files as `debug' will. It just disassembles.

Getting Started: Installation
=============================

See `nasm.doc' for installation instructions. NDISASM, like NASM,
has a man page which you may want to put somewhere useful, if you
are on a Unix system.

Running NDISASM
===============

To disassemble a file, you will typically use a command of the form

        ndisasm [-b16 | -b32] filename

NDISASM can disassemble 16-bit code or 32-bit code equally easily,
provided of course that you remember to specify which it is to work
with. If no `-b' switch is present, NDISASM works in 16-bit mode by
default. The `-u' switch (for USE32) also invokes 32-bit mode.

Two more command line options are `-r', which reports the version
number of NDISASM you are running, and `-h', which gives a short
summary of command line options.

COM Files: Specifying an Origin
===============================

To disassemble a DOS .COM file correctly, a disassembler must assume
that the first instruction in the file is loaded at address 0x100,
rather than at zero. NDISASM, which assumes by default that any file
you give it is loaded at zero, will therefore need to be informed of
this.

The `-o' option allows you to declare a different origin for the
file you are disassembling. Its argument may be expressed in any of
the NASM numeric formats: it is decimal by default; if it begins
with `$' or `0x', or ends in `H', it's hex; if it ends in `Q' it's
octal; and if it ends in `B' it's binary.

Hence, to disassemble a .COM file:

        ndisasm -o100h filename.com

will do the trick.

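As an illustration of the numeric formats accepted by `-o', here is
a minimal Python sketch of a parser for them. It is not NDISASM's
own code: the name `parse_number' is made up for this example, and
two details are assumptions rather than documented behaviour, namely
that letters are matched case-insensitively and that a `$' or `0x'
prefix is checked before any suffix (so `0x1B' is hex, not binary).

```python
def parse_number(text: str) -> int:
    """Parse a number in one of the formats described for `-o'.

    Hypothetical helper for illustration only; the real parser
    lives inside NASM/NDISASM and may differ in edge cases.
    """
    s = text.strip().lower()          # assumption: case-insensitive
    if s.startswith("0x"):            # hex, C-style prefix
        return int(s[2:], 16)
    if s.startswith("$"):             # hex, `$' prefix
        return int(s[1:], 16)
    if s.endswith("h"):               # hex, `H' suffix
        return int(s[:-1], 16)
    if s.endswith("q"):               # octal, `Q' suffix
        return int(s[:-1], 8)
    if s.endswith("b"):               # binary, `B' suffix
        return int(s[:-1], 2)
    return int(s, 10)                 # decimal by default


if __name__ == "__main__":
    for sample in ("100h", "0x100", "$100", "400q", "256"):
        print(sample, "->", parse_number(sample))
```

All five sample strings denote the same origin, the one a .COM file
needs.
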
Code Following Data: Synchronisation
====================================

Suppose you are disassembling a file which contains some data which
isn't machine code, and _then_ contains some machine code. NDISASM
will faithfully plough through the data section, producing machine
instructions wherever it can (although most of them will look
bizarre, and some may have unusual prefixes, e.g. `fs or
ax,0x240a'), and generating `db' instructions every so often if it's
totally stumped. Then it will reach the code section.

Now suppose NDISASM has just finished generating a strange machine
instruction from part of the data section, and its file position is
now one byte _before_ the beginning of the code section. It's
entirely possible that another spurious instruction will get
generated, starting with the final byte of the data section, and
then the correct first instruction in the code section will not be
seen because the starting point skipped over it. This isn't really
ideal.

To avoid this, you can specify a `synchronisation' point, or indeed
as many synchronisation points as you like (although NDISASM can
only handle 8192 sync points internally). The definition of a sync
point is this: NDISASM guarantees to hit sync points exactly during
disassembly. If it is thinking about generating an instruction which
would cause it to jump over a sync point, it will discard that
instruction and output a `db' instead. So it _will_ start
disassembly exactly from the sync point, and so you _will_ see all
the instructions in your code section.

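The sync-point rule can be sketched as a small loop. This is a toy
model in Python, not NDISASM's implementation: `walk' and
`decode_len' are hypothetical names, the decoder is a stand-in that
only reports an instruction length, and emitting exactly one byte
per `db' is an assumption made to keep the example short.

```python
from typing import Callable, Iterable, List, Tuple


def walk(decode_len: Callable[[int], int], start: int, end: int,
         sync_points: Iterable[int]) -> List[Tuple[str, int, int]]:
    """Walk [start, end) and never step over a sync point.

    decode_len(addr) is a hypothetical decoder returning the length
    of the instruction at addr, or 0 if nothing decodes there.
    Returns (kind, address, length) records, kind being "insn" or
    "db".
    """
    syncs = sorted(p for p in set(sync_points) if start < p < end)
    out: List[Tuple[str, int, int]] = []
    addr = start
    while addr < end:
        # Nearest boundary ahead: the next sync point, else the end.
        limit = next((p for p in syncs if p > addr), end)
        length = decode_len(addr)
        if length <= 0 or addr + length > limit:
            # Would jump over the boundary: discard, emit one byte.
            out.append(("db", addr, 1))
            addr += 1
        else:
            out.append(("insn", addr, length))
            addr += length
    return out


if __name__ == "__main__":
    three_bytes = lambda addr: 3      # pretend every insn is 3 bytes
    print(walk(three_bytes, 0, 8, []))
    print(walk(three_bytes, 0, 8, [4]))
```

Without a sync point, no instruction in this toy run ever starts at
address 4; with a sync point at 4, the byte at address 3 becomes a
`db' and decoding resumes exactly at 4.
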
Sync points are specified using the `-s' option: they are measured
in terms of the program origin, not the file position. So if you
want to synchronise after 32 bytes of a .COM file, you would have to
do

        ndisasm -o100h -s120h file.com

rather than

        ndisasm -o100h -s20h file.com

As stated above, you can specify multiple sync markers if you need
to, just by repeating the `-s' option.

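Since sync points are given relative to the origin, it can help to
compute them from file offsets. The Python sketch below only builds
an argument list and does not run anything. `ndisasm_command' is a
hypothetical helper, and it assumes that `-s' accepts the same `0x'
hex notation that is documented for `-o'.

```python
from typing import Iterable, List


def ndisasm_command(filename: str, origin: int = 0,
                    sync_file_offsets: Iterable[int] = ()) -> List[str]:
    """Build an ndisasm argument list (hypothetical helper).

    sync_file_offsets are positions counted from the start of the
    file; each one is converted to origin + file offset, which is
    what `-s' expects.
    """
    cmd = ["ndisasm"]
    if origin:
        cmd.append(f"-o0x{origin:x}")
    for file_offset in sync_file_offsets:
        # Assumption: `-s' parses numbers the same way `-o' does.
        cmd.append(f"-s0x{origin + file_offset:x}")
    cmd.append(filename)
    return cmd


if __name__ == "__main__":
    # Sync 32 bytes into a .COM file loaded at 0x100.
    print(" ".join(ndisasm_command("file.com", 0x100, [32])))
```

For the .COM example above this yields `-o0x100 -s0x120', the same
addresses as `-o100h -s120h'.
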
Mixed Code and Data: Automatic (Intelligent) Synchronisation
============================================================

Suppose you are disassembling the boot sector of a DOS floppy (maybe
it has a virus, and you need to understand the virus so that you
know what kinds of damage it might have done to you). Typically,
this will contain a JMP instruction, then some data, then the rest
of the code. So there is a very good chance of NDISASM being
misaligned when the data ends and the code begins. Hence a sync
point is needed.

On the other hand, why should you have to specify the sync point
manually? What you'd do in order to find where the sync point would
be, surely, would be to read the JMP instruction, and then to use
its target address as a sync point. So can NDISASM do that for you?

The answer, of course, is yes: using either of the synonymous
switches `-a' (for automatic sync) or `-i' (for intelligent sync)
will enable auto-sync mode. Auto-sync mode automatically generates a
sync point for any forward-referring PC-relative jump or call
instruction that NDISASM encounters. (Since NDISASM is one-pass, if
it encounters a PC-relative jump whose target has already been
processed, there isn't much it can do about it...)

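To make the idea concrete, here is a hedged Python sketch of how a
sync point could be derived from one kind of PC-relative jump, the
two-byte short JMP (opcode 0xEB followed by a signed 8-bit
displacement). The function names are invented for this example and
it handles only that single opcode; NDISASM itself covers the other
PC-relative jumps and calls and is certainly organised differently.

```python
from typing import Optional, Set

SHORT_JMP = 0xEB      # x86 opcode: JMP rel8
SHORT_JMP_LEN = 2     # opcode byte plus one displacement byte


def short_jump_target(addr: int, code: bytes) -> Optional[int]:
    """Target of a short JMP located at addr, or None if not one."""
    if len(code) < SHORT_JMP_LEN or code[0] != SHORT_JMP:
        return None
    disp = code[1] - 256 if code[1] >= 128 else code[1]  # signed 8-bit
    # The displacement is relative to the *next* instruction.
    return addr + SHORT_JMP_LEN + disp


def auto_sync(addr: int, code: bytes, syncs: Set[int]) -> None:
    """Record a sync point only for a forward-referring jump."""
    target = short_jump_target(addr, code)
    if target is not None and target >= addr + SHORT_JMP_LEN:
        syncs.add(target)


if __name__ == "__main__":
    syncs: Set[int] = set()
    # A boot sector commonly starts with a short jump over its data.
    auto_sync(0, bytes([0xEB, 0x3C, 0x90]), syncs)
    print([hex(p) for p in sorted(syncs)])
```

A backward jump (negative displacement) adds nothing, matching the
one-pass limitation described above.
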
Only PC-relative jumps are processed, since an absolute jump is
either through a register (in which case NDISASM doesn't know what
the register contains) or involves a segment address (in which case
the target code isn't in the same segment that NDISASM is working
in, and so the sync point can't be placed anywhere useful).

For some kinds of file, this mechanism will automatically put sync
points in all the right places, and save you from having to place
any sync points manually. However, it should be stressed that
auto-sync mode is _not_ guaranteed to catch all the sync points, and
you may still have to place some manually.

Auto-sync mode doesn't prevent you from declaring manual sync
points: it just adds automatically generated ones to the ones you
provide. It's perfectly feasible to specify `-i' _and_ some `-s'
options.

Another caveat with auto-sync mode is that if, by some unpleasant
fluke, something in your data section should disassemble to a
PC-relative call or jump instruction, NDISASM may obediently place a
sync point in a totally random place, for example in the middle of
one of the instructions in your code section. So you may end up with
a wrong disassembly even if you use auto-sync. Again, there isn't
much I can do about this. If you have problems, you'll have to use
manual sync points, or use the `-k' option (documented below) to
suppress disassembly of the data area.

Other Options
=============

The `-e' option skips a header on the file, by ignoring the first N
bytes. This means that the header is _not_ counted towards the
disassembly offset: if you give `-e10 -o10', disassembly will start
at byte 10 in the file, and this will be given offset 10, not 20.

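The offset arithmetic for `-e' combined with `-o' can be written
down in a few lines of Python. `displayed_offset' is a hypothetical
name used only for this sketch; it models the rule stated above and
is not taken from NDISASM's source.

```python
def displayed_offset(file_pos: int, skip: int, origin: int) -> int:
    """Offset shown for the byte at file_pos, given -e<skip> -o<origin>.

    The skipped header does not count, so the first byte after the
    header is shown at exactly `origin'.
    """
    if file_pos < skip:
        raise ValueError("byte lies inside the skipped header")
    return origin + (file_pos - skip)


if __name__ == "__main__":
    # -e10 -o10: the byte at file position 10 is shown at offset 10.
    print(displayed_offset(10, skip=10, origin=10))
```
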
The `-k' option is provided with two comma-separated numeric
arguments, the first of which is an assembly offset and the second
of which is a number of bytes to skip. This _will_ count the skipped
bytes towards the assembly offset: its use is to suppress
disassembly of a data section which wouldn't contain anything you
wanted to see anyway.

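In contrast to `-e', offsets keep counting across a `-k' skip. The
Python sketch below models that by listing which offset ranges
would still be disassembled. The function name and the
half-open-range representation are assumptions made for
illustration, and overlapping skips are merged here without any
claim that NDISASM treats them the same way.

```python
from typing import Iterable, List, Tuple


def ranges_to_disassemble(start: int, end: int,
                          skips: Iterable[Tuple[int, int]]
                          ) -> List[Tuple[int, int]]:
    """Half-open offset ranges left after applying (offset, length) skips."""
    out: List[Tuple[int, int]] = []
    pos = start
    for offset, length in sorted(skips):
        if offset >= end:
            break                     # skip lies beyond the data
        if offset > pos:
            out.append((pos, offset))
        # Skipped bytes still advance the offset.
        pos = max(pos, offset + length)
    if pos < end:
        out.append((pos, end))
    return out


if __name__ == "__main__":
    # Skip 59 bytes of data starting at offset 3 in a 512-byte sector.
    print(ranges_to_disassemble(0, 512, [(3, 59)]))
```

Here disassembly covers offsets 0 to 2, then resumes at offset 62
(0x3E), so the addresses printed for the code after the data are
unchanged by the skip.
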
Bugs and Improvements
=====================

There are no known bugs. However, any you find, with patches if
possible, should be sent to <jules@earthcorp.com> or
<anakin@pobox.com>, and we'll try to fix them. Feel free to send
contributions and new features as well.

Future plans include awareness of which processors certain
instructions will run on, and marking of instructions that are too
advanced for some processor (or are FPU instructions, or are
undocumented opcodes, or are privileged protected-mode instructions,
or whatever).

That's All Folks!
=================

I hope NDISASM is of some use to somebody. Including me. :-)

I don't recommend taking NDISASM apart to see how an efficient
disassembler works, because as far as I know, it isn't an efficient
one anyway. You have been warned.

Please feel free to send comments, suggestions, or chat to
<anakin@pobox.com>. As with NASM, no flames please.

- Simon Tatham <anakin@pobox.com>, 21-Nov-96