For 16-bit and 32-bit x86 code, the size and realsize() always
matches as only jumps, calls and loops uses PC relative
addressing and the address isn't followed by any other opcode
bytes. In 64-bit mode there is RIP relative addressing which
means the fixup location can be followed by an immediate value,
meaning that size > realsize().
When the CPU is calculating the effective address, it takes the
RIP at the end of the instruction and adds the fixed up relative
address value to it.
The linker's point of reference is the end of the fixup location
(which is the end of the instruction for Jcc, CALL, LOOP[cc]).
It is calculating distance between the target symbol and the end
of the fixup location, and add this to the displacement value we
are calculating here and storing at the fixup location.
To get the right effect, we need to _reduce_ the displacement
value by the number of bytes following the fixup.
Example:
data at address 0x100; REL4ADR at 0x050, 4 byte immediate,
end of fixup at 0x054, end of instruction at 0x058.
=> size = 8.
=> realsize() -> 4
=> CPU needs a value of: 0x100 - 0x058 = 0x0a8
=> linker/loader will add: 0x100 - 0x054 = 0x0ac
=> We must add an addend of -4.
=> realsize() - size = -4.
The code used to do size - realsize() at least since v0.90,
probably because it wasn't needed...
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Concatenating the cwd with the name of the output file is incorrect
for filenames which are specified as absolute. We already have
nasm_realpath() for this purpose, use it.
Cc: Jim Kukunas <james.t.kukunas@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Instead of walking a linear list of files for every line, make a
simple comparison for the common case of the same file, and otherwise
use a hash table.
Cc: Jim Kukunas <james.t.kukunas@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
This essentially reverts 6503051dcc360172c49311d586f2b9cf4ab2ea81 since
that workaround is no longer needed thanks to support for multiple source
files
Signed-off-by: Jim Kukunas <james.t.kukunas@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Handle the existence of multiple source files, as is normal when using
include files.
Signed-of-by: Jim Kukunas <james.t.kukunas@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Previously, debug info would refer to the first file seen, even
when it did not actually generate line numbers (e.g. segto=-1).
Fix it so we only lock in the file name the first time we actually
produce a line number record. Not as good as proper support for
debug info referencing multiple source files but much more useful
than the current behavior.
Signed-off-by: Fabian Giesen <fabiang@radgametools.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
When assembling on Windows machines with CRLF line endings, computing
the MD5 hash from the file read in "text" mode (transforms CRLF->LF)
gives incorrect results.
Signed-off-by: Fabian Giesen <fabiang@radgametools.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
The hash calculation in calc_md5 tries to open the source file via
"filename" again. For %includes, this is the file name that was
specified in the %include directive, not the actual name of the file
that was opened by the preprocessor. In other words, this fails if the
include file is not in the current working directory.
Add pp_input_fopen that uses the preprocessor include path lookup
code to resolve a file name and open it, and use that in codeview.c.
Signed-off-by: Fabian Giesen <fabiang@radgametools.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Fix a missing brace introduced in checkin
84f6860ed53492976c9d79e9a8a0bdc60da78bc6. This was a transcription
error of mine; Zenith432's original patch was correct.
Cc: Zenith432 <zenith432@users.sourceforge.net>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
IP-relative relocations were broken for 32-bit Mach-O when referencing
external symbols after the Mach-O backends were merged.
This fixes bug reports 3392348, 3392352, and 3392346.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Avoid funnies with ordering of debug format selection by deferring
debug format search until after command line processing. Also permit
the -gformat syntax used by many C compilers.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Get rid of the completely pointless "debuginfo" parameter to
ofmt->cleanup(). Most backends completely ignore it, and the two that
care (obj, ieee) can simply test dfmt instead.
Also, dfmt is never NULL, so any test for a NULL dfmt is bogus.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Now when labels are properly concatenated in common code, there is no
reason for the debugging backend to need to be aware of local
symbols. We don't have to consider ..[^@] special symbols either, as
they are now filtered in labels.c.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Debugged-by: Jim Kukunas <james.t.kukunas@linux.intel.com>
labels.c now filter ..[^@] special symbols from the debug backend,
so we don't have to open-code that everywhere.
In the actual output format, don't treat ..@ symbols as special.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Jim Kukunas <james.t.kukunas@linux.intel.com>
labels.c now filter ..[^@] special symbols from the debug backend, so
we don't have to open-code that everywhere.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Jim Kukunas <james.t.kukunas@linux.intel.com>
Move ofmt->current_dfmt into a separate global variable. This
should allow us to make ofmt readonly and removes some additional
gratuitious differences between backends.
From master branch checkin a7bc15dd0aa963ab56a98ee04bc08ce6d9e3baea
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
We pass around a whole bunch of function pointers in arguments,
which then just get stashed in static variables. Clean this mess
up and in particular handle the error management in the preprocessor
using nasm_set_verror() which already exists.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
From master branch checkin 130736c0cfcad28ee16cec6c14bb22999d982e5a
Resolved Conflicts:
nasm.c
preproc-nop.c
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Replace all instances of ERR_FATAL or ERR_PANIC with nasm_fatal or
nasm_panic so the compiler knows that these functions cannot return,
*and* we trigger abort() if we were to ever violate that constraint.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
When we have to die due to an assertion violation, then show the
missing symbol. Also, use nasm_panic() rather than nasm_assert() for
this purpose.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Remove unused debugging functions, and the _unused macro which turned
out to cause compilation problems on Linux/PowerPC.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
In order to make it more likely to compile cleanly with "C90 plus long
long" style compilers, remove existing constructs (mostly commas at
the end of enums) that aren't compliant.
Ironically enough this was most likely an unintentional omission in
C90...
From master branch checkin 7214d18b405f883010a74a3f8281c7906a5a21ca
Resolved Conflicts:
output/outelf32.c
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Fixes Visual C++ 2010 breakage in recently added Codeview 8 code;
these are C99 features which were not necessary to introduce.
Signed-off-by: Knut St. Osmundsen <bird-nasm@anduin.net>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Correctly generate references between sections. The previous
version would work correctly as long as all relative references
came from the first section, which is usually __TEXT,__text and
so it usually worked.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
For 64 bits, a BRANCH reloc is sometimes needed to fix up PIC
problems. Make a best effort at generating BRANCH relocs just as
we make a best effort at distinguishing GOTLOAD from GOT.
This needs to be replaced with information from the assembler to
the backend.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
The __TEXT segment in particular contains both code and data. The
most consistent thing is to look only at the section name, and have
the same behavior across sections.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Make a point of the output format constants instead of making it
a pointer. The output format is set only once, but it is accessed
all the time.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Correct the handling of GOT relocations, as they need a symbol
reference. Add handling of TLVP relocations; it is unclear to me
if non-local relocations in TLVP space is permitted.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
For the mapping of .rodata to __TEXT,__const in the absence of
relocations, it would help if we changed the segment name *before* we
emit that part of the load command.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Sanitize the handling of sections in outmacho somewhat. This should
bring further performance improvements.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
If we specify .rodata as opposed to the explicit __DATA,__const, and
we end up with no relocations, change it to __TEXT,__const per the
Mach-O ABI. However, it is generally better for the programmer to
explicitly specify the items that should go into __TEXT,__const as
otherwise a single relocatable item will force the whole thing into
__DATA.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Allow specifying sections with arbitary MachO segment and section
names, as opposed to having a fixed list of supported sections
(especially __DATA,__const is wrong in some cases.) Furthermore,
we do a completely unnecessary lookup of the bss section *for every
call to macho_output()* which is just plain crazy.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Exceeding MAX_SECT is not a warning, it is a fatal error. However,
there is no point to test for it until we already process all the
sections.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
When we clear the ext bit, creating section-relative relocations,
the resulting value is computed somewhat differently; we need to
adjust for that.
TODO: Need to make sure we do the right thing for ALL relocations.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
We generate section-relative relocations for local symbols for all
the other output formats, and we should do the same for MachO;
this was done in MachO-32 but not in MachO-64, presumably because
the MachO spec implies that such relocations shouldn't exist in
64-bit code. They are indeed rare, but that is a programmer's
decision, and the spec is clear that they are legal.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
The name for the macho32 output format was incorrectly set to
macho64, which means neither macho32 nor macho64 worked correctly.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Allow section alignment to be declared more than once, with different
values. The strictest alignment value via either a section or
sectalign directive becomes the controlling parameter.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Merge the two Mach-O backends for cleanliness and maintainability.
This should also make the recent fixes to MachO-64 available in
MachO-32.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
When the macho64 backend was forked, instead of fixing variables which
ought to have been static all along, the porter added a -64 suffix to
prevent namespace conflict. Fix it by making those variables static.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Fix an array that was way too small resulting in memory overwrite
errors, and free a few more dynamic data structures.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Hopefully actually fix the issues with alignment this time.
Avoid a linear search of segments for each symbol emitted.
Issue an empty LC_DATA_IN_CODE command since that seems to be
expected.
With this, ffmpeg builds but still crashes on startup, which seems
very strange.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>