On some platforms, tolower() is implemented as a function call, in
order to handle locale support. We never change locales, so cache the
results of tolower() in a table, so we don't have to sit through the
function call every time.
~1.3% overall performance improvement on a macro-heavy benchmark under
Linux x86-64.
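A minimal sketch of the table approach, assuming it is filled once at
startup before any lookups. The names lower_tab, lower_tab_init() and
fast_tolower() are hypothetical and are not taken from the source tree.

```c
#include <ctype.h>
#include <limits.h>

/* Hypothetical names; the commit message does not show the real ones. */
static unsigned char lower_tab[UCHAR_MAX + 1];

/* Fill the table once, while the locale is still the startup default. */
void lower_tab_init(void)
{
    int i;

    for (i = 0; i <= UCHAR_MAX; i++)
        lower_tab[i] = (unsigned char)tolower(i);
}

/* A plain array lookup replaces the tolower() function call. */
static inline unsigned char fast_tolower(unsigned char c)
{
    return lower_tab[c];
}
```

The table is only valid as long as the locale never changes after
lower_tab_init() has run, which is the assumption stated above.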
- Add %warning directive
- Only unquote an %error or %warning string if it is the only thing on
the directive line.
- Don't expand macros inside a quoted string, even for %error.
Add the -MP option to emit phony targets. Since this means each
header file has to be visited more than once, change the
implementation to use an internal list of all the dependencies, and
centralize the emission of the dependency files.
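The idea can be sketched as follows, assuming the dependencies have
already been collected into an array. emit_deps() and its output layout
are illustrative assumptions, not the actual implementation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: write one rule for the target, then (if phony
 * is set) an empty rule for each dependency, so that make does not
 * fail when a header is deleted.  The list is walked twice, which is
 * why it has to be kept in memory.
 */
void emit_deps(FILE *f, const char *target,
               const char *const *deps, size_t ndeps, bool phony)
{
    size_t i;

    fprintf(f, "%s:", target);
    for (i = 0; i < ndeps; i++)
        fprintf(f, " %s", deps[i]);
    fputc('\n', f);

    if (phony) {
        for (i = 0; i < ndeps; i++)
            fprintf(f, "\n%s:\n", deps[i]);
    }
}
```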
Implement the dependency options:
-MF: set the file to which dependencies are written.
-MD: generate dependencies in parallel with compilation.
-MT: set the name of the dependency target.
-MQ: same as -MT, but *attempt* to quote it for Makefile safety.
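As an illustration of what "attempt to quote" might mean for -MQ, here
is a simplified sketch. The function name quote_for_make() and the
exact set of characters escaped are assumptions; real Makefile quoting
has more corner cases (backslashes before spaces, for example).

```c
#include <stddef.h>

/*
 * Hypothetical helper: copy `in` to `out`, doubling '$' and putting a
 * backslash before space, tab and '#'.  Returns 0 on success, -1 if
 * the output buffer is too small.
 */
int quote_for_make(const char *in, char *out, size_t outsz)
{
    size_t o = 0;

    for (; *in; in++) {
        if (o + 3 > outsz)      /* up to two characters plus the NUL */
            return -1;
        switch (*in) {
        case '$':
            out[o++] = '$';
            out[o++] = '$';
            break;
        case ' ':
        case '\t':
        case '#':
            out[o++] = '\\';
            out[o++] = *in;
            break;
        default:
            out[o++] = *in;
            break;
        }
    }
    if (o >= outsz)             /* only reachable for an empty input */
        return -1;
    out[o] = '\0';
    return 0;
}
```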
First cut at AVX machinery support. The only instruction implemented
is VPERMIL2PS, and it's probably buggy. I'm checking this in with the
hope that other people can start helping out with (a) testing this,
and (b) adding instructions.
NDISASM support is not there yet.
This checkin creates the following date and time macros:
__DATE__, __TIME__, __UTC_DATE__, __UTC_TIME__: strings
__DATE_NUM__, __TIME_NUM__, __UTC_DATE_NUM__, __UTC_TIME_NUM__:
civil dates in digit-string formats
__POSIX_TIME__: time in POSIX time_t format
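A sketch of how the UTC variants could be derived from a single time_t
sample. The struct, the function name and the exact string formats
(YYYY-MM-DD, HH:MM:SS and their digit-only forms) are assumptions made
for illustration; the local-time macros would use localtime() in the
same way.

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical container for the generated macro bodies. */
struct time_macros {
    char utc_date[16];      /* __UTC_DATE__      (assumed YYYY-MM-DD) */
    char utc_time[16];      /* __UTC_TIME__      (assumed HH:MM:SS)   */
    char utc_date_num[16];  /* __UTC_DATE_NUM__  (assumed YYYYMMDD)   */
    char utc_time_num[16];  /* __UTC_TIME_NUM__  (assumed HHMMSS)     */
    char posix_time[24];    /* __POSIX_TIME__    (seconds since epoch) */
};

/* Format everything from one time_t so all macros agree with each other. */
int make_utc_macros(time_t t, struct time_macros *m)
{
    struct tm *gm_p = gmtime(&t);
    struct tm gm;

    if (!gm_p)
        return -1;
    gm = *gm_p;             /* copy out of the static buffer */

    strftime(m->utc_date, sizeof m->utc_date, "%Y-%m-%d", &gm);
    strftime(m->utc_time, sizeof m->utc_time, "%H:%M:%S", &gm);
    strftime(m->utc_date_num, sizeof m->utc_date_num, "%Y%m%d", &gm);
    strftime(m->utc_time_num, sizeof m->utc_time_num, "%H%M%S", &gm);
    snprintf(m->posix_time, sizeof m->posix_time, "%lld", (long long)t);
    return 0;
}
```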
is_suppressed_warning() should never return true unless we're actually
dealing with a warning. There is a handful of cases where we pass
ERR_PASS1 down together with errors, but that's mostly because it fits
into an overall pattern. Thus, ignore it.
For PASS1 warnings, only do them when pass0 == 1. The prior passes
are to be considered training passes. This is a bit awkward if we
then hit an error, but it's better than n repeated warnings.
The five-pass-minimum was a hack for a bug which I think is identified
now. Doesn't really change the fact that if you want the optimizer,
you probably want -Ox.
We have a number of bug reports about things not working properly when
the optimizer is running out of passes. I suspect the reason is
simply that we don't properly execute the final passes (pass0 = 1, 2)
when hitting the limit. Make sure we advance pass0 the last few
times.
Avoid redundant error messages:
./nasm
nasm: error: no input file specified
nasm: fatal: file `' is both input and output file
type `nasm -h' for help
... which is more than a wee bit confusing to the user.
Add gcc-style -Wxxx -Wno-xxx warning selection as an alternative to
-w+xxx/-w-xxx.
Add "all" as an alias for all (actual) warnings.
Add "error" to treat warnings as errors.
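A rough sketch of the option handling, assuming the caller has already
stripped the leading "-W". The table contents, the default states and
the name parse_W_option() are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct warning {
    const char *name;
    bool enabled;
};

/* Illustrative table only; not the real list of warnings. */
static struct warning warnings[] = {
    { "orphan-labels",      true  },
    { "number-overflow",    true  },
    { "gnu-elf-extensions", false },
};
#define NWARN (sizeof warnings / sizeof warnings[0])

static bool warnings_are_errors = false;

/* Handle "xxx", "no-xxx", "all", "no-all", "error" and "no-error". */
bool parse_W_option(const char *arg)
{
    bool enable = true;
    size_t i;

    if (!strncmp(arg, "no-", 3)) {
        enable = false;
        arg += 3;
    }

    if (!strcmp(arg, "error")) {
        warnings_are_errors = enable;
        return true;
    }
    if (!strcmp(arg, "all")) {
        for (i = 0; i < NWARN; i++)
            warnings[i].enabled = enable;
        return true;
    }
    for (i = 0; i < NWARN; i++) {
        if (!strcmp(arg, warnings[i].name)) {
            warnings[i].enabled = enable;
            return true;
        }
    }
    return false;           /* unknown warning name */
}
```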
Address data is always int64_t even if the size itself is smaller;
this was broken on big-endian hosts (still needs testing!)
Create simple "write sized object" macros.
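One way such macros could look, assuming the output format is
little-endian (as for x86 object code). write_le() and WRITE_SIZED()
are hypothetical names. Shifting the value out one byte at a time
gives the same result on any host, which avoids the kind of
big-endian breakage described above.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: store the low `size` bytes of v at p in
 * little-endian order, independent of the host byte order.
 */
static inline void write_le(uint8_t *p, int64_t v, size_t size)
{
    uint64_t u = (uint64_t)v;
    size_t i;

    for (i = 0; i < size; i++) {
        p[i] = (uint8_t)(u & 0xff);
        u >>= 8;
    }
}

/* Write a sized object and advance the output pointer past it. */
#define WRITE_SIZED(p, v, size)                         \
    do {                                                \
        write_le((uint8_t *)(p), (v), (size));          \
        (p) += (size);                                  \
    } while (0)
```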
Actually generate the appropriate floating-point warnings, and only
one per assembly, pretty please.
Correct the round-to-overflow condition; as written all numbers with a
positive exponent were considered overflows!
Proper use of bool and enum makes code easier to debug. Do more of
it. In particular, we really should stomp out any residual uses of
magic constants that aren't enums or, in some cases, even #defines.
Both C and C++ have "bool", "true" and "false" in lower case; C
requires <stdbool.h> for this, in C++ it is an inherent type built
into the compiler. Use those instead of the old macros; emulate with
a simple typedef enum if unavailable.
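A sketch of the fallback. The real build presumably keys this off a
configure-time check; testing __STDC_VERSION__ here is an assumption
made to keep the example self-contained.

```c
#if defined(__cplusplus)
/* bool, true and false are built into the language. */
#elif defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
# include <stdbool.h>
#else
/* Pre-C99 compiler: emulate with an enum, false == 0 and true == 1. */
typedef enum { false, true } bool;
#endif

/* Example use: the intent is clearer than with an int return value. */
static bool is_power_of_two(unsigned int v)
{
    return v != 0 && (v & (v - 1)) == 0;
}
```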
SAAs were never intended to allow random access, but several backends
do random or semirandom access via saa_fread() and saa_fwrite()
anyway. Rewrite the SAA system to allow for efficient random access.
On "label.pl 10000000" this improves performance by a factor of 12.
Change cloc_t to struct location, and reorder the members so that it
should fit in 16 bytes instead of needing 8 bytes of extra padding on
64-bit machines.
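The effect of member ordering can be illustrated like this. The
member names and types below are assumptions chosen for the example,
and the sizes in the comments assume a typical LP64 ABI with 8-byte
alignment for int64_t.

```c
#include <stdint.h>

/* Illustrative "before": the 64-bit member sits between two 32-bit ones. */
struct loc_original {
    int32_t segment;    /* 4 bytes, then 4 bytes of padding   */
    int64_t offset;     /* 8 bytes, must be 8-byte aligned    */
    int     known;      /* 4 bytes, then 4 bytes tail padding */
};                      /* typically 24 bytes                 */

/* Illustrative "after": largest member first, small ones packed together. */
struct loc_reordered {
    int64_t offset;     /* 8 bytes */
    int32_t segment;    /* 4 bytes */
    int     known;      /* 4 bytes */
};                      /* typically 16 bytes, no padding     */
```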
Change loc_t to cloc_t to avoid AIX conflict.
We really shouldn't use _t names at all; they are usually considered
platform types, but worry about that later.
Concentrate compiler dependencies in compiler.h; make sure compiler.h
is included first in every .c file (since some prototypes may depend
on the presence of feature test macros).
Actually use the conditional inclusion of various functions (totally
broken in previous releases.)
Implement the -MG option, to generate dependencies in the presence of
generated files. In the end, we probably need to support the full
gamut of GCC-like dependency-generation options.
Finish the perfect hash tokenizer, and actually enable it.
Move stdscan() et al to a separate file, since it's not needed in any
of the clients of nasmlib other than nasm itself.
Run make alldeps.
Implement "REL" and "ABS" modifiers for offsets in 64-bit mode. This
replaces "rip+XXX" type addressing. The infrastructure to set the default
mode is there, but there is nothing to throw the switch just yet.
- MOV gpr,CRx or MOV CRx,gpr can access high control registers with a LOCK
prefix; handle that in both the assembler and disassembler.
- Get a saner error message when trying to access high resources in
non-64-bit mode.