Commit Graph

428 Commits

Author SHA1 Message Date
H. Peter Anvin (Intel)
f770ce8be4 preproc: reserve space for terminal NUL in %strcat
Technically, this is not necessary, because make_tok_qstr_len()
doesn't rely on NUL termination, and in fact it *can't*, since the
string might contain embedded NULs, but tacking on a NUL is good for
debugging if nothing else. That means reserving space for it!

Reported-by: C. Masloch <pushbx@ulukai.org>
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-10-17 18:22:43 -07:00
H. Peter Anvin (Intel)
18f4134222 preproc: fix %strcat and %substr
Fix incorrectly running off the end of the intended string for %strcat
and %substr.

This is a modified version of a patch contributed by C. Masloch.

Reported-by: C. Masloch <pushbx@ulukai.org>
Originally-by: C. Masloch <pushbx@ulukai.org>
Link: https://bugzilla.nasm.us/show_bug.cgi?id=3392599#c11
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-10-16 15:02:44 -07:00
H. Peter Anvin (Intel)
e86fa7fffd preproc: %xdefine must process arguments before expanding
The only way in which
    %xdefine(xxx) yyyy zzzz
differs from
    %define(xxx) yyyy %[zzzz]

is that in the former case macro arguments get preserved, even if
they are macros defined elsewhere. Revert to that behavior.

Reported-by: C. Masloch <pushbx@ulukai.org>
Link: https://bugzilla.nasm.us/show_bug.cgi?id=3392623
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-10-16 14:51:16 -07:00
H. Peter Anvin (Intel)
84b852bff0 Implement an enhanced version of MASM's dup() and "db ?" syntax.
Add support for complex data (Dx) statement expressions involving both
initialized and uninitialized data. In addition, we have support for
overriding the size of each element on an individual item and/or list
basis.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-10-16 14:29:16 -07:00
H. Peter Anvin
d03a6c8ffe preproc: fix the detection of the >= operator
There are *four* operators starting with ">": > >> >>> and >=.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2019-10-07 21:29:05 -07:00
H. Peter Anvin
d983b62233 preproc: make %exitrep do what it is supposed to
%exitrep should should stop emitting code immediately, not just
terminate the loop when we hit %endrep. There is a bunch of hacky code
that special-cases that using istk->in_progress == 0.

The handling of the tail of %exitrep, %include and non-emitting
conditionals using entirely different mechanisms is just dumb. They
need to be unified.

Link: https://bugzilla.nasm.us/show_bug.cgi?id=3392612
Reported-by: Jason Hood <jadoxa@yahoo.com.au>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2019-10-07 21:25:18 -07:00
H. Peter Anvin
58bd8e6644 warnings.pl: correct the documentation output for aliases
Expand the list of aliases, not the prefix "="!!

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2019-10-07 21:11:13 -07:00
H. Peter Anvin
bef71a86b9 warnings: do a line break before enabled/disabled note
We need to create a separate paragraph if the help text had used \c
anyway. Putting the enabled/disabled separately for all entries makes
it read a lot cleaner anyway.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2019-10-03 23:47:08 -07:00
H. Peter Anvin
7ad824be7a warnings: make it possible to put blank lines in doc text
rdsrc.pl requires blank lines around \c paragraph, but warnings.pl
would strip them. Create a *!- prefix to force a blank line.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2019-10-03 22:18:35 -07:00
H. Peter Anvin
1dd926e8ce pragma: handle default name/fallback handler for NULL list; cleanups
The previous code would fail to process any directive if the directive
list was NULL. However, we also need to process the default name
passed to search_pragma_list() (e.g. "elf32"), as well as the global
name (e.g. "output") and call the default handler in that case.

In the process, improve the handling such that if one handler returns
DIRR_UNKNOWN, try calling subsequent handlers in the list.

Finally, factor out as much as possible to generic handlers.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2019-09-30 13:30:15 -07:00
H. Peter Anvin
8571f06061 preprocessor: major cleanups; inline text into Token
Major cleanups of the preprocessor. In particular, the
block-allocation of Token is pretty ridiculous since nearly every
token requires a text allocation anyway. Change the definition of
Token so that only very long tokens (48+ characters on 64-bit systems)
need to be stored out of line.

If malloc() preserves alignment (XXX: glibc doesn't) then this means
that each Token will fit in a cache line.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2019-09-23 16:40:03 -07:00
H. Peter Anvin (Intel)
f7dbdb2e13 preproc: fix multiple memory corruption issues
paste_tokens() would not null-terminate the buffer before passing it
to tokenize(), resulting in garbage or a memory overwrite.

In several places the next pointers got confused; sometimes causing a
circular list and sometimes an invalid pointer.

Some minor code cleanups while fixing things, too...

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-09-18 21:20:52 -07:00
H. Peter Anvin (Intel)
f24d975008 preproc: fix list corruption bug; clean up token handling
expand_one_smacro() would corrupt the end of the list if a macro
expanded to another macro with arguments, which was also the last part
of the expansion.

Instead of doing all that testing with ttail, just scan forward at the
end to find the tail pointer; it is O(n) regardless.

Clean up the handling of tokens: use inline functions rather than odd
macros that sometimes modify their arguments and sometimes don't, and
fold some common code into new functions.

The tok_is() and tok_isnt() functions always are used with single
characters, so make it explicitly so (and remove the local hacks used
in some places.)

Allow using nasm_malloc() rather than blocked Tokens; this makes tools
like valgrind more useful in their reports.

For the future, consider making Tokens a separate memory allocation
immediately followed by the text, instead of using a pointer; we
allocate space for the string in almost every case anyway. Also
consider making it a doubly linked list...

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-09-18 18:17:26 -07:00
H. Peter Anvin
dd88aa9a1b preproc: add %ifusable and %ifusing directives
%ifusable tests to see if a certain %use package is available in this
version of NASM.

%ifusing tests if a certain %use packages is already loaded.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-09-12 19:39:48 -07:00
H. Peter Anvin
a039fcdb46 preproc: move %use package parsing to a separate routine
Move the parsing of %use package names to a separate routine, and stop
using get_id() for that purpose -- get_id() is wrong in a number of
ways.

This also means we can drop the error string argument to get_id().

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2019-09-12 19:27:42 -07:00
H. Peter Anvin
14f0328aa1 eval: don't try to poke *opflags if opflags is NULL
While changing this code around to not do redundant lookups, dropped
this NULL pointer check. Oops.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-09-12 18:34:14 -07:00
H. Peter Anvin
d626f355f6 preproc: correct handling of %ifdef for aliases
Correctly handling %ifdef when operating on aliases; we had an
infinite loop going...

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-09-12 18:31:29 -07:00
H. Peter Anvin
86b2e93081 assemble: fix too aggressive dropping of overflow warnings
Drop down to OUT_WRAP when the size is big enough, as opposed to not
doing any tests at all.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-09-12 18:20:07 -07:00
H. Peter Anvin
7ad25b2e18 Change LBL_NONE to LBL_none
NASM convention is to use all-upper-case for "real" information, and
mixed-case (upper case common prefix, lower case description) for
meta-information. This is a highly useful distinction.

Thus "LBL_NONE" implies an actual label of type "NONE", as opposed to
no label at all.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-09-12 20:26:23 -04:00
H. Peter Anvin
90b1ccff86 Drop unnecessary EXTERN symbols
Currently, NASM always issues as an unknown symbol any symbol declared
EXTERN. This is highly undesirable when using common header files,
as it might cause the linker to pull in a bunch of unnecessary
modules, depending on how smart the linker is.

Add a new REQUIRED directive which behaves like the old EXTERN, for
the use cases which might still need this behavior.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-09-12 20:21:03 -04:00
H. Peter Anvin
a73ccfebcc error: replace nasm_verror() indirection with preproc callback
Since pp_error_list_macros() was introduced, the only need for
pp_verror() is to suppress error messages in certain contexts. Replace
this function with a preprocessor callback,
preproc->pp_suppress_error(), so we can drop the nasm_verror()
function pointer entirely.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2019-08-28 19:02:47 -07:00
H. Peter Anvin
6a4353c4c2 errors: be more robust in handling unexpected fatal errors
Introduce a new error level, ERR_CRITICAL, beyond which we will
minimize the amount of code that will be executed before we die; in
particular don't execute any memory allocations, and if we somehow end
up recursing, abort() immediately.

Basically, "less than panic, more than fatal."

At this point this level is used by nasm_alloc_failed().

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2019-08-28 18:32:46 -07:00
H. Peter Anvin
2201ceb238 nasm: avoid null pointer reference on VERY early memory allocation failure
If we get a memory allocation failure before preproc is initialized,
we could end up taking a NULL pointer reference while trying to unwind
macros.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2019-08-27 17:19:07 -07:00
H. Peter Anvin
d235408c65 preproc: standard macros now C-safe, %aliases off, fix %? recursion
Enough users expect the namespace starting with underscore to be safe
for symbols. Change our private namespace from __foo__ to
__?foo?__. Use %defalias to provide backwards compatiblity (by using
%defalias instead of %define, we handle the case properly where the
user changes the value.)

Add a preprocessor directive:

%aliases off

... to disable all smacro aliases and thereby making the namespace
clean.

Finally, fix infinite recursion when seeing %? or %?? due to
paste_tokens(). If we don't paste anything, the expansion is done.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2019-08-27 16:42:41 -07:00
H. Peter Anvin (Intel)
7eb18213b7 preproc: make sure the mmacro params list is NULL-terminated
If we adjust nparams due to default or greedy arguments, we need to
re-terminate the params[] array.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-08-20 16:24:46 -07:00
H. Peter Anvin (Intel)
d4607846a4 preproc: smacro argument lists can't be preceded by space
The smacro argument list cannot be preceded by whitespace, or we
wouldn't be able to define no-argument smacros the expansion of which
starts with (.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-08-20 16:19:37 -07:00
H. Peter Anvin (Intel)
ffe89ddaed preproc: fix comment -La -> -Lm
The -Lm option was briefly called -Lm during development, fix.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-08-20 16:06:36 -07:00
H. Peter Anvin (Intel)
41d91a9273 preproc: mmacro argument fixes; listing option for mmacro args
Correctly handle empty mmacro arguments that still have preceding
whitespace tokens.

Default mmacro parameters are obtained by count_mmac_params() so they,
too, need to be shifted over by one.

Add an option to list mmacro calls with arguments. Name this -Lm;
remove the old -Lm option to -Ls since it is related to single-line
macros.

Trivially optimize the case where an mmacro is called from within
itself: if all possible mmacros are excluded by loop removal, there is
no need to delve into the mmac processing code.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-08-20 16:03:46 -07:00
H. Peter Anvin (Intel)
68075f8fa6 preproc: fix uninitialized variables
Fix uninitialized variables (not just warnings, actual bugs.)

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-08-20 12:28:05 -07:00
H. Peter Anvin (Intel)
a1a844697d preproc: fix varadic macros, add conditional comma operator
Fix the (severely broken handling of) varadic macros.

Add a conditional comma operator "%,". This expands to a comma unless
followed by a null expansion of some sort, which allows suppressing
the comma before an empty argument (usually varadic, but not
necessarily.)

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-08-20 01:32:28 -07:00
H. Peter Anvin (Intel)
62cf4aaef6 preproc: add suppport for stringify, nostrip, greedy smacro arguments
Add a few more types of smacro arguments, and clean stuff up in
process.

If declared with an &, an smacro parameter will be quoted as a string.
If declared with a +, it is a greedy/varadic parameter.
If declared with an !, don't strip whitespace and braces (useful with &).

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-08-20 00:05:41 -07:00
H. Peter Anvin (Intel)
de7acc3a46 preproc: defer %00, %? and %?? expansion for nested macros, cleanups
BR 3392603: When doing nested macro definitions, we need %00, %? and
%?? expansion to be deferred to actual expansion time, just as the
other parameters.

Do major cleanups to the mmacro expansion code.

Reported-by: Alexandre Audibert <alexandre.audibert@outlook.fr>
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-08-19 18:24:02 -07:00
H. Peter Anvin (Intel)
41e9705054 assemble.c: fix signed/unsigned comparison warning
Ponderance: if data->bits < globalbits, should we actually use
OUT_UNSIGNED rather than OUT_WRAP here?

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-08-19 15:47:36 -07:00
H. Peter Anvin (Intel)
b83621350c listing: add the -L+ option to enable all listing options
-L+ or %pragma list options ++ will enable all possible listing
 options.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-08-19 13:11:34 -07:00
H. Peter Anvin (Intel)
2586cee21d BR 3392472: don't complain on wraparound for lower bit modes
If the address we are using is >= the size of the instruction, then
don't complain on overflow as we can wrap around the top and bottom of
the address space just fine.

Alternatively we could downgrade it to OUT_WRAP in that case.

Reported-by: C. Masloch <pushbx@38.de>
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-08-16 01:44:49 -07:00
H. Peter Anvin (Intel)
93d41d8296 BR 3392576: don't segfault on a bad %pragma limit
Don't segfault on a bad %pragma limit. Instead treat a NULL pointer as
an empty string.

Reported-by: Ren Kimura <rkx1209dev@gmail.com>
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-08-16 01:12:54 -07:00
H. Peter Anvin (Intel)
8b6e6bf04f config.h: separate function and function pointer attributes; automate
Separate out function and function pointer attributes, as not all
versions of all compilers support both.

Have macros related to function attributes auto-generated by
autoheader. As a result, rename config.h.in to unconfig.h, to make it
more obvious that it is really intended to be included from some C
programs.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-08-16 00:08:27 -07:00
H. Peter Anvin (Intel)
9fbd9fb859 preproc: fix mmacro nesting prevention
BR 3392602: mmacros should not nest unless so explicitly specified.

Reported-by: C. Masloch <pushbx@38.de>
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-08-15 19:26:52 -07:00
H. Peter Anvin (Intel)
4b282d0503 macros: can't use the __USE_*__ macro string anymore; fix comment stripping
Since we use 127 not 0 for end of line in stdmac packages now, we
can't simply use the __USE_*__ macro as a string to test for a %use
package. Keep an internal array of state instead.

Fix the stripping of comments from lines in macro files.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-08-15 11:53:19 -07:00
H. Peter Anvin (Intel)
6d5c77c95f stdmac: handle up to 160 directives, make macros.c more readable
Handle up to 160 directives for stdmac compression.  This is done by
allowing the directive numbers to wrap around (128-255, 0-31), using
127 for end of line, and forcing any whitespace character to be space.

Make macros.c a bit more legible by using #defines for the byte codes;
strictly for the benefit of the human reader.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-08-15 02:29:40 -07:00
H. Peter Anvin (Intel)
566a0f2187 pptok.pl: don't leak internal codes into pptok.c
If we have internal codes in pptok.c, we may have false matches for
them as tokens, plus, there is no reason for them to exist there. Go
back to putting NULL in those slots.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-08-15 01:37:48 -07:00
H. Peter Anvin (Intel)
97cbdd34d0 preproc: simplify handling of conditionals and casesense
Simplify the handling of conditionals; remove the PPC_* types.

Automate the generation of case-sensitive versus case-insensitive
directives, and make it so the bulk of the code doesn't have to worry
about it.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-08-15 01:14:23 -07:00
H. Peter Anvin (Intel)
5e3d741b00 preproc: introduce alias smacros, cleanups
Introduce "alias smacros", which are the smacro equivalent of
symlinks; when used with the various smacro-defining and undefining
directives, they affect the macro they are aliased to. Only explicit
%defalias, %idefalias, and %undefalias affect them.

This is intended for being able to rename macros while retaining the
legacy names.

This patch also removes an *astonishing* amount of duplicated
code:

1. Every caller to defined_smacro() and undef_smacro() would call
   get_ctx() to mangle the macro name; push that into those functions.
2. Common code to get an smacro identifier.
3. Every code path that returns DIRECTIVE_FOUND also has to do
   free_tlist(origline); make it do so.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-08-14 23:50:37 -07:00
H. Peter Anvin (Intel)
8981724f17 masm.mac, parser: VERY limited MASM emulation package
Very limited MASM emulation.

The parser has been extended to emulate the PTR keyword if the
corresponding macro is enabled, and the syntax displacement[index] for
memory operations is now recognized.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-08-14 15:44:50 -07:00
H. Peter Anvin (Intel)
02b60ddd1c LEA: allow immediate syntax; ignore operand size entirely
The memory operand size of LEA doesn't matter in any way as it isn't
"real memory". Add an ANYSIZE option to ignore sizes entirely.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-08-14 15:23:00 -07:00
H. Peter Anvin
a635809620 list_option_mask(): return 0 (empty mask) for < '0'
Missing limit check at the bottom of the ASCII range.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2019-08-10 18:16:37 -07:00
H. Peter Anvin
d91519a107 listing: encapsulate the list_options encoding, make more comprehensive
Encapsulate the list_options() encoding in an inline function. We only
ever compute a mask with a non-constant input in two places (command
line and pragma parsing), so a slightly more complex mapping is of no
consequence; thus map a-z, A-Z and 0-9 as being the most likely
characters we may want to use as options. Space is left for two more :)

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2019-08-10 18:04:04 -07:00
H. Peter Anvin
59d4ccc2b0 Add %pragma list options
Add a %pragma to set (or clear) listing options. It only takes effect
on the next assembly pass, however!

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2019-08-10 06:45:12 -07:00
H. Peter Anvin
06335873ae preproc: avoid dropping the facility name in %pragma
tline got advanced a token too far, with the obvious results that the
facility name got truncated. Skip whitespace *after* expand_smacro(),
not before.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2019-08-10 06:42:55 -07:00
H. Peter Anvin
f5d7d90148 preproc: fix double free in the handling of %pragma
expand_smacro() consumes its input, so we need to truncate the input
list so we can call free_tline(origline) safely.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2019-08-10 06:21:00 -07:00
H. Peter Anvin
6686de2bab preproc: add listing options to override nolist; some cleanups
Add listing options:

    -Lb    to show builtin macro packages
    -Lf    to override .nolist

Do some cleanups in the process, in particular generalize read_line()
between stdmac and file alternatives.

When processing stdmac, create an istk entry for it. This means stdmac
can be identified by istk->fp == NULL. At some future date there could
even be a function pointer to an appropriate read function.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2019-08-10 05:33:14 -07:00
H. Peter Anvin
3f51082bcd listing: clean up before a restart
With the -Lp option, the listing generator gets invoked multiple times
in the same session. If we already have a list file open, call
list_cleanup() before reinitializing; otherwise we get stray output in
the updated file.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2019-08-10 05:31:51 -07:00
H. Peter Anvin
a2c1c7d0d4 listing: coalesce TIMES in non-final passes, print <len>, clarify hex
Merge TIMES in the nonfinal passes, there is no point in getting <len
...> an arbitrary number of times.

Actually print <len> (OUT_RAWDATA without a data pointer), not <res>
(OUT_RESERVE).

Dropping the zero-fill for the hex format made the listing more
manageable, but it also doesn't immediately look like hex, plus there
is now the -Ld option. Put an h after hex (shorter than leading 0x) to
make it obvious.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2019-08-10 02:45:41 -07:00
H. Peter Anvin
355bfb879d Stylistic improvements to help text
Trivial stylistic improvements to the help text.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2019-08-10 01:55:00 -07:00
H. Peter Anvin
322bee0aac Additional listing options, improve help output, fix macro limits
Additional listing options:

   -Ld to display counts in decimal
   -Lp to output a list file in every pass (to make sure one exists)

Clean up the help output and make it comprehensive. The -hf and -y
options are no longer necessary, although they are supported for
backwards compatiblity.

Fix macro-levels so it actually count descent levels; a new
macro-tokens limit introduced for the actual token limit.

Slightly simplify the limits code.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2019-08-10 01:46:58 -07:00
H. Peter Anvin
ab6f831955 listing: when listing lines in macros and rep blocks, show the actual line
When printing lines coming from %rep blocks and macros, show the line
number corresponding to the line actually being printed.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2019-08-09 22:31:45 -07:00
H. Peter Anvin (Intel)
ad1f50a506 warnings.pl: remove one more instance of "scalar(%hash)"
scalar(%hash) is broken on old versions of Perl, use scalar(keys
%hash).

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-08-09 16:20:40 -07:00
H. Peter Anvin (Intel)
98031bfff4 preproc.c: make sure we have the correct token lengths
It turns out that in tokenize() we would sometimes truncate a token
string by inserting a NUL into the input string, expecting new_Token()
to pick it up using strlen(). With explicit lengths, that no longer
works, but there is a better solution anyway: instead of inserting
NUL characters, keep track of where the token actually ends and feed
the correct length to new_Token().

This triggered a buffer overflow in detoken(), add a debug level 2
assert for this condition. Use a relatively high debug level, because
strlen() is fairly expensive, and this is an extremely
performance-critical path.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-08-09 16:11:28 -07:00
H. Peter Anvin (Intel)
5067fde483 asm/nasm.c: make --debug=level actually work
Turns out there was no support for optional arguments at all.
[
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-08-09 16:10:17 -07:00
H. Peter Anvin (Intel)
fb118aecc5 obsolete: make the message clearer in the case of NEVER,!NOP
"instruction never implemented and removed from the target CPU"

... doesn't really make sense, so change it to ...

"instruction never implemented and invalid on the target CPU"

(still may seen redundant, but it is to distingush it from "and is a
noop on...")

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-08-09 15:01:28 -07:00
H. Peter Anvin (Intel)
5b39461178 obsolete handing: handle a few more subcases in a useful way
Distinguish instructions which have once been valid (OBSOLETE) from
those that never saw the light of day (NEVER). Futhermore, flag
instructions which devolve to an architectural noop from those with
undefined behavior and possibly recycled opcodes.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-08-09 14:52:16 -07:00
H. Peter Anvin (Intel)
fb11889040 BR 3392590: add warning for valid but obsolete instructions
Just becase one is compiling for an old CPU doesn't mean one wants to
use obsolete instructions that would not be forward compatible. Rename
the "obsolete" warning to "obsolete-removed" and create a new
"obsolete-valid" warning to go with it (-w[+-]obsolete controls both
options, as usual.)

Suggested-by: C. Masloch <pushbx@38.de>
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-08-09 14:21:42 -07:00
H. Peter Anvin (Intel)
d73b10abd5 warnings.pl: BR 3392585: don't use scalar(%hash)
The idiom scalar(%hash) seems similar to scalar(@array), and in fact
is in current versions of Perl. However, in older versions of Perl,
the former is totally useless:

       Prior to Perl 5.25 the value returned was a string consisting
       of the number of used buckets and the number of allocated
       buckets, separated by a slash.  This is pretty much useful only
       to find out whether Perl's internal hashing algorithm is
       performing poorly on your data set.  For example, you stick
       10,000 things in a hash, but evaluating %HASH in scalar context
       reveals "1/16", which means only one out of sixteen buckets has
       been touched, and presumably contains all 10,000 of your items.
       This isn't supposed to happen.

       As of Perl 5.25 the return was changed to be the count of keys
       in the hash. If you need access to the old behavior you can use
       "Hash::Util::bucket_ratio()" instead.

Use scalar(keys %hash) instead.

Reported-by: Orkan Sezer <sezeroz@gmail.com>
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-08-09 13:45:41 -07:00
H. Peter Anvin (Intel)
177a05d0ce perl files: clean up warnings
Clean up some perl warnings, some of which were legitimate (apparently
undef doesn't actually take a list of arguments, a common enough
mistake that it is mentioned in the man page!, and a list of variables
after "my" can be cantankerous), and some of which were nuisance but
were easy enough to clean up.

Maybe this can resolve the problems with very old version of Perl?

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-08-09 13:30:19 -07:00
H. Peter Anvin (Intel)
d6e817751e listing: add -L option for additional listing info
Add an -L option for additional listing information. Currently
supported is -Le, which emits each line after processing through the
preprocessor, and -Lm, which displays each single-line macro defined
or undefined.

NASM doesn't preserve the names of unused arguments, nor does it have
any technical reason to do so. Instead of adding complexity to save
them, make unnamed parameters official by specifying an empty string
in the argument list.

This has the additional advantage that () is now simply considered a
single empty argument, which means that NASM should now properly
handle things like:

%define myreg() eax
	mov edx,myreg()

... similar to how the C preprocessor allows an empty macro argument
list which is distinct from a macro with no arguments whatsoever.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-08-09 08:39:02 -07:00
H. Peter Anvin (Intel)
d66927a677 Diagnostics: make debug more dynamic, note -> info, add listmsg level
Make debug messages more dynamic by making it easy to conditionalize
the messages.

Change ERR_NOTE to ERR_INFO which reflects the usage better.  Other
compilers use note: for additional information.

Don't unwind the macro stack with ERR_HERE; it is only going to give
confusing results as it will unwind the wrong macro stack.

Add ERR_LISTMSG level which is *always* suppressed, but will still
appear in the list file.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-08-09 04:28:55 -07:00
H. Peter Anvin (Intel)
524918394d labels.c: don't use ERR_NOTE for additional information
ERR_NOTE just confuses things, especially in the case of a suppressed
warning.

The preprocessor doesn't use it for unwinding macros, either.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-08-09 03:55:29 -07:00
H. Peter Anvin (Intel)
b1e15f42fe Add implicitly sized versions of the K instructions
This allows the K instructions to be specified without a size suffix
as long as the operands are sized; this matches the way most other x86
instructions work. As this is not the syntax specified in the SDM,
don't use it for disassembly.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-08-09 02:44:46 -07:00
H. Peter Anvin (Intel)
1c21a53e4e preproc: fix parsing of single-line macro arguments, cleanups
The single-line macro argument parsing was completely broken as a
comma would not be recognized as an argument separator.

In the process of fixing this, make a fair bit of code cleanups.

Note: reverse tokens for smacro->expansion doesn't actually make any
sense anymore, might reconsider that.

This checkin also removes the distinction between "magic" and plain
smacros; the only difference is which specific expand method is being
invoked.

Finally, extend the allocating-string functions such that *all* the
allocating string functions support querying the length of the string
a posteori.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-08-09 02:34:21 -07:00
H. Peter Anvin (Intel)
875eb24b29 preproc.c: fix macro descent
We have to call expand_one_smacro() recursively, otherwise we will not
expand smacros which point to other smacros. We cannot simply do this
by looping after token pasting, because we need to make sure we don't
recursively expand the same smacro.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-08-07 17:14:35 -07:00
H. Peter Anvin
0d4d431a01 Merge empty reservations from TIMES; add counts in listings
For constructs like TIMES xx RESB yy merge the TIMES and RESB and feed
a single reservation to the backend; this can (obviously) be
dramatically faster.

Add byte count in listings for <incbin> and repeat count to <rept>; to
make them more reasonable in length shorten to <bin ...> and <rep ...>
respectively, and don't require leading zeroes in bin/rep/res count.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2019-08-07 00:59:24 -07:00
H. Peter Anvin (Intel)
77335213e3 assemble: shuffle a few assignments around
Shuffle around a few assignments which might help the compiler need to
spill less.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-08-06 23:22:48 -07:00
H. Peter Anvin (Intel)
41bb8a8114 Warn if trying to assemble obsolete instructions
Print a warning if one tries to assemble an obsolete instruction,
unless there is an exact match for the CPU directive.

For example:

	CPU 386
	POP CS		; Warning - obsolete instruction

	CPU 8086
	POP CS		; No warning

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-08-06 22:56:51 -07:00
H. Peter Anvin (Intel)
d13a6f9708 iflag.h: fix IF_CPU_LEVEL_MASK, add missing CPU definitions
Fix the definition of IF_CPU_LEVEL_MASK (which was missing the top
bit, IFM_ANY itself).

Add CPU definitions that we actually have into directiv.c.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-08-06 22:33:14 -07:00
H. Peter Anvin (Intel)
ca47c843ed warnings.pl: move comment
Move a comment to where it makes more sense.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-08-06 19:30:36 -07:00
H. Peter Anvin (Intel)
65c958d59f warnings.pl: warn on duplicate definition instead of broken output
Have warnings.pl give a warning(!) message if a warning definition is
found to be duplicated, including the location of both
definitions. Much better than silently creating bogus output.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-08-06 19:28:57 -07:00
H. Peter Anvin (Intel)
873ceee29f Replace nasm_error(ERR_WARNING|...) with nasm_warn()
Remove a few remaining instances of nasm_error(ERR_WARNING).

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-08-06 19:18:36 -07:00
H. Peter Anvin
959702baa8 asm/assemble.c: stylistic fix to bnd warning
Clean up the style for the bnd warning.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2019-06-06 20:56:50 -07:00
H. Peter Anvin
fdeb3b0d01 Add group aliases for all prefixed warnings.
For example, -w+float will now enable all warnings with names staring
with float-*.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2019-06-06 20:53:17 -07:00
H. Peter Anvin
db6960c3fa quote: improve comment
Explain why 0xfc + vb5 cannot overflow a byte value.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2019-06-06 17:32:44 -07:00
H. Peter Anvin
10d9589f02 quote: emit invalid UTF-8 rather than just dropping a strange value
If an UTF-8 value exceeds 0x7fffffff, there is no legitimate encoding
for it. However, using FE or FF as leading bytes provide at least some
kind of encoding. This is assembly, and the programmer is (almost?)
always right. It might be worthwhile to add a suppressible warning for
invalid UTF-8 strings in general, though, including any character >
0x10ffff, surrogates, or a string that is constructed by hand.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2019-06-06 17:28:51 -07:00
H. Peter Anvin
236f4a832b strfunc: better error messages if a string transform fails
Let the user know what string transform actually failed on them.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2019-06-06 17:17:16 -07:00
H. Peter Anvin
d4b20355d2 asm/quote.c: fix range cutoffs for UTF-8
The various UTF-8 byte cutoffs were off by a factor of 2. Fix.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2019-06-06 17:16:45 -07:00
H. Peter Anvin
4d7bf79ed7 quote.c: let nasm_skip_string() return NULL for a non-string
Returning NULL makes more sense than returning the initial pointer
(the only other sensible alternative would be to return a pointer the
final null character.)

This currently can't happen, as all callers to nasm_skip_string()
currently explicitly tests for an initial quote.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2019-06-06 16:29:52 -07:00
H. Peter Anvin
5282cea85b Merge branch 'master' of ssh://repo.or.cz/nasm
Resolved Conflicts:
	asm/preproc.c

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2019-06-06 16:26:22 -07:00
H. Peter Anvin
249c217070 quote: drop merging of adjacent strings; allow some control chars
The merging of adjacent ' or " strings really does nothing but
introduce gratuitous incompatiblities; drop it.

Allow *some* control characters (BEL BS TAB ESC) in
nasm_unquote_cstr().

The ` state machine can be greatly simplified by treating \0 as just
another character and let it terminate the string in appropriate
contexts, just like `. The only difference with ` is when it occurs
in state st_backslash: you can't escape the null character.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2019-06-06 16:21:01 -07:00
Chang S. Bae
fea22697e2 preproc: Fix the initial enum value in stdmac_ptr()
TOKEN_ID is from enum pp_token_type, but struct Type has enum
token_type. TOK_ID seems to be a matched one.s

Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
2019-06-02 23:51:25 +03:00
H. Peter Anvin (Intel)
a7afe276da preproc: factor out getting a line of tokens and detokenizing it
Split the code for getting a line of tokens from the code that sets
verror and detokenizes the resulting string.

While we are at it, merge the handling of EOF and ^Z into the general
loop in read_line().

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-04-26 00:34:04 -07:00
H. Peter Anvin (Intel)
41e9682efe preproc: massive cleanup of smacro expansion
The smacro expansion code was virtually impossible to understand, and
was leading to very strange failures. Clean it up, and do much better
handling of magic macros.  This should also allow for recursive
macros, but recursive macros are extremely tricky in that it is very
hard to keep them from recursing forever, unless there is at least one
argument which is never expanded. They are not currently implemented.

Even so, I believe token pasting makes it possible to create infinite
loops; e.g.:

%define foo foo %+

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-04-25 18:00:32 -07:00
H. Peter Anvin (Intel)
9bb55bd127 Merge branch 'evalmacro'
Resolved Conflicts:
	asm/preproc.c
	output/elf.h
	output/outelf.c
	output/outelf.h
	version

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-04-24 11:14:43 -07:00
H. Peter Anvin
bb42d30737 quote: disallow control characters in C strings; concatendate; cleanups
In nasm_unquote_cstr(), disallow any control character, not just
NUL. This will matter when allowing quoting symbols.

Merge nasm_unquote() and nasm_unquote_cstr().

Strings can now be concatenated, C style: adjacent quoted strings
(including whitespace-separated) are merged into a single string.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2019-04-22 14:34:22 -07:00
Cyrill Gorcunov
982186a1a3 preproc: Fix nil dereference on error paths
https://bugzilla.nasm.us/show_bug.cgi?id=3392562

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
2019-03-16 23:19:12 +03:00
H. Peter Anvin
8b26247442 preproc: add %i... variants, evaluated macro parameters, cleanups
All directives which create single-line macros now have %i... variants
to define case-insensitive versions. Case insensitive rather sucks,
but at least this way it is consistent.

Single-line macro parameters can now be evaluated as a number, as done
by %assign. To do so, declare a parameter starting with =, for
example:

%define foo(x,=y) mov [x],macro_array_y

... would evaluate y as a number but leave x as a string.

NOTE: it would arguably be better to have this as a per-instance
basis, but it is easily handled by having a secondary macro called
with the same argument twice.

Finally, add a more consistent method for defining "magic" macros,
which need to be evaluated at runtime. For now, it is only used by the
special macros __FILE__, __LINE__, __BITS__, __PTR__, and __PASS__.

__PTR__ is a new macro which evaluates to word, dword or qword
matching the value of __BITS__.

The magic macro framework, however, provides a natural hook for a
future plug-in infrastructure to hook into a scripting language.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2019-02-26 14:00:54 -08:00
H. Peter Anvin (Intel)
1df7263ae9 warnings: add [warning push] and [warning pop]
Add [warning push] and [warning pop] directives.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-01-11 13:13:03 -08:00
H. Peter Anvin (Intel)
38ddb19977 Warnings: move zeroing reserved space to a separate warning class
Zeroing reserved space in a progbits section really should be a
separate warning class, so it can be controlled independently.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
2019-01-11 12:27:02 -08:00
H. Peter Anvin
ef4f23d76a tokhash.pl: zero all the fields for a not-found token
Make sure we zero all the token fields if we don't find something in
the hash.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2018-12-29 20:14:50 -08:00
H. Peter Anvin
6a4f0b36c8 tokens.dat: TOKEN_SIZE sizes belong in inttwo, not in flags
TOKEN_SIZE size values ended up in the wrong place, which caused
parser errors due to being mistaken as flags.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2018-12-29 20:13:35 -08:00
H. Peter Anvin
8960e1bc83 Remove #includes already provided by "compiler.h"
"compiler.h" already includes a bunch of common include files. There
is absolutely no reason to duplicate them in individual files, and in
fact it robs us of central control of how these files are used.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2018-12-27 12:45:44 -08:00
H. Peter Anvin
c2f3f26015 Replace <ctype.h> includes with "nctype.h"
For almost everything we should use "nctype.h". Right now we don't
have a nasm_toupper() to use <ctype.h> for things that need toupper().

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2018-12-27 12:37:25 -08:00
H. Peter Anvin
bd2803964e Merge tag 'nasm-2.14.03rc1'
NASM 2.14.03rc1

Resolved Conflicts:
	asm/labels.c
	include/error.h

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2018-12-27 11:37:22 -08:00