Seems like Ghostscript has managed to break fontconfig support again,
at least in Fedora 30. Help Ghostscript along by giving it an explicit
font path.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
If a UTF-8 value exceeds 0x7fffffff, there is no legitimate encoding
for it. However, using FE or FF as leading bytes provides at least some
kind of encoding. This is assembly, and the programmer is (almost?)
always right. It might be worthwhile to add a suppressible warning for
invalid UTF-8 strings in general, though, including any character >
0x10ffff, surrogates, or a string that is constructed by hand.
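
As an illustration only, here is a minimal sketch of such an encoder.
It is not the NASM implementation: the name utf8_encode_ext() and the
choice of FE followed by six continuation bytes (enough for any 32-bit
value) are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: encode v into out[] (which must hold at least
 * 7 bytes) and return the number of bytes written. Values up to
 * 0x7fffffff use the original 1..6 byte UTF-8 forms; anything larger
 * gets an FE lead byte followed by six continuation bytes.
 */
size_t utf8_encode_ext(uint32_t v, unsigned char *out)
{
    size_t n, i;

    assert(out != NULL);

    if (v <= 0x7f) {
        out[0] = (unsigned char)v;
        return 1;
    }

    if (v <= 0x7ff)
        n = 2;
    else if (v <= 0xffff)
        n = 3;
    else if (v <= 0x1fffff)
        n = 4;
    else if (v <= 0x3ffffff)
        n = 5;
    else if (v <= 0x7fffffff)
        n = 6;
    else
        n = 7;                  /* no legitimate encoding: FE lead byte */

    /* Continuation bytes, last one first, 6 payload bits each */
    for (i = n - 1; i > 0; i--) {
        out[i] = (unsigned char)(0x80 | (v & 0x3f));
        v >>= 6;
    }

    /* Lead byte: n one-bits, a zero bit, then the remaining payload */
    out[0] = (unsigned char)(((0xff00u >> n) & 0xff) | v);
    return n;
}
```

A suppressible warning, as suggested above, could hook in where n is
set to 7 or where v falls in the surrogate range.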
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Returning NULL makes more sense than returning the initial pointer
(the only other sensible alternative would be to return a pointer to the
final null character.)
This currently can't happen, as all callers of nasm_skip_string()
explicitly test for an initial quote.
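
The contract described here can be sketched as below. This is a
simplified stand-in, not nasm_skip_string() itself: skip_quoted() is a
hypothetical name, it handles only ' and " quotes, and it does no
escape processing.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: if str starts with ' or ", return a pointer to
 * the matching closing quote, or to the terminating NUL if the string
 * is unterminated. If str does not start with a quote, return NULL
 * rather than the initial pointer.
 */
const char *skip_quoted(const char *str)
{
    char quote;

    assert(str != NULL);

    quote = *str;
    if (quote != '\'' && quote != '"')
        return NULL;            /* not a quoted string at all */

    for (str++; *str && *str != quote; str++)
        ;                       /* scan to closing quote or NUL */

    return str;
}
```

Returning NULL lets a caller distinguish "nothing to skip" from "empty
result" without comparing against the pointer it passed in.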
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The merging of adjacent ' or " strings really does nothing but
introduce gratuitous incompatibilities; drop it.
Allow *some* control characters (BEL BS TAB ESC) in
nasm_unquote_cstr().
The ` state machine can be greatly simplified by treating \0 as just
another character and letting it terminate the string in appropriate
contexts, just like `. The only difference with ` is when it occurs
in state st_backslash: you can't escape the null character.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Unfortunately, the code assumes that the section name is bounded to 65
characters, and has so far padded it with dashes up to that width. A
simple fix for this report:
https://bugzilla.nasm.us/show_bug.cgi?id=3392564
We may need to further clean up the hardcoded numbers used in decorating
the section info.
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
TOKEN_ID is from enum pp_token_type, but struct Type has enum
token_type. TOK_ID seems to be the matching one.
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Windows supports pathnames up to 32767 UTF-16 characters, but using
the standard interfaces only up to 260 characters. Wrap the functions
that take filenames on Windows.
Clean up the compatibility layers some more for reduced #ifdefs.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Split the code for getting a line of tokens from the code that sets
verror and detokenizes the resulting string.
While we are at it, merge the handling of EOF and ^Z into the general
loop in read_line().
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
The smacro expansion code was virtually impossible to understand, and
was leading to very strange failures. Clean it up, and do much better
handling of magic macros. This should also allow for recursive
macros, but recursive macros are extremely tricky in that it is very
hard to keep them from recursing forever, unless there is at least one
argument which is never expanded. They are not currently implemented.
Even so, I believe token pasting makes it possible to create infinite
loops; e.g.:
%define foo foo %+
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
In nasm_unquote_cstr(), disallow any control character, not just
NUL. This will matter when allowing quoting symbols.
Merge nasm_unquote() and nasm_unquote_cstr().
Strings can now be concatenated, C style: adjacent quoted strings
(including whitespace-separated) are merged into a single string.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
All directives which create single-line macros now have %i... variants
to define case-insensitive versions. Case insensitive rather sucks,
but at least this way it is consistent.
Single-line macro parameters can now be evaluated as a number, as done
by %assign. To do so, declare a parameter starting with =, for
example:
%define foo(x,=y) mov [x],macro_array_y
... would evaluate y as a number but leave x as a string.
NOTE: it would arguably be better to have this on a per-instance
basis, but it is easily handled by having a secondary macro called
with the same argument twice.
Finally, add a more consistent method for defining "magic" macros,
which need to be evaluated at runtime. For now, it is only used by the
special macros __FILE__, __LINE__, __BITS__, __PTR__, and __PASS__.
__PTR__ is a new macro which evaluates to word, dword or qword
matching the value of __BITS__.
The magic macro framework, however, provides a natural hook for a
future plug-in infrastructure to hook into a scripting language.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
note, preinit_array, init_array, and fini_array are ELF section types
that can matter to the assembly programmer.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Dead code elimination in ELF uses a separate ELF section for every
function or data item that may be garbage collected. This can end up
being more than 32,633 sections which, when the ELF internal and
relocation sections are added in, can exceed the legacy ELF maximum of
65,279 sections.
Newer versions of the ELF specification have added support for a much
larger number of sections by putting a placeholder value (usually
SHN_XINDEX == 0xffff, but 0 in some cases) into fields where the
section index is a 16-bit value, and storing the full value in a
different place: the ELF header uses entries in section header 0,
the symbol table uses an auxiliary section with the additional
indices; the section header did not need it as the sh_link field is
already 32 bits long.
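
For the file header fields, the placeholder scheme can be sketched as
follows. The reduced structs and the function names are assumptions
made for this example (a real reader would use the full Elf32/Elf64
header layouts); the rules themselves follow the ELF gABI: a section
count of 0 means the real count is in sh_size of section header 0, and
a string table index of SHN_XINDEX means the real index is in sh_link
of section header 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SHN_XINDEX 0xffff       /* escape value for 16-bit index fields */

/* Reduced stand-ins holding only the fields needed here */
struct ehdr_fields {
    uint16_t e_shnum;           /* section count, or 0 as placeholder */
    uint16_t e_shstrndx;        /* name table index, or SHN_XINDEX */
};
struct shdr0_fields {
    uint64_t sh_size;           /* real section count if e_shnum == 0 */
    uint32_t sh_link;           /* real index if e_shstrndx == SHN_XINDEX */
};

/* Assumes the file has a section header table at all (e_shoff != 0) */
uint64_t real_shnum(const struct ehdr_fields *eh,
                    const struct shdr0_fields *sh0)
{
    assert(eh != NULL && sh0 != NULL);
    return eh->e_shnum ? eh->e_shnum : sh0->sh_size;
}

uint32_t real_shstrndx(const struct ehdr_fields *eh,
                       const struct shdr0_fields *sh0)
{
    assert(eh != NULL && sh0 != NULL);
    return eh->e_shstrndx == SHN_XINDEX ? sh0->sh_link : eh->e_shstrndx;
}
```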
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The name "aux" is reserved on Windows platforms, a legacy from CP/M
via MS-DOS. Rename it to "helpers".
Turns out that that directory wasn't actually used properly, because
AC_CONFIG_AUX_DIR was never defined, and there was a redundant copy of
install-sh checked into the base of the source tree.
Reported-by: Ehsan Alem Mohammad Ghasemlou <e.ghasemloo@gmail.com>
NASM-Bugzilla: https://bugzilla.nasm.us/show_bug.cgi?id=3392560
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Use a hash table to look up sections by name, and an RAA to look up
sections by index; thus remove O(n) searches. This becomes important
since ELF uses sections for dead code elimination.
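
The two lookup paths can be sketched as below. This is not the NASM
code: all names are hypothetical, a fixed-size chained hash table
stands in for NASM's hash table, and a plain growable array stands in
for the RAA.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 64             /* arbitrary size for this sketch */

struct section {
    const char *name;           /* caller keeps the string alive */
    int index;
    struct section *hnext;      /* next entry in the same hash bucket */
};

struct section_table {
    struct section *bucket[NBUCKETS];   /* lookup by name */
    struct section **by_index;          /* lookup by index */
    int nsects, cap;
};

static unsigned hash_name(const char *s)
{
    unsigned h = 5381;
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h % NBUCKETS;
}

/* Average O(1): only one bucket chain is walked */
struct section *sect_find(const struct section_table *t, const char *name)
{
    struct section *s;

    assert(t != NULL && name != NULL);
    for (s = t->bucket[hash_name(name)]; s; s = s->hnext)
        if (!strcmp(s->name, name))
            return s;
    return NULL;
}

/* O(1): direct array access */
struct section *sect_by_index(const struct section_table *t, int index)
{
    assert(t != NULL);
    return (index >= 0 && index < t->nsects) ? t->by_index[index] : NULL;
}

/* Return the existing section of that name, or create a new one */
struct section *sect_add(struct section_table *t, const char *name)
{
    struct section *s = sect_find(t, name);
    unsigned h;

    if (s)
        return s;
    if (t->nsects == t->cap) {
        int ncap = t->cap ? t->cap * 2 : 16;
        struct section **n = realloc(t->by_index, ncap * sizeof *n);
        if (!n)
            return NULL;
        t->by_index = n;
        t->cap = ncap;
    }
    s = malloc(sizeof *s);
    if (!s)
        return NULL;
    s->name = name;
    s->index = t->nsects;
    h = hash_name(name);
    s->hnext = t->bucket[h];
    t->bucket[h] = s;
    t->by_index[t->nsects++] = s;
    return s;
}
```

A table must be zero-initialized before first use.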
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
sectalign on|off is documented to only affect the align/alignb
directives, *not* an explicit sectalign directive. This is fairly
obviously the proper behavior, so make it work accordingly.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Allow the alignb directive to be used in either a progbits or a nobits
section, by suppressing the zeroing warning.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Zeroing reserved space in a progbits section really should be a
separate warning class, so it can be controlled independently.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Add support for the "merge" attribute in ELF, along with the
associated "strings" and size specifier attributes.
Fix a few places where we used "int", but a larger type really ought
to have been used.
Be a bit more lax about respecifying attributes. For example, align=
can be respecified; the highest resulting value is used.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
TOKEN_SIZE size values ended up in the wrong place, which caused
parser errors due to being mistaken as flags.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
nasm_aprintf_size() does include the final NUL byte, but does not
include any prefix storage allocated by nasm_[v]axprintf().
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Move the inclusion of <strings.h> from nasmlib.h to compiler.h.
Try to centralize compiler dependences as much as possible.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
"compiler.h" already includes a bunch of common include files. There
is absolutely no reason to duplicate them in individual files, and in
fact it robs us of central control of how these files are used.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
For almost everything we should use "nctype.h". Right now we don't
have a nasm_toupper(), so use <ctype.h> for things that need toupper().
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
In BR 3392539, the error:
helloW.s:18: error: label `rurt' changed during code generation
[-w+error=label-redef-late]
... occurs a number of times after we have already issued an
error. This is because the erroring instruction computes to a
different size during code generation; this causes each subsequent
label to cause a phase error.
The phase error simply doesn't make much sense to report: if we are
already committed to erroring out, it is more likely an error cascade
rather than an error in its own right, so just suppress it in that
case.
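
The idea can be sketched as below. The struct and function names are
hypothetical and the real NASM error machinery is more involved; in
this sketch a reported phase error also counts as an error, so later
ones are suppressed as well.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical state: number of errors reported so far */
struct asm_state {
    int errors;
};

/*
 * Report a label whose value changed during code generation, unless
 * an error has already been issued. Returns true if it was reported.
 */
bool report_label_change(struct asm_state *st, const char *label)
{
    assert(st != NULL && label != NULL);

    if (st->errors > 0)
        return false;           /* already failing: likely a cascade */

    fprintf(stderr, "error: label `%s' changed during code generation\n",
            label);
    st->errors++;
    return true;
}
```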
Reported-by: <russvz@comcast.net>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>