mirror of
https://github.com/netwide-assembler/nasm.git
synced 2024-11-21 03:14:19 +08:00
doc: document preprocessor functions
Add documentation for preprocessor functions, as well as the flow of preprocessor expansion. Signed-off-by: H. Peter Anvin <hpa@zytor.com>
parent
3fe5b3f5a1
commit
392b2b18a0
@ -20,6 +20,10 @@ filename information anyway.
|
||||
\b Fix handling of MASM-syntax reserved memory (e.g. \c{dw ?}) when
|
||||
used in structure definitions.
|
||||
|
||||
\b The preprocessor now supports functions, which can be less verbose
|
||||
and more convenient than the equivalent code implemented using
|
||||
directives. See \k{ppfunc}.
|
||||
|
||||
|
||||
\S{cl-2.15.06} Version 2.15.06
|
||||
|
||||
|
302
doc/nasmdoc.src
@ -1,7 +1,7 @@
|
||||
\# --------------------------------------------------------------------------
|
||||
\#
|
||||
\# Copyright 1996-2022 The NASM Authors - All Rights Reserved
|
||||
\M{year}{1996-2020}
|
||||
\M{year}{1996-2022}
|
||||
\# See the file AUTHORS included with the NASM distribution for
|
||||
\# the specific copyright holders.
|
||||
\#
|
||||
@ -84,7 +84,8 @@
|
||||
\IR{-w} \c{-w} option
|
||||
\IR{-Z} \c{-Z} option
|
||||
\IR{!=} \c{!=} operator
|
||||
\IR{$, here} \c{$}, Here token
|
||||
\IR{$, here} \c{$}, current address
|
||||
\IR{$, here} here token
|
||||
\IR{$, prefix} \c{$}, prefix
|
||||
\IR{$$} \c{$$} token
|
||||
\IR{%} \c{%} operator
|
||||
@ -118,7 +119,6 @@
|
||||
\IR{^^} \c{^^} operator
|
||||
\IR{|} \c{|} operator
|
||||
\IR{||} \c{||} operator
|
||||
\IR{~} \c{~} operator
|
||||
\IR{%$} \c{%$} and \c{%$$} prefixes
|
||||
\IA{%$$}{%$}
|
||||
\IR{+ opaddition} \c{+} operator, binary
|
||||
@ -127,6 +127,8 @@
|
||||
\IR{- opsubtraction} \c{-} operator, binary
|
||||
\IR{- opunary} \c{-} operator, unary
|
||||
\IR{! opunary} \c{!} operator
|
||||
\IA{~}{~ opunary}
|
||||
\IR{~ opunary} \c{~} operator
|
||||
\IA{A16}{a16}
|
||||
\IA{A32}{a32}
|
||||
\IA{A64}{a64}
|
||||
@ -153,12 +155,16 @@ variables
|
||||
\IR{c calling convention} C calling convention
|
||||
\IR{c symbol names} C symbol names
|
||||
\IA{critical expressions}{critical expression}
|
||||
\IA{command line}{command-line}
|
||||
\IA{command-line}{command line}
|
||||
\IA{comments}{comment}
|
||||
\IR{ccomment} comment, ending in \c{\\}
|
||||
\IA{case sensitivity}{case sensitive}
|
||||
\IA{case-sensitive}{case sensitive}
|
||||
\IA{case-insensitive}{case sensitive}
|
||||
\IA{character constants}{character constant}
|
||||
\IR{codeview debugging format} CodeView debugging format
|
||||
\IR{continuation line} continuation line
|
||||
\IR{continuation line} preprocessor, continuation line
|
||||
\IR{common object file format} Common Object File Format
|
||||
\IR{common variables, alignment in elf} common variables, alignment in ELF
|
||||
\IR{common, elf extensions to} \c{COMMON}, ELF extensions to
|
||||
@ -170,8 +176,8 @@ variables
|
||||
\IR{dll symbols, exporting} DLL symbols, exporting
|
||||
\IR{dll symbols, importing} DLL symbols, importing
|
||||
\IR{dos} DOS
|
||||
\IA{effective address}{effective addresses}
|
||||
\IA{effective-address}{effective addresses}
|
||||
\IA{effective addresses}{effective address}
|
||||
\IA{effective-address}{effective address}
|
||||
\IR{elf} ELF
|
||||
\IR{elf, 16-bit code} ELF, 16-bit code
|
||||
\IR{elf, debug formats} ELF, debug formats
|
||||
@ -241,9 +247,13 @@ variables
|
||||
\IR{plt} PLT
|
||||
\IR{plt} \c{PLT} relocations
|
||||
\IA{pre-defining macros}{pre-define}
|
||||
\IA{preprocessor expressions}{preprocessor, expressions}
|
||||
\IA{preprocessor loops}{preprocessor, loops}
|
||||
\IA{preprocessor variables}{preprocessor, variables}
|
||||
\IR{preprocessor conditionals} preprocessor, conditionals
|
||||
\IR{preprocessor expansions} preprocessor, expansions
|
||||
\IR{preprocessor expressions} preprocessor, expressions
|
||||
\IR{preprocessor loops} preprocessor, loops
|
||||
\IR{preprocessor variables} preprocessor, variables
|
||||
\IR{preprocessor variables} variables, preprocessor
|
||||
\IA{comments}{comment}
|
||||
\IR{relocations, pic-specific} relocations, PIC-specific
|
||||
\IA{repeating}{repeating code}
|
||||
\IR{section alignment, in elf} section alignment, in ELF
|
||||
@ -1164,9 +1174,9 @@ is a macro, a preprocessor directive or an assembler directive: see
|
||||
\c label: instruction operands ; comment
|
||||
|
||||
As usual, most of these fields are optional; the presence or absence
|
||||
of any combination of a label, an instruction and a comment is allowed.
|
||||
Of course, the operand field is either required or forbidden by the
|
||||
presence and nature of the instruction field.
|
||||
of any combination of a label, an instruction and a \i{comment} is
|
||||
allowed. Of course, the operand field is either required or forbidden
|
||||
by the presence and nature of the instruction field.
|
||||
|
||||
NASM uses backslash (\\) as the line continuation character; if a line
|
||||
ends with backslash, the next line is considered to be a part of the
|
||||
@ -2166,10 +2176,23 @@ NASM contains a powerful \i{macro processor}, which supports
|
||||
conditional assembly, multi-level file inclusion, two forms of macro
|
||||
(single-line and multi-line), and a `context stack' mechanism for
|
||||
extra macro power. Preprocessor directives all begin with a \c{%}
|
||||
sign.
|
||||
sign. As a result, some care needs to be taken when using the \c{%}
|
||||
arithmetic operator to avoid it being confused with a preprocessor
|
||||
directive; it is recommended that it always be surrounded by
|
||||
whitespace.
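
As a small illustration of this recommendation (the operand values are
arbitrary), the following keeps whitespace on both sides of \c{%} so
that it is read as the modulo operator rather than as the start of a
preprocessor directive:

\c         mov ax, 17 % 5          ; modulo operator: same as mov ax, 2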
|
||||
|
||||
The preprocessor collapses all lines which end with a backslash (\\)
|
||||
character into a single line. Thus:
|
||||
The NASM preprocessor borrows concepts from both the C preprocessor
|
||||
and the macro facilities of many other assemblers.
|
||||
|
||||
\H{pcsteps} \i{Preprocessor Expansions}
|
||||
|
||||
The input to the preprocessor is expanded in the following ways in the
|
||||
order specified here.
|
||||
|
||||
\S{pcbackslash} \i{Continuation Line} Collapsing
|
||||
|
||||
The preprocessor first collapses all lines which end with a backslash
|
||||
(\c{\\}) character into a single line. Thus:
|
||||
|
||||
\c %define THIS_VERY_LONG_MACRO_NAME_IS_DEFINED_TO \\
|
||||
\c THIS_VALUE
|
||||
@ -2177,8 +2200,122 @@ character into a single line. Thus:
|
||||
will work like a single-line macro without the backslash-newline
|
||||
sequence.
|
||||
|
||||
\IR{comment removal} comment, removal
|
||||
\IR{comment removal} preprocessor, comment removal
|
||||
|
||||
\S{pccomment} \i{Comment Removal}
|
||||
|
||||
After concatenation, comments are removed.
|
||||
\I{comment, syntax}\i{Comments}
|
||||
begin with the character \c{;} unless contained
|
||||
inside a quoted string or a handful of other special contexts.
|
||||
|
||||
\I{ccomment}Note that this is applied \e{after} \i{continuation lines}
|
||||
are collapsed. This means that
|
||||
|
||||
\c add al,'\\' ; Add the ASCII code for \\
|
||||
\c mov [ecx],al ; Save the character
|
||||
|
||||
will probably not do what you expect, as the second line will be
|
||||
considered part of the preceding comment. Although this behavior is
|
||||
sometimes confusing, it has been the behavior of NASM since the very
|
||||
first version as well as the behavior of the C preprocessor.
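
One way to avoid the problem, sketched here as a suggestion, is to
word the comment so that the line no longer ends in a backslash:

\c         add al,'\\'     ; Add the ASCII code for a backslash
\c         mov [ecx],al    ; Save the character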
|
||||
|
||||
|
||||
\S{pcline}\i\c{%line} directives
|
||||
|
||||
In this step, \i\c{%line} directives are processed. See \k{line}.
|
||||
|
||||
|
||||
\S{pccond}\I{preprocessor conditionals}\I{preprocessor loops}
|
||||
Conditionals, Loops and \i{Multi-Line Macro} Definitions
|
||||
|
||||
In this step, the following \i{preprocessor directives} are processed:
|
||||
|
||||
\b \i{Multi-line macro} definitions, specified by the \i\c{%macro} and
|
||||
\i\c{%imacro} directives. The body of a multi-line macro is stored and
|
||||
is not further expanded at this time. See \k{mlmacro}.
|
||||
|
||||
\b \i{Conditional assembly}, specified by the \i\c{%if} family of preprocessor
|
||||
directives. Disabled parts of the source code are discarded and are not
|
||||
further expanded. See \k{condasm}.
|
||||
|
||||
\b \i{Preprocessor loops}, specified by the \i\c{%rep} preprocessor
|
||||
directive. A preprocessor loop is very similar to a multi-line macro
|
||||
and as such the body is stored and is not further expanded at this
|
||||
time. See \k{rep}.
|
||||
|
||||
These constructs are required to be balanced, so that the ending of a
|
||||
block can be detected, but no further processing is done at this time;
|
||||
stored blocks will be inserted at this step when they are expanded
|
||||
(see below.)
|
||||
|
||||
It is specific to each directive to what extent \i{inline expansions}
|
||||
and \i{detokenization} are performed for the arguments of the
|
||||
directives.
|
||||
|
||||
|
||||
\S{pcsmacro} \i{Inline expansions} and other \I{preprocessor directives}directives
|
||||
|
||||
In this step, the following expansions are performed on each line:
|
||||
|
||||
\b \i{Single-line macros} are expanded. See \k{slmacro}.
|
||||
|
||||
\b \i{Preprocessor functions} are expanded. See \k{ppfunc}.
|
||||
|
||||
\b If this line is the result of \i{multi-line macro} expansions (see
|
||||
below), the parameters to that macro are expanded at this time. See
|
||||
\k{mlmacro}.
|
||||
|
||||
\b \i{Macro indirection}, using the \i\c{%[]} construct, is expanded. See
|
||||
\k{indmacro}.
|
||||
|
||||
\b Token \i{concatenation} using either the \i\c{%+} operator (see
|
||||
\k{concat%+}) or implicitly (see \k{indmacro} and \k{concat}.)
|
||||
|
||||
\b \i{Macro-local labels} are converted into unique strings. See
|
||||
\k{maclocal}.
|
||||
|
||||
\b Remaining preprocessor \i{directives} are processed. It is specific
|
||||
to each directive to what extent the above expansions or the ones
|
||||
specified in \k{pcfinal} are performed on their arguments.
|
||||
|
||||
|
||||
\S{pcmmacro} \i{Multi-Line Macro Expansion}
|
||||
|
||||
In this step, \i{multi-line macros} are expanded into new lines of
|
||||
source, like the typical macro feature of many other assemblers. See
|
||||
\k{mlmacro}.
|
||||
|
||||
After expansion, the newly injected lines of source are processed
|
||||
starting with the step defined in \k{pccond}.
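
As a minimal sketch of this re-processing (the macro name
\c{maybe_nop} is made up for the illustration), a conditional stored
inside a multi-line macro body is only evaluated when the macro is
expanded, once for each invocation:

\c %macro maybe_nop 1
\c   %if %1
\c         nop
\c   %endif
\c %endmacro
\c         maybe_nop 1             ; the %if is true: one nop is emitted
\c         maybe_nop 0             ; the %if is false: nothing is emitted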
|
||||
|
||||
|
||||
\S{pcfinal} \i{Detokenization}
|
||||
|
||||
In this step, the final line of source code is produced. It performs
|
||||
the following operations:
|
||||
|
||||
\b Environment variables specified using the \i\c{%!} construct are
|
||||
expanded. See \k{getenv}.
|
||||
|
||||
\b \i{Context-local labels} are expanded into unique strings. See
|
||||
\k{ctxlocal}.
|
||||
|
||||
\b All tokens are converted to their text representation. Unlike the C
|
||||
preprocessor, the NASM preprocessor does not insert whitespace between
|
||||
adjacent tokens unless present in the source code. See \k{concat}.
|
||||
|
||||
The resulting line of text either is sent to the assembler, or, if
|
||||
running in preprocessor-only mode, to the output file (see \k{opt-E});
|
||||
if necessary prefixed by a newly inserted \i\c{%line} directive.
|
||||
|
||||
|
||||
\H{slmacro} \i{Single-Line Macros}
|
||||
|
||||
Single-line macros are expanded inline, much like macros in the C
|
||||
preprocessor.
|
||||
|
||||
\S{define} The Normal Way: \I\c{%idefine}\i\c{%define}
|
||||
|
||||
Single-line macros are defined using the \c{%define} preprocessor
|
||||
@ -2528,6 +2665,8 @@ The expression passed to \c{%assign} is a \i{critical expression}
|
||||
a relocatable reference such as a code or data address, or anything
|
||||
involving a register).
|
||||
|
||||
See also the \i\c{%eval()} preprocessor function, \k{f_eval}.
|
||||
|
||||
|
||||
\S{defstr} Defining Strings: \I\c{%idefstr}\i\c{%defstr}
|
||||
|
||||
@ -2549,6 +2688,8 @@ This can be used, for example, with the \c{%!} construct (see
|
||||
|
||||
\c %defstr PATH %!PATH ; The operating system PATH variable
|
||||
|
||||
See also the \i\c{%str()} preprocessor function, \k{f_str}.
|
||||
|
||||
|
||||
\S{deftok} Defining Tokens: \I\c{%ideftok}\i\c{%deftok}
|
||||
|
||||
@ -2564,6 +2705,8 @@ is equivalent to
|
||||
|
||||
\c %define test TEST
|
||||
|
||||
See also the \i\c{%tok()} preprocessor function, \k{f_tok}.
|
||||
|
||||
|
||||
\S{defalias} Defining Aliases: \I\c{%idefalias}\i\c{%defalias}
|
||||
|
||||
@ -2628,6 +2771,9 @@ or a numeric value) to a single-line macro. When producing a string
|
||||
value, it may change the style of quoting of the input string or
|
||||
strings, and possibly use \c{\\}-escapes inside \c{`}-quoted strings.
|
||||
|
||||
These directives are also available as \i{preprocessor functions}, see
|
||||
\k{ppfunc}.
|
||||
|
||||
\S{strcat} \i{Concatenating Strings}: \i\c{%strcat}
|
||||
|
||||
The \c{%strcat} operator concatenates quoted strings and assign them to
|
||||
@ -2646,6 +2792,9 @@ Similarly:
|
||||
|
||||
The use of commas to separate strings is permitted but optional.
|
||||
|
||||
The corresponding preprocessor function is \c{%strcat()}, see
|
||||
\k{f_strcat}.
|
||||
|
||||
|
||||
\S{strlen} \i{String Length}: \i\c{%strlen}
|
||||
|
||||
@ -2665,6 +2814,9 @@ macro that expands to a string, as in the following example:
|
||||
As in the first case, this would result in \c{charcnt} being
|
||||
assigned the value of 9.
|
||||
|
||||
The corresponding preprocessor function is \c{%strlen()}, see
|
||||
\k{f_strlen}.
|
||||
|
||||
|
||||
\S{substr} \i{Extracting Substrings}: \i\c{%substr}
|
||||
|
||||
@ -2689,11 +2841,126 @@ values out of range result in an empty string. A negative length
|
||||
means "until N-1 characters before the end of string", i.e. \c{-1}
|
||||
means until end of string, \c{-2} until one character before, etc.
|
||||
|
||||
The corresponding preprocessor function is \c{%substr()}, see
|
||||
\k{f_substr}.
|
||||
|
||||
|
||||
\H{ppfunc} \i{Preprocessor Functions}
|
||||
|
||||
Preprocessor functions are, fundamentally, a kind of built-in
|
||||
single-line macro. Each expands to a string depending on its
|
||||
arguments, and can be used in any context where single-line macro
|
||||
expansion would be performed. Preprocessor functions were introduced
|
||||
in NASM 2.16.
|
||||
|
||||
\S{f_eval} \i\c{%eval()} Function
|
||||
|
||||
The \c{%eval()} function evaluates its argument as a numeric
|
||||
expression in much the same way the \i\c{%assign} directive would, see
|
||||
\k{assign}. Unlike \c{%assign}, \c{%eval()} supports more than one
|
||||
argument; if more than one argument is specified, it is expanded to a
|
||||
comma-separated list of values.
|
||||
|
||||
\c %assign a 2
|
||||
\c %assign b 3
|
||||
\c %defstr what %eval(a+b,a*b) ; equivalent to %define what "5,6"
|
||||
|
||||
The expressions passed to \c{%eval()} are \i{critical expressions},
|
||||
see \k{crit}.
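
A short sketch of inline use, assuming NASM 2.16 or later; the names
\c{width} and \c{height} are invented for the example. Because
\c{%eval()} expands to numeric values, the results can be placed
directly in an operand list:

\c %assign width  4
\c %assign height 3
\c         db %eval(width*height)          ; expands to: db 12
\c         dw %eval(width+1,height+1)      ; expands to: dw 5,4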
|
||||
|
||||
|
||||
\S{f_is} \i\c{%is()} Family Functions
|
||||
|
||||
Each \i\c{%if} family directive (see \k{condasm}) has an equivalent
|
||||
\c{%is()} family function, that expands to \c{1} if the equivalent
|
||||
\c{%if} directive would process as true, and \c{0} if the equivalent
|
||||
\c{%if} directive would process as false.
|
||||
|
||||
\c ; Instead of !%isidn() could have used %isnidn()
|
||||
\c %if %isdef(foo) && !%isidn(foo,bar)
|
||||
\c db "foo is defined, but not as 'bar'"
|
||||
\c %endif
|
||||
|
||||
Note that, being functions, the arguments (before expansion) will
|
||||
always need to have balanced parentheses so that the end of the
|
||||
argument list can be defined. This means that the syntax of
|
||||
e.g. \c{%istoken()} and \c{%isidn()} is somewhat stricter than their
|
||||
corresponding \c{%if} directives; it may be necessary to escape the
|
||||
argument to the conditional using \c{\{\}}:
|
||||
|
||||
\c ; Instead of !%isidn() could have used %isnidn()
|
||||
\c %if %isdef(foo) && !%isidn({foo,)})
|
||||
\c db "foo is defined, but not as ')'"
|
||||
\c %endif
|
||||
|
||||
|
||||
\S{f_str} \i\c{%str()} Function
|
||||
|
||||
The \c{%str()} function converts its argument, including any commas,
|
||||
to a quoted string, similar to the way the \i\c{%defstr} directive
|
||||
would, see \k{defstr}.
|
||||
|
||||
Being a function, the argument will need to have balanced parentheses
|
||||
or be escaped using \c{\{\}}.
|
||||
|
||||
\c ; The following lines are all equivalent
|
||||
\c %define test 'TEST'
|
||||
\c %defstr test TEST
|
||||
\c %define test %str(TEST)
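
Since \c{%str()} is expanded inline, it can also be used outside a
\c{%define}. A minimal sketch, assuming NASM 2.16 or later (the label
\c{greeting} is made up for the illustration):

\c greeting: db %str(Hello), 0     ; equivalent to: db 'Hello', 0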
|
||||
|
||||
|
||||
\S{f_strcat} \i\c{%strcat()} Function
|
||||
|
||||
The \c{%strcat()} function concatenates a list of quoted strings, in
|
||||
the same way the \i\c{%strcat} directive would, see \k{strcat}.
|
||||
|
||||
\c ; The following lines are all equivalent
|
||||
\c %define alpha 'Alpha: 12" screen'
|
||||
\c %strcat alpha "Alpha: ", '12" screen'
|
||||
\c %define alpha %strcat("Alpha: ", '12" screen')
|
||||
|
||||
|
||||
\S{f_strlen} \i\c{%strlen()} Function
|
||||
|
||||
The \c{%strlen()} function expands to the length of a quoted string,
|
||||
in the same way the \i\c{%strlen} directive would, see \k{strlen}.
|
||||
|
||||
\c ; The following lines are all equivalent
|
||||
\c %define charcnt 9
|
||||
\c %strlen charcnt 'my string'
|
||||
\c %define charcnt %strlen('my string')
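
A sketch of inline use, assuming NASM 2.16 or later; the macro name
\c{msg} and the label \c{pstring} are invented for the example. It
emits a length-prefixed string without a separate \c{%strlen}
directive:

\c %define msg 'my string'
\c pstring: db %strlen(msg), msg   ; length byte (9), then the nine characters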
|
||||
|
||||
|
||||
\S{f_substr} \i\c{%substr()} Function
|
||||
|
||||
The \c{%substr()} function extracts a substring of a quoted string, in
|
||||
the same way the \i\c{%substr} directive would, see \k{substr}. Note
|
||||
that unlike the \c{%substr} directive, a comma is required after the
|
||||
string argument.
|
||||
|
||||
\c ; The following lines are all equivalent
|
||||
\c %define mychar 'yzw'
|
||||
\c %substr mychar 'xyzw' 2,-1
|
||||
\c %define mychar %substr('xyzw',2,-1)
|
||||
|
||||
|
||||
\S{f_tok} \i\c{%tok()} Function
|
||||
|
||||
The \c{%tok()} function converts a quoted string into a sequence of
|
||||
tokens, in the same way the \i\c{%deftok} directive would, see
|
||||
\k{deftok}.
|
||||
|
||||
\c ; The following lines are all equivalent
|
||||
\c %define test TEST
|
||||
\c %deftok test 'TEST'
|
||||
\c %define test %tok('TEST')
|
||||
|
||||
|
||||
\H{mlmacro} \i{Multi-Line Macros}: \I\c{%imacro}\i\c{%macro}
|
||||
|
||||
Multi-line macros are much more like the type of macro seen in MASM
|
||||
and TASM: a multi-line macro definition in NASM looks something like
|
||||
Multi-line macros are much like the type of macro seen in MASM
|
||||
and TASM, and expand to a new set of lines of source code.
|
||||
A multi-line macro definition in NASM looks something like
|
||||
this.
|
||||
|
||||
\c %macro prologue 1
|
||||
@ -4614,6 +4881,7 @@ It is still possible to turn in on again by
|
||||
Note that \c{SECTALIGN <ON|OFF>} affects only the \c{ALIGN}/\c{ALIGNB} directives,
|
||||
not an explicit \c{SECTALIGN} directive.
|
||||
|
||||
|
||||
\C{macropkg} \i{Standard Macro Packages}
|
||||
|
||||
The \i\c{%use} directive (see \k{use}) includes one of the standard
|
||||
|