Fix the printing of the macro stack: we need to follow the
mstk->next_active list, not mstk->next, and we need to reverse the
order so that the highest-level inclusion comes first.
Since this should be a rare or at least performance-insensitive
operation, do it using simple function recursion.
Finally, add an ellipsis before the "from macro" message; it greatly
enhances readability.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
It can be hard to find errors inside potentially nested macros.
Show the mmacro expansion stack when printing diagnostics.
Note that a list file doesn't help for errors that are detected
before the code-generation pass.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Issue a specific suppressible warning if we encounter the PTR keyword.
This usually indicates someone mistakenly using MASM syntax in NASM.
This introduces a generic infrastructure for issuing warnings for such
keywords.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
When a local label was seen, the debug backend would not receive the
full label name! In order to both simplify the code and avoid this
kind of discrepancy again, make both the output and debug format calls
from a common static function.
However, none of the current debug format backends want to see NASM
special symbols (that start with .. but not ..@) so filter those from
the debug backend.
Finally, fix an incorrect comment in nasm.h: the debug format is
called *after* the output format.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Jim Kukunas <james.t.kukunas@linux.intel.com>
The fix for BR 3392278:
aa29b1d93f assemble.c: Don't drop rex prefix from instruction itself
... would cause multiple REX prefixes to be emitted for some
instructions. Create a new flag to indicate that REX has already been
emitted, which can be cleared for each instance of an instruction.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
BR 3392275 complains about xmm0 having to be explicitly included in
the assembly syntax when it is implicit in the encoding. In the
interest of "be liberal in what you accept", accept either form in the
input.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Allow specifying {vex3} or {vex2} (the latter is currently always
redundant, unless we end up with instructions at some point can be
specified with legacy prefixes or VEX) to select a specific encoding
of VEX-encoded instructions.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Since the multi-line macro preprocessor is modified to expand
grouped parameter with braces. The escape character is not needed
any more.
The testcase converter script is also modified not to generate '\'.
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
Only generate a signed relocation if the displacement size is less
than the address size. This matters when involving address size
overrides.
It is technically impossible to do this one perfectly, because it is
never really knowable if the displacement offset is used as a base or
an index.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Reverted the redundant branch instruction patterns for bnd prefix.
And when a relaxed jmp instruction becomes a short (Jb) form,
bnd prefix is not needed because it does not initialize bnd registers.
So in that case, bnd prefix is silently dropped.
BND JMP foo -> drops bnd prefix
BND JMP short foo -> shows an explicit error
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
GAS uses *1 multiplier for explicitly marking an index register in mib operand.
e.g.) [rdx * 1 + 3] is equivalent to [3, rdx] in NASM's split EA format
So only for mib operands, this is encoded same as gas does.
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
MPX test asm files are added. These include all three different styles of
mib syntax (NASM, ICC and gas).
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
Mostly intended for the "mib" expressions in BNDLDX/BNDSTX.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
test/avx512*.asm files are now tested by using perfomtest.pl
Refer to pefomtest help message for the usage.
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
Added Prefetch (AVX-512PF) instructions.
These instructions are supported
if CPUID.(EAX=07H, ECX=0):EBX.AVX512PF[bit 26] = 1.
CPUID feature flag for PREFETCHWT1 is TBD
but PREFETCHWT1 is included in this commit.
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Added Exponential and Reciprocal (AVX-512ER) instructions.
These instructions are supported
if CPUID.(EAX=07H, ECX=0):EBX.AVX512ER[bit 27] = 1.
IF_AVX512 is now shared by all AVX-512* instructions as a bit mask.
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Added Conflict Detection (AVX-512CD) instructions.
These instructions are supported
if CPUID.(EAX=07H, ECX=0):EBX.AVX512CD[bit 28] = 1.
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Added three-operand pseudo-ops for VCMPPD, VPCMPD and so on.
Test case is also updated to validate them.
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Added K* instructions test cases in test/avx512f.asm.
The previous test case from GNU AS were repeating the same instruction twice,
so the repeated half part is removed.
Changed the python script (gas2nasm.py) to include opmask instructions.
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Fixed or purged some old comments and added a comment for a previous patch.
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
From gas testsuite file, a text file containing raw bytecodes
is useful when verifying the output of NASM.
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
This was converted from a gas testsuite.
(gas/testsuite/gas/i386/x86-64-avx512f-intel.d)
A python script that is used for converting is also included.
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
There are two instructions (VGATHERQPS, VPGATHERQD) where the only
separation between two forms is the vector length given to the vector
SIB. This means the *matcher* has to be able to distinguish
instructions by vector SIB length and the matcher only operates on the
operands and the instruction flags, not on the bytecode.
Export the vector index-ness into the operand flags and add to the
matcher.
This resolves BR 3392260.
Reported-by: Agner <agner@anger.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
The moffset opcodes A2 and A3 do not support HLE. Unfortunately
checkin
fb3f4e6d HLE: Change NOHLE to be an instruction flag
... inadvertently lost the NOHLE flag for opcode A2.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Try to implement the handling of MOVD as attempted in checkin:
70712c0df6
and reverted in:
d279fbbd80
due to BR3392199. This time make sure to use the SX flag to only
match when a size is explicitly given, and also don't duplicate the 0F
6F/7F opcodes, which are documented as MOVQ by AMD as well as Intel.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The way our matching system works we have to make NOHLE an instruction
flag rather than an byte code; by the time we run the byte code
interpreter we have already picked an instruction pattern once and for
all.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Clean up JMP/CALL patterns so they don't disassemble quite so uglily.
Fix a CALL pattern which would have incorrectly generated a (harmless)
REX.W prefix.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
It is more logical, it cleans up the code and it makes implicit
operand size override prefixes come out in the same order as explicit
ones instead of after all other prefixes.
Suggested-by: H. Peter Anvin <hpa@zytor.com>
The implicit operand size override code didn't set the operand size
prefix, which confused the size calculation code for the range check.
The BITS 64 operand size calculation is still off, but "fixing" it by
making it 32-bit unless REX.W is set breaks PUSH and maybe others.
calcsize() had the wrong criterion for when C5 prefixes are permitted
(REX.R is permitted, REX.X is forbidden.) assemble() had the right
test already. This caused symbol value errors.
The implicit operand size override code didn't set the operand size
prefix, which confused the size calculation code for the range check.
The BITS 64 operand size calculation is still off, but "fixing" it by
making it 32-bit unless REX.W is set breaks PUSH and maybe others.
Handle immediate-size optimization for "mov r64,imm" -- reduce it to
"mov r32,imm32" or "mov r64,imm32" as appropriate.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
H. Peter Anvin pointed
|
| Btw, test/imm64.asm needs test engine annotations.
|
Make it so.
Reported-by: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Two fixes:
1. Optimization of [bx+0xFFFF] etc
0xFFFF is an sbyte under 16-bit semantics,
so make sure to check it right.
2. Don't optimize displacements in -O0
Displacements that fit into an sbyte or
can be removed should *not* be optimized in -O0.
Implicit zero displacements are still optimized, e.g.:
[eax] -> 0 bit displacement, [ebp] -> 8 bit displacement.
However explicit displacements are not optimized:
[eax+0] -> 32 bit displacement, [ebp+0] -> 32 bit displacement.
Because #2 breaks compatibility with 0.98,
I introduced a new optimization level: -OL, legacy.
Under particular circumstances %strlen may cause SIGSEG. A typical
example is %strlen with nonexistent macro argument.
[ Testcase test/strlen.asm ]
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Allow non-identifier characters in the name of environment variables,
by surrounding them with string quotes (subject to ordinary
string-quoting rules.)
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Add the RD*SBASE, WR*SBASE and RDRAND instructions from version 7 of
the AVX specification, Intel document 319433-007.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The former changes have been committed to binutils.
From initial message:
|
| 2010-03-22 Quentin Neill <quentin.neill@amd.com>
| Sebastian Pop <sebastian.pop@amd.com>
|
| opcodes/
| * i386-dis.c (OP_LWP_I): Removed.
| (reg_table): Do not use OP_LWP_I, use Iq.
| (OP_LWPCB_E): Remove use of names16.
| (OP_LWP_E): Same.
| * i386-opc.tbl: Removed 16bit LWP insns. 32bit LWP insns
| should not set the Vex.length bit.
| * i386-tbl.h: Regenerated.
|
| gas/
| * testsuite/gas/i386/x86-64-lwp.s: Remove use of 16bit LWP insns.
| * testsuite/gas/i386/lwp.s: Same.
| * testsuite/gas/i386/x86-64-lwp.d: Updated.
| * testsuite/gas/i386/lwp.d: Updated.
|
So there is no 16 bit instructions anymore.
Also xop.l field should be set to 0.
Based on patch from nasm64developer
Reported-by: nasm64developer
Signed-off-by: nasm64developer
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Check if the offset and the representation are equivalent.
Disallow REL on absolute addresses.
I'm not sure what that would mean and the output formats don't support it.
Warn about ignored displacement size modifiers.
Unify the token-pasting code between the macro expansion and the
preprocessor parameter case. Parameterize whether or not to handle %+
tokens during expansion (%+ tokens have late binding semantics.)
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
"+" can be a separate token that ends up having to get pulled into the
middle of a floating-point constant. It's not even that strange.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Especially when token pasting involves floating-point numbers, we can
have some really strange effects from token pasting: for example,
pasting the two tokens "xyzzy" and "1e+10" ends up with *three*
tokens: "xyzzy1e" "+" "10". The easiest way to deal with this is to
explicitly combine the string and then run tokenize() on it.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>