It is more logical, it cleans up the code and it makes implicit
operand size override prefixes come out in the same order as explicit
ones instead of after all other prefixes.
Suggested-by: H. Peter Anvin <hpa@zytor.com>
The implicit operand size override code didn't set the operand size
prefix, which confused the size calculation code for the range check.
The BITS 64 operand size calculation is still off, but "fixing" it by
making it 32-bit unless REX.W is set breaks PUSH and maybe others.
calcsize() had the wrong criterion for when C5 prefixes are permitted
(REX.R is permitted, REX.X is forbidden.) assemble() had the right
test already. This caused symbol value errors.
The implicit operand size override code didn't set the operand size
prefix, which confused the size calculation code for the range check.
The BITS 64 operand size calculation is still off, but "fixing" it by
making it 32-bit unless REX.W is set breaks PUSH and maybe others.
Handle immediate-size optimization for "mov r64,imm" -- reduce it to
"mov r32,imm32" or "mov r64,imm32" as appropriate.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
H. Peter Anvin pointed
|
| Btw, test/imm64.asm needs test engine annotations.
|
Make it so.
Reported-by: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Two fixes:
1. Optimization of [bx+0xFFFF] etc
0xFFFF is an sbyte under 16-bit semantics,
so make sure to check it right.
2. Don't optimize displacements in -O0
Displacements that fit into an sbyte or
can be removed should *not* be optimized in -O0.
Implicit zero displacements are still optimized, e.g.:
[eax] -> 0 bit displacement, [ebp] -> 8 bit displacement.
However explicit displacements are not optimized:
[eax+0] -> 32 bit displacement, [ebp+0] -> 32 bit displacement.
Because #2 breaks compatibility with 0.98,
I introduced a new optimization level: -OL, legacy.
Under particular circumstances %strlen may cause SIGSEG. A typical
example is %strlen with nonexistent macro argument.
[ Testcase test/strlen.asm ]
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Allow non-identifier characters in the name of environment variables,
by surrounding them with string quotes (subject to ordinary
string-quoting rules.)
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Add the RD*SBASE, WR*SBASE and RDRAND instructions from version 7 of
the AVX specification, Intel document 319433-007.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The former changes have been committed to binutils.
From initial message:
|
| 2010-03-22 Quentin Neill <quentin.neill@amd.com>
| Sebastian Pop <sebastian.pop@amd.com>
|
| opcodes/
| * i386-dis.c (OP_LWP_I): Removed.
| (reg_table): Do not use OP_LWP_I, use Iq.
| (OP_LWPCB_E): Remove use of names16.
| (OP_LWP_E): Same.
| * i386-opc.tbl: Removed 16bit LWP insns. 32bit LWP insns
| should not set the Vex.length bit.
| * i386-tbl.h: Regenerated.
|
| gas/
| * testsuite/gas/i386/x86-64-lwp.s: Remove use of 16bit LWP insns.
| * testsuite/gas/i386/lwp.s: Same.
| * testsuite/gas/i386/x86-64-lwp.d: Updated.
| * testsuite/gas/i386/lwp.d: Updated.
|
So there is no 16 bit instructions anymore.
Also xop.l field should be set to 0.
Based on patch from nasm64developer
Reported-by: nasm64developer
Signed-off-by: nasm64developer
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Check if the offset and the representation are equivalent.
Disallow REL on absolute addresses.
I'm not sure what that would mean and the output formats don't support it.
Warn about ignored displacement size modifiers.
Unify the token-pasting code between the macro expansion and the
preprocessor parameter case. Parameterize whether or not to handle %+
tokens during expansion (%+ tokens have late binding semantics.)
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
"+" can be a separate token that ends up having to get pulled into the
middle of a floating-point constant. It's not even that strange.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Especially when token pasting involves floating-point numbers, we can
have some really strange effects from token pasting: for example,
pasting the two tokens "xyzzy" and "1e+10" ends up with *three*
tokens: "xyzzy1e" "+" "10". The easiest way to deal with this is to
explicitly combine the string and then run tokenize() on it.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Show an artificial case where we bind to the wrong symbol, due to the
confusion in the output system between the size of relative symbols
and their position.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Add a new opcode for 32->64 bit sign-extended immediate, with warning
on the number not matching.
This unfortunately calls for an audit of all the \4[0123] opcodes, if
they should be replaced by \25[4567]. This only replaces one
instruction (MOV reg64,imm32); other instructions need to be
considered.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
New opcodes to deal with 8-bit immediates which are then sign-extended
to the operand size. These allow us to warn appropriately.
Not sure I'm using these in all the proper places; need audit of all
uses of the \14..\17 opcodes.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
When there is an immediate in the instruction, a RIP-relative offset
may not be relative to the end of the offset itself, since it is
relative to the end of the *instruction*, not the end of the *offset*.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
More tests for immediate warnings, with notes for the ones where we
currently fail to do the right thing.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Using hidden files are rather antisocial, and rather pointless in this
particular context. Change .stdout and .stderr to simply stdout and
stderr.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Do a best attempt at a comprehensive test of the various CVT* SSE
instructions. This includes the bug of BR 2148476.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
There exists a fair bit of code out there which relies on the
optimizer in order to fit inside a predefined envelope. NASM 2.04rc4
breaks this; write a simple test to demonstrate.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Now that there is proper forward reference resolution,
we can get rid of this junk. Wiping the flags also
removed the SBYTEnn flags, causing
cmp eax, a-b
a: nop
b:
to assemble with -Ox like
cmp eax, strict dword -1
This is now fixed.
The testcase illustrates the problem. After "nasm -f obj
alonesym.nasm"
let's look to dump:
======
PUBDEF386(91) recnum:5, offset:0000005bh, len:03f9h, chksum:bbh(bb)
Group: 0, Seg: 1
00020000h - 'sym0000' Type:0
00020004h - 'sym0001' Type:0
....
00020134h - 'sym0077' Type:0
PUBDEF(90) recnum:6, offset:00000457h, len:000ah, chksum:b6h(b6)
Group: 0, Seg: 1
00000138h - 's' Type:2
0000b600h - '' Type:0
======
The problem is while 's' offset is 20138h it is marked as type 90h not
91h. The root cause is located in obj_x():
static ObjRecord *obj_x(ObjRecord * orp, uint32_t val)
{
if (orp->type & 1)
orp->x_size = 32;
if (val > 0xFFFF)
orp = obj_force(orp, 32);
if (orp->x_size == 32)
return (obj_dword(orp, val));
orp->x_size = 16;
return (obj_word(orp, val));
}
It sets up x_size and than writes data. In the testcase data are the
offset and this offset overflows a record. In this case the record is
emitted and its x_size is cleared. Because this is last PUBDEF the new
record with only 's' symbol is emitted also but its x_size is not 32
(it's still zero) so obj_fwrite doesn't switch to 91h type.
The problem seems to be very generic and expected to be occurred on
many other record types as well.
----
And the fix is simple:
if (orp->x_size == 32)
{
ObjRecord * nxt = obj_dword(orp, val);
nxt->x_size = 32; /* x_size is cleared when a record overflows */
return nxt;
}
BR 1974170: VCVTPD2PS, VCVTPD2DQ, VCVTTPD2DQ with a memory operand are
ambiguous without a specific operand size, so force one to be added.
Split the instruction pattern due to our current clunky handling of
MMX/XMM/YMM registers together with sizes. Fix in the future, please!
Support is4 bytes without meaningful information in the bottom bits.
This is equivalent to /is4=0 for the assembler, but makes the bottom
bits don't care for the disassembler.
Initial NDISASM support for AVX instructions and VEX prefixes. It
doesn't mean it's correct, but it seems to match my current
understanding. It can disassemble *some*, but not *all*, of the AVX
test cases (which are known to be at least partially incorrect...)
First cut at AVX machinery support. The only instruction implemented
is VPERMIL2PS, and it's probably buggy. I'm checking this in with the
hope that other people can start helping out with (a) testing this,
and (b) adding instructions.
NDISASM support is not there yet.
This checkin creates the following date and time macros:
__DATE__, __TIME__, __UTC_DATE__, __UTC_TIME__: strings
__DATE_NUM__, __TIME_NUM__, __UTC_DATE_NUM__, __UTC_TIME_NUM__:
civil dates in digit-string formats
__POSIX_TIME__: time in POSIX time_t format
Support the zero-operand form of floating-point instructions. Note
that in most cases, the form generated is actually the "popping" form,
e.g. "FADD" becomes "FADDP st0,st1". This is in accordance with the
Intel documentation. "FADDP" is also supported.
Un-special-case "xchg rax,rax"; allow it to be encoded as 48 90 for
orthogonality's sake. It's a no-op, to be sure, but so are many other
instructions.
"xchg eax,eax" is still special-cased in 64-bit mode since it is not a
no-op; unadorned opcode 90 is now simply "nop" and nothing else.
Make the disassembler detect unused REX.W and display them as an "o64"
prefix.
Correct the implementation of %arg and %local.
It's questionable how much they make sense for 64-bit mode; even in
32-bit mode one normally make references off the stack pointer instead
of the base pointer (frame pointer), but that requires keeping track
of the stack pointer offset.
Permit opcode names to be used as labels if and only if they are
succeeded by a colon. Opcode names occurring when parsing expressions
are all treated as labels; a leading colon occurred when parsing an
instruction forces a parser restart with the instruction forcibly
treated as an identifier.