Show an artificial case where we bind to the wrong symbol, due to the
confusion in the output system between the size of relative symbols
and their position.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Add a new opcode for 32->64 bit sign-extended immediate, with warning
on the number not matching.
This unfortunately calls for an audit of all the \4[0123] opcodes, if
they should be replaced by \25[4567]. This only replaces one
instruction (MOV reg64,imm32); other instructions need to be
considered.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
New opcodes to deal with 8-bit immediates which are then sign-extended
to the operand size. These allow us to warn appropriately.
Not sure I'm using these in all the proper places; need audit of all
uses of the \14..\17 opcodes.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
When there is an immediate in the instruction, a RIP-relative offset
may not be relative to the end of the offset itself, since it is
relative to the end of the *instruction*, not the end of the *offset*.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
More tests for immediate warnings, with notes for the ones where we
currently fail to do the right thing.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Using hidden files are rather antisocial, and rather pointless in this
particular context. Change .stdout and .stderr to simply stdout and
stderr.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Do a best attempt at a comprehensive test of the various CVT* SSE
instructions. This includes the bug of BR 2148476.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
There exists a fair bit of code out there which relies on the
optimizer in order to fit inside a predefined envelope. NASM 2.04rc4
breaks this; write a simple test to demonstrate.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Now that there is proper forward reference resolution,
we can get rid of this junk. Wiping the flags also
removed the SBYTEnn flags, causing
cmp eax, a-b
a: nop
b:
to assemble with -Ox like
cmp eax, strict dword -1
This is now fixed.
The testcase illustrates the problem. After "nasm -f obj
alonesym.nasm"
let's look to dump:
======
PUBDEF386(91) recnum:5, offset:0000005bh, len:03f9h, chksum:bbh(bb)
Group: 0, Seg: 1
00020000h - 'sym0000' Type:0
00020004h - 'sym0001' Type:0
....
00020134h - 'sym0077' Type:0
PUBDEF(90) recnum:6, offset:00000457h, len:000ah, chksum:b6h(b6)
Group: 0, Seg: 1
00000138h - 's' Type:2
0000b600h - '' Type:0
======
The problem is while 's' offset is 20138h it is marked as type 90h not
91h. The root cause is located in obj_x():
static ObjRecord *obj_x(ObjRecord * orp, uint32_t val)
{
if (orp->type & 1)
orp->x_size = 32;
if (val > 0xFFFF)
orp = obj_force(orp, 32);
if (orp->x_size == 32)
return (obj_dword(orp, val));
orp->x_size = 16;
return (obj_word(orp, val));
}
It sets up x_size and than writes data. In the testcase data are the
offset and this offset overflows a record. In this case the record is
emitted and its x_size is cleared. Because this is last PUBDEF the new
record with only 's' symbol is emitted also but its x_size is not 32
(it's still zero) so obj_fwrite doesn't switch to 91h type.
The problem seems to be very generic and expected to be occurred on
many other record types as well.
----
And the fix is simple:
if (orp->x_size == 32)
{
ObjRecord * nxt = obj_dword(orp, val);
nxt->x_size = 32; /* x_size is cleared when a record overflows */
return nxt;
}
BR 1974170: VCVTPD2PS, VCVTPD2DQ, VCVTTPD2DQ with a memory operand are
ambiguous without a specific operand size, so force one to be added.
Split the instruction pattern due to our current clunky handling of
MMX/XMM/YMM registers together with sizes. Fix in the future, please!
Support is4 bytes without meaningful information in the bottom bits.
This is equivalent to /is4=0 for the assembler, but makes the bottom
bits don't care for the disassembler.
Initial NDISASM support for AVX instructions and VEX prefixes. It
doesn't mean it's correct, but it seems to match my current
understanding. It can disassemble *some*, but not *all*, of the AVX
test cases (which are known to be at least partially incorrect...)
First cut at AVX machinery support. The only instruction implemented
is VPERMIL2PS, and it's probably buggy. I'm checking this in with the
hope that other people can start helping out with (a) testing this,
and (b) adding instructions.
NDISASM support is not there yet.
This checkin creates the following date and time macros:
__DATE__, __TIME__, __UTC_DATE__, __UTC_TIME__: strings
__DATE_NUM__, __TIME_NUM__, __UTC_DATE_NUM__, __UTC_TIME_NUM__:
civil dates in digit-string formats
__POSIX_TIME__: time in POSIX time_t format
Support the zero-operand form of floating-point instructions. Note
that in most cases, the form generated is actually the "popping" form,
e.g. "FADD" becomes "FADDP st0,st1". This is in accordance with the
Intel documentation. "FADDP" is also supported.
Un-special-case "xchg rax,rax"; allow it to be encoded as 48 90 for
orthogonality's sake. It's a no-op, to be sure, but so are many other
instructions.
"xchg eax,eax" is still special-cased in 64-bit mode since it is not a
no-op; unadorned opcode 90 is now simply "nop" and nothing else.
Make the disassembler detect unused REX.W and display them as an "o64"
prefix.
Correct the implementation of %arg and %local.
It's questionable how much they make sense for 64-bit mode; even in
32-bit mode one normally make references off the stack pointer instead
of the base pointer (frame pointer), but that requires keeping track
of the stack pointer offset.
Permit opcode names to be used as labels if and only if they are
succeeded by a colon. Opcode names occurring when parsing expressions
are all treated as labels; a leading colon occurred when parsing an
instruction forces a parser restart with the instruction forcibly
treated as an identifier.
Use a 32-bit limb size ("like a digit, but bigger") for floating-point
conversion. This cuts the number of multiplications per constant by a
factor of four.
This means supporting fractional-limb-sized numbers, so while we're at
it, add support for 8-bit floating point numbers (apparently used in
graphics and in audio compression applications.)
Correct the handling of floating-point tokens in the preprocessor.
The preprocessor scanner and the main scanner really are painfully
divergent for no good reason.
For consistency, support binary and octal floating-point, and accept
a "0d" or "0t" prefix for decimal floating-point. However, we do not
accept a binary exponent (p) for a decimal mantissa, or vice versa.
Allow any radix letter from the set [bydtoqhx] to be used either
"Intel-style" (0...x) or "C-style" (0x...). In Intel style, the
leading 0 remains optional as long as the first digit is in the range
0-9.
As a consequence, allow the prefix "0h" for hexadecimal floating
point.
Since we allow the prefix $ instead of 0x for integer constants, do
the same for floating point. No suffix support at this time; we may
want to consider if that would be appropriate.
1e30 is a floating-point constant, but 1e30h is not. The scanner
won't know that until it sees the "h", so make sure we keep enough
state to be able to distinguish "1e30" (a possible hex constant) from
"1.e30", "1e+30" or "1.0" (unabiguously floating-point.)
Remove bogus "treat labels different from immediates" code, which
would result in generating of a relative mod/rm but without adjusting
the address accordingly.
Update addressing mode test.
Add special operators to allow the use of floating-point constants in
contexts other than DW/DD/DQ/DT/DO.
As part of this checkin, make MAX_KEYWORD generated by tokhash.pl,
since it knows what all the keywords are so it can tell which one is
the longest.
INVLPGA is defined as taking rax,ecx but "the portion of rax used to
form the address is determined by the effective address size", so it
is really ax/eax/rax.
Unify all the standard IEEE formats into one function, add support for
IEEE standard 128-bit floating point numbers.
The 80-bit format is still special since it explicitly represents the
integer portion.
This checkin completes what is required to actually generate SSE5
instructions. No support in the disassembler yet.
This checkin covers:
- Support for actually generating DREX prefixes.
- Support for matching operand "operand X must match Y"
Simple scripts to generate performance benchmarks for label,
macro and token lookups. The label and macro lookups are simple
numerical sequences; it may be desirable to add some more
sophisticated algorithms for producing tokens in case we want to
compare different hash functions against each other.