Intel APX (Advanced Performance Extensions) introduces multiple new features, mostly to existing instructions. APX is only available in 64-bit mode.
There are 16 new general purpose registers, R16 to
R31.
Many instructions now support a non-destructive destination operand.
The ability to suppress the setting of the arithmetic flags.
The ability to zero the upper parts of a full 64-bit register for 8- and 16-bit operation size instructions. (This zeroing is always performed for 32-bit operations; this has been the case since 64-bit mode was first introduced.)
New instructions to conditionally set the arithmetic flags to a user-specified value.
Performance-enhanced versions of the PUSH and
POP instructions.
A 64-bit absolute jump instruction.
A new REX2 prefix.
See
https://www.nasm.us/specs/apx
for a link to the APX technical documentation. NASM generally follows the
syntax specified in the Assembly Syntax Recommendations for Intel
APX document, although some syntax is relaxed; see below.
When it comes to register size, the new registers
(R16–R31) work the same way as registers
R8–R15 (see also
section 12.1):
R31 is the 64-bit form of register 31,
R31D is the 32-bit form,
R31W is the 16-bit form, and
R31B is the 8-bit form. The form R31L can also
be used if the altreg macro package is used
(%use altreg), see section
6.1.
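As a brief illustration of the naming scheme above, the following sketch moves data into each width of register 31. It assumes a NASM version with APX support (the text refers to version 3.00); the choice of source registers is arbitrary.

```nasm
        bits 64

        mov     r31,  rax       ; 64-bit form of register 31
        mov     r31d, eax       ; 32-bit form
        mov     r31w, ax        ; 16-bit form
        mov     r31b, al        ; 8-bit form
        mov     r16,  r15       ; new and existing registers can be mixed
```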
Extended registers require that either a REX2 prefix (the default, if possible) or an EVEX prefix is used.
Some instructions do not support the extended general purpose registers (EGPRs), R16 to R31. NASM will generate an error if an EGPR is used with such an instruction.
The new data destination register (when supported) is specified by
adding an additional register as the first operand, ahead of the existing
operands. For example, an ADD instruction:
add rax, rbx, rcx
... which would add RBX and RCX and store the
result in RAX, without modifying either RBX or
RCX.
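For comparison, here is a sketch of the three-operand form next to the two-instruction legacy sequence it can replace. The pairing is for illustration only; it holds here because the destination is distinct from both sources.

```nasm
        bits 64

        ; APX form: rax = rbx + rcx; rbx and rcx are left unchanged
        add     rax, rbx, rcx

        ; Legacy sequence leaving the same value in rax and the same flags
        mov     rax, rbx        ; copy the first source (MOV does not alter flags)
        add     rax, rcx        ; destructive add into the copy
```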
The {nf} prefix on a supported instruction inhibits the
update of the flags, for example:
{nf} add rax, rbx
... will add RAX and RBX together, storing the
result in RAX, while leaving the flags register unchanged.
NASM also allows the {nf} prefix (or any other curly-brace
prefix) to be specified after the instruction mnemonic. Spaces
around curly-brace prefixes are optional:
{nf} add rax, rbx ; Standard syntax
{nf}add rax, rbx ; Prefix without space
add {nf} rax, rbx ; Suffix syntax
add{nf} rax, rbx ; Suffix without space
The {zu} prefix, meaning "zero-upper", disables retaining the upper
parts of the destination register and instead
zero-extends the value into the full 64-bit register when the operand size
is 8 or 16 bits (this is always done when the operand size is 32 bits, even
without APX). For example:
{zu} setb al
... zeroes out bits [63:8] of the RAX register. For this
specific instruction, NASM also accepts these alternate syntaxes:
{zu} setb ax
setb {zu} al
setb {zu} ax
setb {zu} eax
setb {zu} rax
setb eax
setb rax
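The effect of {zu} on the destination register can be sketched by contrasting it with the plain form and with a legacy sequence that leaves the same value in RAX. The MOVZX-based sequence is an illustrative equivalent only, not something NASM is stated to generate.

```nasm
        bits 64

        setb    al              ; writes AL only; bits [63:8] of RAX are retained
        {zu} setb al            ; writes 0 or 1 to AL and zeroes bits [63:8] of RAX

        ; Legacy sequence leaving the same value in RAX as the {zu} form
        setb    al
        movzx   eax, al         ; a 32-bit write also zeroes bits [63:32]
```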
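The source condition code (Scc) instructions,
CCMPScc and CTESTScc,
first evaluate the source condition code against the current flags. If the
condition holds, the comparison or test is performed and sets the arithmetic
flags from its result; otherwise the flags are set to a user-specified
default flags value.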
NASM allows the resulting default flags value to be specified
either using the {dfv=...} syntax, containing a
comma-separated list of zero or more of the CPU flags OF,
SF, ZF or CF, or simply as a numeric
immediate (with OF, SF, ZF and
CF being represented by bits 3 to 0, in that order).
The PF flag is always set to the same value as the
CF flag, and the AF flag is always cleared. NASM
allows {dfv=pf} as an alias for {dfv=cf}, but do
note that it still affects both flags.
NASM allows, but does not require, a comma after the {dfv=}
value. When using the immediate syntax, a comma is required. These examples
all produce the same instruction:
ccmpl {dfv=of,cf} rdx, r30
ccmpl {dfv=of,cf}, rdx, r30
ccmpl 0x9, rdx, r30 ; Comma required
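The bit assignment described above (OF, SF, ZF and CF in bits 3 to 0) gives each flag a fixed weight, so further equivalent pairs can be worked out by hand. The following sketch pairs two more flag lists with the immediates derived from that mapping; the flag combinations are chosen purely for illustration.

```nasm
        bits 64

        ; OF = bit 3 (0x8), SF = bit 2 (0x4), ZF = bit 1 (0x2), CF = bit 0 (0x1)

        ccmpl   {dfv=sf,zf}, rdx, r30           ; 0x4 | 0x2 = 0x6
        ccmpl   0x6, rdx, r30                   ; same value as an immediate

        ccmpl   {dfv=of,sf,zf,cf}, rdx, r30     ; 0x8 | 0x4 | 0x2 | 0x1 = 0xf
        ccmpl   0xf, rdx, r30                   ; same value as an immediate
```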
The immediate syntax also allows the {dfv=} values to
be stored in a symbol, or to have arithmetic done on them. Note that when
used in an expression, or in contexts other than EQU or one of
the Scc instructions, parentheses are required. This
is a safety measure: the programmer needs to explicitly indicate that use as an
expression is what is intended:
ccmpl ({dfv=of}|{dfv=cf}), rdx, r30 ; Parens, comma required
ocf1 equ {dfv=of,cf} ; Parens not required
ccmpl ocf1, rdx, r30 ; Comma required
ofcf2 equ ({dfv=of,sf,cf} & ~{dfv=sf}) ; Parens required
ccmpl ofcf2, rdx, r30 ; Comma required
PUSH and POP Extensions
APX adds variations of the PUSH and POP
instructions that:
inform the CPU that a specific PUSH and POP
constitute a matched pair, allowing the hardware to optimize for this
common use case: PUSHP and POPP;
operate on two registers at the same time: PUSH2 and
POP2, with paired variants PUSH2P and
POP2P.
These extensions only apply to register forms; they are not supported for memory or immediate operands.
The standard syntax for PUSH2(P) and
POP2(P) specifies the registers in the order they
are to be pushed onto and popped off the stack:
push2p rax, rbx
; rax in [rsp+8]
; rbx in [rsp+0]
pop2p rbx, rax
... would be the equivalent of:
push rax
push rbx
; rax in [rsp+8]
; rbx in [rsp+0]
pop rbx
pop rax
NASM also allows the registers to be specified as a register
pair separated by a colon, in which case the registers are always given
in the order high:low, which is the same
for PUSH2 and POP2. This means the order of the
operands in the POP2 instruction differs from the standard syntax:
push2p rax:rbx
; rax in [rsp+8]
; rbx in [rsp+0]
pop2p rax:rbx
JMPABS
A new near jump instruction takes a 64-bit absolute address immediate.
NASM allows this instruction to be specified either as:
jmpabs target
... or:
jmp abs target
The generated code is identical. The ABS keyword is required
regardless of the DEFAULT setting.
When the optimizer is enabled (see section 2.1.24), NASM may apply a number of optimizations, some of which may substitute non-APX instructions for what would otherwise be APX forms. Some examples are:
The {nf} prefix may be ignored on instructions that do not
modify the arithmetic flags in the first place.
When the {nf} prefix is specified, NASM may generate
another instruction which would not modify the flags register. For example,
{nf} ror rax, rcx, 3 can be translated into
rorx rax, rcx, 3.
The {zu} prefix may be ignored on instructions that already
zero the upper parts of the destination register.
When the {zu} prefix is specified, NASM may generate
another instruction which would zero the upper part of the register. For
example, {zu} mov ax, cs can be translated into
mov eax, cs.
New data destination or nondestructive source operands may be contracted
if they are the same (and the semantics are otherwise identical). For
example, add eax, eax, edx could be encoded as
add eax, edx using legacy encoding. NASM does not perform
this optimization as of version 3.00, but it probably will in the
future.
APX encoding, using REX2 and EVEX, respectively, can be forced by using
the {rex2} or {evex} instruction prefixes.