up:
Chapter 2 -- Basic Programming Model
prev: 2.4 Instruction Format
next: 2.6 Interrupts and Exceptions
2.5 Operand Selection
An instruction can act on zero or more operands, which are the data
manipulated by the instruction. An example of a zero-operand instruction is
NOP (no operation). An operand can be in any of these locations:
- In the instruction itself (an immediate operand)
- In a register (EAX, EBX, ECX, EDX, ESI, EDI, ESP, or EBP in the case
of 32-bit operands; AX, BX, CX, DX, SI, DI, SP, or BP in the case of
16-bit operands; AH, AL, BH, BL, CH, CL, DH, or DL in the case of 8-bit
operands; the segment registers; or the EFLAGS register for flag
operations)
- In memory
- At an I/O port
Immediate operands and operands in registers can be accessed more rapidly
than operands in memory since memory operands must be fetched from memory.
Register operands are available in the CPU. Immediate operands are also
available in the CPU, because they are prefetched as part of the
instruction.
Of the instructions that have operands, some specify operands implicitly;
others specify operands explicitly; still others use a combination of
implicit and explicit specification; for example:
-
Implicit operand: AAM
-
By definition, AAM (ASCII adjust for multiplication) operates on the
contents of the AX register.
-
Explicit operand: XCHG EAX, EBX
-
The operands to be exchanged are encoded in the instruction after the
opcode.
-
Implicit and explicit operands: PUSH COUNTER
-
The memory variable COUNTER (the explicit operand) is copied to the top of
the stack (the implicit operand).
Note that most instructions have implicit operands. All arithmetic
instructions, for example, update the EFLAGS register.
An 80386 instruction can explicitly reference one or two operands.
Two-operand instructions, such as MOV, ADD, XOR, etc., generally overwrite
one of the two participating operands with the result. A distinction can
thus be made between the source operand (the one unaffected by the
operation) and the destination operand (the one overwritten by the result).
For most instructions, one of the two explicitly specified --
the source or the -- be either in a register or in memory.
The other operand must be in a register or be an immediate source operand.
Thus, the explicit two-operand instructions of the 80386 permit operations
of the following kinds:
- Register-to-register
- Register-to-memory
- Memory-to-register
- Immediate-to-register
- Immediate-to-memory
Certain string instructions and stack manipulation instructions, however,
transfer data from memory to memory. Both operands of some string
instructions are in memory and are implicitly specified. Push and pop stack
operations allow transfer between memory operands and the memory-based
stack.
2.5.1 Immediate Operands
Certain instructions use data from the instruction itself as one (and
sometimes two) of the operands. Such an operand is called an immediate
operand. The operand may be 32-, 16-, or 8-bits long. For example:
SHR PATTERN, 2
One byte of the instruction holds the value 2, the number of bits by which
to shift the variable PATTERN.
TEST PATTERN, 0FFFF00FFH
A doubleword of the instruction holds the mask that is used to test the
variable PATTERN.
2.5.2 Register Operands
Operands may be located in one of the 32-bit general registers (EAX, EBX,
ECX, EDX, ESI, EDI, ESP, or EBP), in one of the 16-bit general registers
(AX, BX, CX, DX, SI, DI, SP, or BP), or in one of the 8-bit general
registers (AH, BH, CH, DH, AL, BL, CL,or DL).
The 80386 has instructions for referencing the segment registers (CS, DS,
ES, SS, FS, GS). These instructions are used by applications programs only
if systems designers have chosen a segmented memory model.
The 80386 also has instructions for referring to the flag register. The
flags may be stored on the stack and restored from the stack. Certain
instructions change the commonly modified flags directly in the EFLAGS
register. Other flags that are seldom modified can be modified indirectly
via the flags image in the stack.
2.5.3 Memory Operands
Data-manipulation instructions that address operands in memory must specify
(either directly or indirectly) the segment that contains the operand and
the offset of the operand within the segment. However, for speed and compact
instruction encoding, segment selectors are stored in the high speed segment
registers. Therefore, data-manipulation instructions need to specify only
the desired segment register and an offset in order to address a memory
operand.
An 80386 data-manipulation instruction that accesses memory uses one of the
following methods for specifying the offset of a memory operand within its
segment:
- Most data-manipulation instructions that access memory contain a byte
that explicitly specifies the addressing method for the operand. A
byte, known as the modR/M byte, follows the opcode and specifies
whether the operand is in a register or in memory. If the operand is
in memory, the address is computed from a segment register and any of
the following values: a base register, an index register, a scaling
factor, a displacement. When an index register is used, the modR/M
byte is also followed by another byte that identifies the index
register and scaling factor. This addressing method is the
most flexible.
- A few data-manipulation instructions implicitly use specialized
addressing methods:
- For a few short forms of
MOV that implicitly use the EAX register,
the offset of the operand is coded as a doubleword in the
instruction. No base register, index register, or scaling factor
are used.
- String operations implicitly address memory via DS:ESI,
(MOVS,
CMPS,
OUTS,
LODS,
SCAS) or via ES:EDI
(MOVS,
CMPS,
INS,
STOS).
- Stack operations implicitly address operands via SS:ESP
registers; e.g.,
PUSH,
POP,
PUSHA,
PUSHAD,
POPA,
POPAD,
PUSHF,
PUSHFD,
POPF,
POPFD,
CALL,
RET,
IRET,
IRETD,
exceptions, and interrupts.
2.5.3.1 Segment Selection
Data-manipulation instructions need not explicitly specify which segment
register is used. For all of these instructions, specification of a segment
register is optional. For all memory accesses, if a segment is not
explicitly specified by the instruction, the processor automatically chooses
a segment register according to the rules of Table 2-1. (If systems
designers have chosen a flat model of memory organization, the segment
registers and the rules that the processor uses in choosing them are not
apparent to applications programs.)
There is a close connection between the kind of memory reference and the
segment in which that operand resides. As a rule, a memory reference implies
the current data segment (i.e., the implicit segment selector is in DS).
However, ESP and EBP are used to access items on the stack; therefore, when
the ESP or EBP register is used as a base register, the current stack
segment is implied (i.e., SS contains the selector).
Special instruction prefix elements may be used to override the default
segment selection. Segment-override prefixes allow an explicit segment
selection. The 80386 has a segment-override prefix for each of the segment
registers. Only in the following special cases is there an implied segment
selection that a segment prefix cannot override:
- The use of ES for destination strings in string instructions.
- The use of SS in stack instructions.
- The use of CS for instruction fetches.
Table 2-1. Default Segment Register Selection Rules
Memory Reference Needed Segment Implicit Segment Selection Rule
Register
Used
Instructions Code (CS) Automatic with instruction prefetch
Stack Stack (SS) All stack pushes and pops. Any
memory reference that uses ESP or
EBP as a base register.
Local Data Data (DS) All data references except when
relative to stack or string
destination.
Destination Strings Extra (ES) Destination of string instructions.
2.5.3.2 Effective-Address Computation
The modR/M byte provides the most flexible of the addressing methods, and
instructions that require a modR/M byte as the second byte of the
instruction are the most common in the 80386 instruction set. For memory
operands defined by modR/M, the offset within the desired segment is
calculated by taking the sum of up to three components:
- A displacement element in the instruction.
- A base register.
- An index register. The index register may be automatically multiplied
by a scaling factor of 2, 4, or 8.
The offset that results from adding these components is called an effective
address. Each of these components of an effective address may have either a
positive or negative value. If the sum of all the components exceeds 2^(32),
the effective address is truncated to 32 bits.
Figure 2-10
illustrates the
full set of possibilities for modR/M addressing.
The displacement component, because it is encoded in the instruction, is
useful for fixed aspects of addressing; for example:
- Location of simple scalar operands.
- Beginning of a statically allocated array.
- Offset of an item within a record.
The base and index components have similar functions. Both utilize the same
set of general registers. Both can be used for aspects of addressing that
are determined dynamically; for example:
- Location of procedure parameters and local variables in stack.
- The beginning of one record among several occurrences of the same
record type or in an array of records.
- The beginning of one dimension of multiple dimension array.
- The beginning of a dynamically allocated array.
The uses of general registers as base or index components differ in the
following respects:
- ESP cannot be used as an index register.
- When ESP or EBP is used as the base register, the default segment is
the one selected by SS. In all other cases the default segment is DS.
The scaling factor permits efficient indexing into an array in the common
cases when array elements are 2, 4, or 8 bytes wide. The shifting of the
index register is done by the processor at the time the address is evaluated
with no performance loss. This eliminates the need for a separate shift or
multiply instruction.
The base, index, and displacement components may be used in any
combination; any of these components may be null. A scale factor can be used
only when an index is also used. Each possible combination is useful for
data structures commonly used by programmers in high-level languages and
assembly languages. Following are possible uses for some of the various
combinations of address components.
-
DISPLACEMENT
-
The displacement alone indicates the offset of the operand. This
combination is used to directly address a statically allocated scalar
operand. An 8-bit, 16-bit, or 32-bit displacement can be used.
-
BASE
-
The offset of the operand is specified indirectly in one of the general
registers, as for "based" variables.
-
BASE + DISPLACEMENT
-
A register and a displacement can be used together for two distinct
purposes:
- Index into static array when element size is not 2, 4, or 8 bytes.
The displacement component encodes the offset of the beginning of
the array. The register holds the results of a calculation to
determine the offset of a specific element within the array.
- Access item of a record. The displacement component locates an
within record. The base register selects one of several
occurrences of record, thereby providing a compact encoding for
this common function.
An important special case of this combination, is to access parameters
in the procedure activation record in the stack. In this case, EBP is
the best choice for the base register, because when EBP is used as a
base register, the processor automatically uses the stack segment
register (SS) to locate the operand, thereby providing a compact
encoding for this common function.
-
(INDEX * SCALE) + DISPLACEMENT
-
This combination provides efficient indexing into a static array when
the element size is 2, 4, or 8 bytes. The displacement addresses the
beginning of the array, the index register holds the subscript of the
desired array element, and the processor automatically converts the
subscript into an index by applying the scaling factor.
-
BASE + INDEX + DISPLACEMENT
-
Two registers used together support either a two-dimensional array (the
displacement determining the beginning of the array) or one of several
instances of an array of records (the displacement indicating an item
in the record).
-
BASE + (INDEX * SCALE) + DISPLACEMENT
-
This combination provides efficient indexing of a two-dimensional array
when the elements of the array are 2, 4, or 8 bytes wide.
up:
Chapter 2 -- Basic Programming Model
prev: 2.4 Instruction Format
next: 2.6 Interrupts and Exceptions