Computer Systems Org I - Prof. Grishman

Lecture 22 - Nov. 17, 2005

x86 Architecture

RISC and CISC

The LC-3 is an extreme example of a RISC ... a reduced instruction set computer.  It has a very small instruction set (15 instructions);  commercial RISC machines have quite a few more (typically 50 to 70 or more).  But they share many of the LC-3's characteristics:  all instructions are the same size;  the instructions conform to one of a few simple formats;  loads and stores are separate from arithmetic and logical operations;  and the operation performed by an individual instruction is relatively simple.

In contrast, the x86 is a CISC ... a complex instruction set computer.  It has a relatively large set of instructions (for the Pentium, several hundred);  the instructions vary in size, from one byte up, and come in many different formats;  arithmetic operations can involve registers or memory;  and some instructions perform complex operations, such as scanning a string for a value.  The complexity of the x86 series is compounded by the need for upward compatibility from the original 8086 to the current Pentiums:  instructions and features can be added, but cannot be removed or 'cleaned up'.

The Netwide Assembler manual includes a description of the full x86 instruction set.

For a more tutorial presentation on the x86, we recommend Paul Carter's PC Assembly Tutorial.

To keep things simple, we will look primarily at the instructions from the original x86 machine, the 8086.

Basic x86 register structure and instructions

8086 Registers (16 bits each;  AX through DX can also be accessed as a high byte and a low byte):  
    AX (= AH:AL)
    BX (= BH:BL)
    CX (= CH:CL)
    DX (= DH:DL)
    SI  (source index)
    DI  (destination index)
    SP  (stack pointer)
    BP  (base pointer)

On the 386 and later, these are the low 16 bits of 32-bit registers:  EAX, EBX, etc.
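
The relationship between a 16-bit register, its two byte halves, and the enclosing 32-bit register can be sketched in a few lines of Python. This is only an illustrative model, not part of the course tools; the function names split16, join16 and low16 are made up for this example.

```python
# Illustrative model only: a 16-bit register such as AX viewed as
# two 8-bit halves (AH = high byte, AL = low byte).
# All function names here are hypothetical.

def split16(ax):
    """Return (AH, AL) for a 16-bit register value."""
    ax &= 0xFFFF
    return (ax >> 8) & 0xFF, ax & 0xFF

def join16(ah, al):
    """Rebuild the 16-bit value from its high and low bytes."""
    return ((ah & 0xFF) << 8) | (al & 0xFF)

def low16(eax):
    """Return the AX part (low 16 bits) of a 32-bit EAX value."""
    return eax & 0xFFFF

ah, al = split16(0x1234)
print(hex(ah), hex(al))           # 0x12 0x34
print(hex(join16(0x12, 0x34)))    # 0x1234
print(hex(low16(0xDEADBEEF)))     # 0xbeef
```

Note that the halves are concatenated, not added: changing AL never carries into AH.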

Basic data manipulation instructions
    MOV
    ADD
    SUB
    CMP  (like SUB, but only sets the condition codes;  the result is discarded)

All these instructions take two operands, destination first:  
    MOV x, y copies contents of y into x;  ADD x, y performs x = x + y

Operand formats (for each of the data manipulation instructions)
    register, register
    register, immediate (constant)
    register, memory
    memory, register
    memory, immediate
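
The five operand formats above can be modeled with a small Python sketch. It is a simplified illustration, not the behavior of any real assembler or of DDB: operands are written as hypothetical (kind, value) tuples, memory is modeled as a dictionary of 16-bit words, and the names regs, mem, mov and add are invented for this example. The one combination the list leaves out, memory to memory, is rejected.

```python
# Simplified model of MOV and ADD with the five operand formats.
# Operands are hypothetical tuples: ("reg", "AX"), ("imm", 5), ("mem", 0x100).
regs = {"AX": 0, "BX": 0, "CX": 0, "DX": 0}
mem = {}   # address -> 16-bit word (a simplification of byte-addressed memory)

ALLOWED = {("reg", "reg"), ("reg", "imm"), ("reg", "mem"),
           ("mem", "reg"), ("mem", "imm")}

def read(op):
    kind, val = op
    if kind == "reg":
        return regs[val]
    if kind == "mem":
        return mem.get(val, 0)
    return val & 0xFFFF          # immediate constant

def write(op, value):
    kind, where = op
    if kind == "reg":
        regs[where] = value & 0xFFFF
    else:                        # ALLOWED guarantees this is "mem"
        mem[where] = value & 0xFFFF

def check(dst, src):
    if (dst[0], src[0]) not in ALLOWED:
        raise ValueError(f"unsupported operand pair: {dst[0]}, {src[0]}")

def mov(dst, src):
    check(dst, src)
    write(dst, read(src))        # MOV x, y : x = y

def add(dst, src):
    check(dst, src)
    write(dst, read(dst) + read(src))   # ADD x, y : x = x + y (16-bit wrap)

mov(("reg", "AX"), ("imm", 5))       # register, immediate
mov(("mem", 0x100), ("reg", "AX"))   # memory, register
add(("reg", "AX"), ("mem", 0x100))   # register, memory
print(regs["AX"])                    # 10
```

In this model, mov(("mem", 0x100), ("mem", 0x102)) raises ValueError, mirroring the fact that a value must pass through a register to move between two memory locations.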

Condition codes:  set by ADD, SUB, CMP (MOV leaves them unchanged)
    SF:  sign flag, set if sign bit of result == 1
    ZF:  zero flag, set if result == 0
    CF:  carry flag, set if carry out (or, for SUB and CMP, borrow) from last arithmetic operation
    OF:  overflow flag, set if overflow from last arithmetic operation (2's complement arithmetic)
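
To make the four flags concrete, here is a minimal Python sketch of how they could be computed for 16-bit ADD and SUB. It is a model for study purposes, not a description of the hardware; the names add16 and sub16 are invented for this example.

```python
# Minimal model of the condition codes for 16-bit ADD and SUB.
# add16 and sub16 are hypothetical helper names.
MASK = 0xFFFF   # keep results to 16 bits
SIGN = 0x8000   # sign bit of a 16-bit value

def add16(x, y):
    x &= MASK
    y &= MASK
    full = x + y
    result = full & MASK
    flags = {
        "SF": int(bool(result & SIGN)),
        "ZF": int(result == 0),
        "CF": int(full > MASK),                       # carry out of bit 15
        # overflow: operands have the same sign, result has the other sign
        "OF": int(bool(~(x ^ y) & (x ^ result) & SIGN)),
    }
    return result, flags

def sub16(x, y):
    x &= MASK
    y &= MASK
    result = (x - y) & MASK
    flags = {
        "SF": int(bool(result & SIGN)),
        "ZF": int(result == 0),
        "CF": int(x < y),                             # borrow (unsigned x < y)
        # overflow: operands differ in sign, result sign differs from x
        "OF": int(bool((x ^ y) & (x ^ result) & SIGN)),
    }
    return result, flags

print(add16(0x7FFF, 1))   # (32768, {'SF': 1, 'ZF': 0, 'CF': 0, 'OF': 1})
print(add16(0xFFFF, 1))   # (0, {'SF': 0, 'ZF': 1, 'CF': 1, 'OF': 0})
print(sub16(3, 5))        # (65534, {'SF': 1, 'ZF': 0, 'CF': 1, 'OF': 0})
```

The first two lines show why CF and OF are separate flags: 0x7FFF + 1 overflows as a signed sum but produces no carry, while 0xFFFF + 1 produces a carry but is a perfectly good signed sum (-1 + 1 = 0).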

Jumps
    JMP  unconditional
    JS, JZ, JC, JO  conditional, jump if flag is set
    JNS, JNZ, JNC, JNO  jump if flag is not set
    additional combinations, mnemonics provided
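
As a rough sketch of how a conditional jump decides whether to branch, the Python model below computes the flags for a CMP and then tests them. The helper names cmp16 and taken are invented for this example. JL, JGE, JB and JAE are standard x86 mnemonics for some of the "additional combinations"; they are not spelled out in these notes, so treat their inclusion here as a supplement.

```python
# Model of conditional jumps as tests on the flags left by CMP.
# cmp16 and taken are hypothetical helper names.
MASK, SIGN = 0xFFFF, 0x8000

def cmp16(x, y):
    """Flags produced by CMP x, y (a subtract whose result is discarded)."""
    x &= MASK
    y &= MASK
    r = (x - y) & MASK
    return {
        "SF": int(bool(r & SIGN)),
        "ZF": int(r == 0),
        "CF": int(x < y),
        "OF": int(bool((x ^ y) & (x ^ r) & SIGN)),
    }

CONDITIONS = {
    "JS":  lambda f: f["SF"] == 1,  "JNS": lambda f: f["SF"] == 0,
    "JZ":  lambda f: f["ZF"] == 1,  "JNZ": lambda f: f["ZF"] == 0,
    "JC":  lambda f: f["CF"] == 1,  "JNC": lambda f: f["CF"] == 0,
    "JO":  lambda f: f["OF"] == 1,  "JNO": lambda f: f["OF"] == 0,
    # combinations: signed comparisons use SF and OF together
    "JL":  lambda f: f["SF"] != f["OF"],   # jump if less (signed)
    "JGE": lambda f: f["SF"] == f["OF"],   # jump if greater or equal (signed)
    # unsigned comparisons use CF
    "JB":  lambda f: f["CF"] == 1,         # jump if below (unsigned)
    "JAE": lambda f: f["CF"] == 0,         # jump if above or equal (unsigned)
}

def taken(mnemonic, flags):
    """Return True if the jump would be taken with these flags."""
    return CONDITIONS[mnemonic](flags)

flags = cmp16(0xFFFF, 1)        # 0xFFFF is -1 signed, 65535 unsigned
print(taken("JL", flags))       # True:  -1 < 1 as signed values
print(taken("JB", flags))       # False: 65535 is not below 1 as unsigned
print(taken("JZ", cmp16(7, 7))) # True:  equal operands set ZF
```

The same bit pattern compares differently depending on whether the programmer picks the signed or the unsigned jump, which is why both families of mnemonics exist.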

Halt
    HLT
    INT 20H (return to operating system -- like an LC-3 trap)

Programming environment
    We will use an assembler and visual debugger, as we did for the LC-3.
    This debugger (DDB) was developed by Prof. Dewar for the original PC, and uses the 8086 instruction set.
    The assembler (DAS) takes one argument, the file to be assembled (extension .ASM assumed),
    and is normally run within a command line (DOS) shell.
    Both can be downloaded by following a link on the course home page.

Instruction format (Dewar assembler)
    similar to LC-3, except instruction labels must be followed by a colon (:)
    in place of .FILL, assembler accepts DB (define byte) and DW (define word)
    labels on assembler directives like DB and DW do not take a colon