Lecture Material from 201

Robert Dewar

Suppose we have the C function:

unsigned max3 (unsigned a, unsigned b, unsigned c) {
  int t;
  if (a > b)
    t = a;
  else t = b;
  if (t > c)
    return t;
  else return c;
}

A call to this function might look like

   r = max3 (j,k,l);

The corresponding assembler code for the call would be something
like:

   PUSH   L
   PUSH   K
   PUSH   J
   CALL   MAX3
   ADD    SP, 2*3
   MOV    R, AX

Note that the convention is that the result is returned in AX. The push
instructions leave the stack looking like:

     +--------------------+
     |    value of L      |
     +--------------------+
     |    value of K      |
     +--------------------+
     |    value of J      |  <-------------  SP (stack pointer)
     +--------------------+

Note a couple of things here

  The parameters are pushed in reverse order. Why? We will answer this
  question later.

  The ADD instruction strips the parameters off the stack which are no
  longer needed once the function returns to the caller. An alternative
  (and seemingly better) scheme would be to have the called function be
  in charge of removing the parameters from the stack. That is in fact
  done for other languages (notably Pascal). Why in C do we have the
  caller responsible for stripping the parameters? That's another
  question we will answer later. The value of 3*2 is because we have
  three words on the stack, each two bytes long.

  We could pop off the parameters instead of doing the ADD, but this would
  take three memory references, instead of a single add instruction, which
  would be much less efficient.

The value of SP starts at 0FFFEH, and the stack builds down in memory, so
in the above picture, and other pictures like it later on, high memory
is at the top of the page, and low memory is at the bottom of the page.
Why is the initial SP 0FFFEH instead of zero? It seems like we are wasting
one stack location. This is another question we will answer later.

The called function receives control with the stack looking like:

     +--------------------+
     |    value of C      |
     +--------------------+
     |    value of B      |
     +--------------------+
     |    value of A      |
     +--------------------+
     |    return point    |  <-------------  SP (stack pointer)
     +--------------------+

That's because the CALL instruction pushes the return point on to the stack.
In this case, the return point is the address of the ADD instruction that
immediately follows the call.

Note that within the function we label the stack locations with the formal
parameter names (we have no idea of the names of the actual parameters 
within the function). So for example, what the caller pushed as L, the
third actual parameter, is called C within the function.

The code for the C function itself is divided into three parts, the prolog,
the actual function of the code, and the epilog. The prolog sets things up
and arranges for access to the parameters. The epilog arranges to return
control to the caller, with the stack restored to look the way it did at
the point of the call (with just the three parameters pushed).

The prolog looks like

   MAX3:   PUSH    BP
           MOV     BP, SP
           SUB     SP, 2

The stack now looks like:


     +--------------------+
     |    value of C      |
     +--------------------+
     |    value of B      |
     +--------------------+
     |    value of A      |
     +--------------------+
     |    return point    |
     +--------------------+
     |   saved BP value   |  <-------------  BP (base or frame pointer)
     +--------------------+
     |    value of T      |  <-------------  SP (stack pointer)
     +--------------------+

The reason we save BP is that each function uses this, and we are not
supposed to destroy our caller's BP value, so we need to save and
restore it.

The MOV BP,SP sets the base pointer (usually called the "frame pointer")
which will be used to access both arguments and local variables.

The SUB instruction allocates space on the stack for local variables. In
this case, we have just one 2-byte word, so we need to allocate 2 bytes.

The parameters A,B, and C, and the local variable T can both be accessed
using BP (note that SP cannot be used as an index register on the 8086).

The following set of EQU directives show how these can be addressed

   A   EQU   WORD PTR [BP + 4]
   B   EQU   WORD PTR [BP + 6]
   C   EQU   WORD PTR [BP + 8]

Note that these EQU are simply directives to the assembler. The meaning is
that if we reference A, it will be replaced by the WORD PTR reference. We
don't have to use these EQU's, we could simply write "WORD PTR [BP + 4]"
every time we need to reference A, but it sure is easier to just write A.

To reference the local variable T, we also use BP:
   
   T   EQU   WORD PTR [BP - 2]

Note that the value of T is junk to start with (it is an uninitialized
variable). That means we had better assign a value to it before we
reference it (the code does indeed obey this rule!)

Now we have everything we need to write the assembler code corresponding
to the body of the function:

         MOV   AX, A      ; if (a > b)
         CMP   AX, B
         JNA   L1
   ;
         MOV   AX, A      ;   t = a;
         MOV   T, AX       
         JMP   L2
   ;
   L1:   MOV   AX, B      ; else t = b;
         MOV   T, AX
   ;
   L2:   MOV   AX, T      ; if (t > c)
         CMP   AX, C
         JNA   L3
   ;
         MOV   AX, T      ;   return t;
         JMP   EPLG
   ;
   L3:   MOV   AX, C      ; else return c;
         JMP   EPLG

The above code is not very clever. A clever compiler may do quite a bit
better. In particular there are some move instructions that can obviously
be eliminated (which ones?)

All we have to do now is to write the epilog, which is passed control with
the result in AX. Since that's where we expect the result to be, all that
have to do is to restore the values of SP and BP, and return to the caller.

Here is the code for the epilog:

   EPLG: MOV   SP, BP        
         POP   BP
         RET

The MOV SP,BP restores SP to where it was immediately before the SUB SP,2
instruction in the prolog. Note that we could also achieve this by doing
ADD SP,2, but we prefer the MOV SP,BP for two reasons. First it is a 
shorter instruction (2 bytes instead of 3), but more specifically, we
may in general have added some rubbish on the bottom of the stack, e.g.
by doing PUSH intructions. The MOV SP,BP ensures that all such rubbish
is removed.

The stack after the MOV looks like:

     +--------------------+
     |    value of C      |
     +--------------------+
     |    value of B      |
     +--------------------+
     |    value of A      |
     +--------------------+
     |    return point    |
     +--------------------+
     |   saved BP value   |  <-------------  BP and SP
     +--------------------+

The POP instruction restores the old BP and leaves the stack looking like:

     +--------------------+
     |    value of C      |
     +--------------------+
     |    value of B      |
     +--------------------+
     |    value of A      |
     +--------------------+
     |    return point    |  <-------------  SP
     +--------------------+

Now the RET instruction removes the return point and goes back to the ADD
instruction following the CALL instruction with the stack restored to look
like:

     +--------------------+
     |    value of C (L)  |
     +--------------------+
     |    value of B (K)  |
     +--------------------+
     |    value of A (J)  |  <-------------  SP (stack pointer)
     +--------------------+

which is just how it looked at the point of the CALL instruction. The result
is safely in AX and everything is fine!

Now some answers to deferred questions

1. Why push the parameters in reverse order. Answer, in original C, it was
permissible to pass extra parameters, which are simply supposed to be ignored.
Suppose we call Max3 with four parameters

   t = max3 (j, k, l, m);

Now pushing the parameters in reverse order, we have the stack looking like

     +--------------------+
     |    value of M      |
     +--------------------+
     |    value of L      |
     +--------------------+
     |    value of K      |
     +--------------------+
     |    value of J      |  <-------------  SP (stack pointer)
     +--------------------+

If you examine how the prolog works, you will see that the value of M
is simply ignored, which is what is supposed to happen. If we pushed
the parameters left to right, the extra parameter would end up between
the frame pointer and the real parameters, so that would not work.

This also explains why the caller must strip the parameters. Since extra
parameters can be passed, only the caller knows if this is the case, and
knows how many parameters must be stripped.

Finally, why is the stack pointer 0FFFEH on entry? The answer is that before
control is passed to your program, a zero value is pushed on the stack.
Now suppose that a program does a RET, as in:

      MOV   AX, WORD PTR 0200H
      MOV   WORD PTR 020AH, AX
      RET
      END

The effect of the RET is to pop off the "return point" on the stack, which
is zero, and then jump to location zero. At location zero the operating
system always puts an INT 20H instruction, so the result is to return to
the operating system.