Having generated intermediate code, you
are now faced with actually generating assembly.
Unless it is impossible for you, you should be generating assembly code
for an x86 machine running Linux, with GCC installed. If you do not
have easy access to such a machine, please contact Ben or Emily.
On the course web page, there are several assembly references. The
first one to look at is
this one, although the only sections that are relevant are those for
GCC and GAS (the GNU assembler).
Here's the basic idea:
gcc foo.s bar.c bif.o
main(), which will be
the first procedure called.
gcc -S simple.c
You'll have one of these at the top of your program, which will have the space for all the global variables. It will also have a global string table, containing all the string constants used in the program, so that you can refer to them. After this, you will have the declaration for each function defined in your program. Functions will consist of the following:
You will need to write a Java function for each kind of
intermediate code construct (label, jump, binary op, etc.)
that generates an equivalent assembly language statement. These
functions will have to use the space/type info to
calculate offsets into the activation record, among other things. For a label
statement, for example, you generate a label. For an if
statement, you generate a compare statement and then the
appropriate branch statement. Notice that param statements allocate space on the
stack before the function call. You will need to keep track of that space to be
able to get at the arguments to a function.
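As a rough illustration of the one-function-per-construct idea, here is a minimal sketch in Java that emits AT&T-syntax x86 for a label, a jump, a binary op, and an if statement. Everything named here is hypothetical: the class TacToAsmSketch, its method names, the map from variable names to %ebp offsets (standing in for your space/type info), and the choice to route every operation through %eax. Your own intermediate code and symbol tables will look different.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: the class, its methods, and the offsets are illustrative only.
class TacToAsmSketch {
    // Variable name -> byte offset from %ebp (assumed to come from the space/type info).
    private final Map<String, Integer> offsets;
    private final List<String> out = new ArrayList<>();
    private int nextLabel = 0;

    TacToAsmSketch(Map<String, Integer> offsets) {
        this.offsets = offsets;
    }

    // Operand for a variable's slot in the activation record, e.g. -8(%ebp).
    String slot(String var) {
        Integer off = offsets.get(var);
        if (off == null) {
            throw new IllegalArgumentException("unknown variable: " + var);
        }
        return off + "(%ebp)";
    }

    // Labels we generate ourselves; GAS treats a ".L" prefix as local on ELF targets.
    String freshLabel() {
        return ".L" + (nextLabel++);
    }

    void emitLabel(String label) {
        out.add(label + ":");
    }

    void emitJump(String target) {
        out.add("\tjmp\t" + target);
    }

    // dst := lhs op rhs, for op in + - *
    void emitBinaryOp(String dst, String lhs, String op, String rhs) {
        String insn;
        switch (op) {
            case "+": insn = "addl"; break;
            case "-": insn = "subl"; break;
            case "*": insn = "imull"; break;
            default: throw new IllegalArgumentException("unsupported op: " + op);
        }
        out.add("\tmovl\t" + slot(lhs) + ", %eax");
        out.add("\t" + insn + "\t" + slot(rhs) + ", %eax");
        out.add("\tmovl\t%eax, " + slot(dst));
    }

    // if v1 relop v2 goto target (signed comparison)
    void emitIf(String v1, String relop, String v2, String target) {
        String jcc;
        switch (relop) {
            case "<":  jcc = "jl";  break;
            case "<=": jcc = "jle"; break;
            case ">":  jcc = "jg";  break;
            case ">=": jcc = "jge"; break;
            case "==": jcc = "je";  break;
            case "!=": jcc = "jne"; break;
            default: throw new IllegalArgumentException("unsupported relop: " + relop);
        }
        out.add("\tmovl\t" + slot(v1) + ", %eax");
        // AT&T operand order: flags reflect %eax - v2, so jl is taken when v1 < v2.
        out.add("\tcmpl\t" + slot(v2) + ", %eax");
        out.add("\t" + jcc + "\t" + target);
    }

    List<String> lines() {
        return out;
    }

    public static void main(String[] args) {
        Map<String, Integer> offsets = new HashMap<>();
        offsets.put("a", -4);
        offsets.put("b", -8);
        offsets.put("c", -12);
        TacToAsmSketch g = new TacToAsmSketch(offsets);
        String skip = g.freshLabel();
        g.emitIf("a", "<", "b", skip);      // if a < b goto skip
        g.emitBinaryOp("c", "a", "+", "b"); // c := a + b
        g.emitLabel(skip);
        for (String line : g.lines()) {
            System.out.println(line);
        }
    }
}
```

Loading every operand into %eax is wasteful but keeps each function independent of the others, which makes bugs easier to isolate while you are getting the basics working.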
You will also need to generate calls to C code, so that whenever the
compiler runs into $printf("hello"):Int, it calls printf on
"hello", using standard C calling conventions, and gets the value
returned as an int. C calls don't need to handle arguments or return
values that are not 32-bit values. Since pointless didn't provide a
built-in print, or any other output, this is not optional!
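One possible shape for the C-call generator is sketched below in Java. It assumes the usual 32-bit x86 C convention on Linux: arguments pushed right to left, the caller popping them after the call, and the result in %eax. The class CCallEmitterSketch, its methods, and the ".LC" label prefix for string table entries are all hypothetical names chosen for this example, and the string helper assumes the constant needs no escaping.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: names and label scheme are illustrative only.
class CCallEmitterSketch {
    private final List<String> data = new ArrayList<>(); // global string table entries
    private final List<String> text = new ArrayList<>(); // instructions
    private int nextString = 0;

    // Adds a string constant to the string table and returns its label.
    // Assumes s contains no quotes or backslashes that would need escaping.
    String internString(String s) {
        String label = ".LC" + (nextString++);
        data.add(label + ":");
        data.add("\t.string\t\"" + s + "\""); // .string emits a NUL-terminated string
        return label;
    }

    // operands are already-formatted 32-bit AT&T operands, in source order,
    // e.g. "$.LC0" (address of a string), "-8(%ebp)" (a local), "$42" (a constant).
    // resultOperand may be null if the return value is ignored.
    void emitCall(String function, List<String> operands, String resultOperand) {
        // Push in reverse syntactic order so the first argument ends up on top.
        for (int i = operands.size() - 1; i >= 0; i--) {
            text.add("\tpushl\t" + operands.get(i));
        }
        text.add("\tcall\t" + function);
        // The caller removes the arguments: 4 bytes per 32-bit argument.
        if (!operands.isEmpty()) {
            text.add("\taddl\t$" + (4 * operands.size()) + ", %esp");
        }
        // The int result comes back in %eax.
        if (resultOperand != null) {
            text.add("\tmovl\t%eax, " + resultOperand);
        }
    }

    List<String> dataLines() {
        return data;
    }

    List<String> textLines() {
        return text;
    }

    public static void main(String[] args) {
        CCallEmitterSketch e = new CCallEmitterSketch();
        String hello = e.internString("hello");
        // $printf("hello"):Int, with the result stored in a local at -4(%ebp)
        e.emitCall("printf", Arrays.asList("$" + hello), "-4(%ebp)");
        for (String line : e.dataLines()) System.out.println(line);
        for (String line : e.textLines()) System.out.println(line);
    }
}
```

Since %eax, %ecx, and %edx may be overwritten by the callee, any live values held in those registers would need to be saved before the pushes and restored after the call.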
This phase also has a number of details to get right, which will likely be the cause of many bugs. Expect to spend significant time debugging your code, and allow for that time. A short list of things to watch for is:
main() is just another function to be called, from the OS's point of view. Messing up here will cause the program to crash before entering, or after exiting, which is frustrating.
if v1 relop v2 goto y. One problem with using them is that there are usually only a finite number of them, so you may run out. We suggest generating all of your labels yourself.
printf without crashing. 32-bit arguments are pushed onto the stack in reverse syntactic order, above the 32-bit near return address. %ebp, %esi, %edi, and %ebx are callee-saved (i.e. saved upon entering a function and restored just before exiting); the other registers are caller-saved (i.e. saved right before a call to a function and restored after the call finishes). %eax holds the result, or %edx:%eax for 64-bit results. See /usr/doc/HOWTO/Assembly-HOWTO.txt.gz on a Linux system or click here.
gcc -S, as described above.
Note that this is not a complete list of "gotchas". Many others will exist.
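For the function entry and exit and callee-saved register points in the list above, one way to keep things consistent is to emit the same prologue and epilogue for every function, main() included. The Java sketch below is only an illustration under stated assumptions: FrameEmitterSketch and its methods are hypothetical names, it always saves %ebx, %esi, and %edi whether or not the body uses them, and it assumes the function body leaves the stack balanced (every push matched by a pop or an adjustment of %esp) before the epilogue runs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a fixed prologue/epilogue for every generated function.
class FrameEmitterSketch {
    // localBytes is the space for locals and temporaries, assumed to be a multiple of 4.
    static List<String> prologue(String name, int localBytes, boolean global) {
        List<String> out = new ArrayList<>();
        if (global) {
            out.add("\t.globl\t" + name); // make the symbol visible to the linker, e.g. main
        }
        out.add(name + ":");
        out.add("\tpushl\t%ebp");        // save the caller's frame pointer
        out.add("\tmovl\t%esp, %ebp");   // set up our own frame pointer
        if (localBytes > 0) {
            out.add("\tsubl\t$" + localBytes + ", %esp"); // locals live at -4(%ebp) and below
        }
        // Callee-saved registers, stored below the locals.
        out.add("\tpushl\t%ebx");
        out.add("\tpushl\t%esi");
        out.add("\tpushl\t%edi");
        return out;
    }

    // Assumes the return value is already in %eax and the stack is balanced.
    static List<String> epilogue() {
        List<String> out = new ArrayList<>();
        // Restore in the reverse order of the pushes in the prologue.
        out.add("\tpopl\t%edi");
        out.add("\tpopl\t%esi");
        out.add("\tpopl\t%ebx");
        out.add("\tmovl\t%ebp, %esp");   // discard locals
        out.add("\tpopl\t%ebp");         // restore the caller's frame pointer
        out.add("\tret");
        return out;
    }

    public static void main(String[] args) {
        for (String line : prologue("main", 8, true)) System.out.println(line);
        System.out.println("\tmovl\t$0, %eax"); // placeholder body: return 0
        for (String line : epilogue()) System.out.println(line);
    }
}
```

With this layout the first argument is at 8(%ebp), since the return address sits at 4(%ebp) and the saved %ebp at 0(%ebp). Comparing the emitted text against what gcc -S produces for a small C function is a cheap sanity check.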