Computer Systems Organization


Start Lecture #6

Remark: Must do 4.5 (skipped last time by mistake).

Homework: Write a C function (taking an int x and returning an int) that returns 1 if x is odd and returns 0 if x is even. Can you do it without an if statement?

4.9: Initialization

Default Initialization

Static and external variables are, by default, initialized to zero. Automatic, internal variables (the only kind left) are not initialized by default.

Initializing Scalar Variables

  int x=4;
  int y=x-1;

As in Java, you can write int X=5-2;. For external or static scalars, that is all you can do: the initializer must be a constant expression.

For automatic, internal scalars the initialization expression can involve previously defined values as shown above (even function calls are permitted).

Initializing Arrays

  int BB[8] = {4,9,2};
  int AA[] = {3,5,12,7};
  char str[] = "hello";
  char str[] = {'h','e','l','l','o','\0'};

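You can initialize an array by giving a list of initializers as shown above. If the list is shorter than the declared size (as with BB), the remaining elements are set to zero. If the size is omitted (as with AA), it is taken from the number of initializers. The two definitions of str are equivalent alternatives; a real program would contain only one of them.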

4.10: Recursion

The same as Java

4.11: The C Preprocessor

Normally, before the compiler proper sees your program, a utility called the C preprocessor is invoked to include files and perform macro substitutions.

4.11.1: File Inclusion

  #include <filename>
  #include "filename"

We have already discussed both forms of file inclusion. In both cases the file mentioned is textually inserted at the point of inclusion. The difference between the two is that the first form looks for filename in a system-defined standard place, whereas the second form typically looks first in the current directory (and falls back to the standard place if the file is not found there).

4.11.2: Macro Substitution

  #define MAXLINE 20
  #define MAX(A,B) ((A) > (B) ? (A) : (B))
  #undef getchar

We have already used examples of macro substitution similar to the first line above. The second line, which illustrates a macro with arguments, is more interesting.

Without all the parentheses on the RHS, the macro would be legal, but would (sometimes) give the wrong answers. Why?
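Ans: Consider 3+MAX(X+Y,Z-W). With no parentheses in the macro body, this expands to 3+X+Y>Z-W ? X+Y : Z-W. Since + binds more tightly than > and ?:, the 3 is added into the comparison rather than to the result.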

Note that macro substitution is not the same as a function call (with standard call-by-value or call-by-reference semantics). Consider MAX(x++,5). It may increment x twice, because the text x++ is substituted for every occurrence of A in the macro body. If you know call-by-name, of Algol 60 fame, you will see the resemblance.

We probably will not use the third form. It is used to un-define a macro from a library so that you can write another version.

There is some fancy stuff involving # in the RHS. See the book for details; I do not intend to use it.

4.11.3: Conditional Inclusion

  #if integer-expr
  #elif integer-expr
  ...
  #elif integer-expr
  #else
  #endif

The C preprocessor has a very limited set of control flow items. Above we see how the familiar

    if cond1 then / else if cond2 then / ... / else if condn then / else / end if

construct is written. The individual conditions are simple integer expressions consisting of integers, some basic operators and little else. Perhaps the most useful additions are the preprocessor function defined(name), which evaluates to 1 (true) if name has been #define'd, and the ! operator, which converts true to false and vice versa.


  #if !defined(HEADER22)
  #define HEADER22
  // The contents of header22.h
  // go here
  #endif

We can use defined(name) as shown above to ensure that a header file, in this case header22.h, is included only once.

You might ask how a header file could be included more than once unless a programmer by mistake wrote the same #include statement twice. One possibility is that a user file joe.c might include two system headers h1.h and h2.h, each of which includes h3.h.

Two other directives, #ifdef and #ifndef, test whether a name has been defined. Thus the first line of the previous example could have been written #ifndef HEADER22.

  #if SYSTEM == MACOS
    #define HDR "macos.h"
  #elif SYSTEM == WINDOWS
    #define HDR "windows.h"
  #elif SYSTEM == LINUX
    #define HDR "linux.h"
  #else
    #define HDR "empty.h"
    #define MSG System not known
  #endif
  #include HDR

Above we see a slightly longer example of the use of preprocessor directives. Assume that the name SYSTEM has been set to the name of the system on which the current program is to be run (not compiled). Then the code shown includes an appropriate header and, if SYSTEM is not one for which the program is designed to run, defines MSG. For the comparisons to work, MACOS, WINDOWS, and LINUX must themselves be #define'd as distinct integers, since an undefined name in an #if expression is treated as 0.


Chapter 5: Pointers and Arrays

  public class X {
    int A;
    public static void main(String args[]) {
      int i1=1;
      int i2=2;
      i2 = i1;
      i1 = 3;
      System.out.println("i2 is " + i2);
      X x1 = new X();
      X x2 = new X();
      x2 = x1;
      x1.A = 3;
      System.out.println("x2.A is " + x2.A);
    }
  }

Much of the material on pointers has no explicit analogue in Java; there it is kept under the covers. If in Java you have an Object obj;, then obj is actually what C would call a pointer. The technical term is that Java has reference semantics for all objects. In C this will all be quite explicit.

To give an example, look at the Java snippet above. The first part works with integers. We define 2 integers; initialize both; set the second to the first; change the first; and print the second. Naturally, the second has the initial value of the first, namely 1.

The second part deals with X's, a trivial class with just one data component, an integer. What we do is: define 2 X's; initialize both; set the second to the first; change the first; and print the second. The result is different! The second has the altered value of the first, namely 3.

This is on i5 for demo.

5.1: Pointers and Addresses

You will learn in 202 that the OS finagles memory in ways that would make Bernie Madoff smile. But, in large part thanks to those shenanigans, user programs can have a simple view of memory. For us C programmers, memory is just a large array of consecutively numbered addresses.

The machine model we will use in this course is that the fundamental unit of addressing is a byte and a character (a char) exactly fits in a byte. Other types like short, int, double, float, long take more than one byte but always a consecutive range of bytes.

l-values and r-values

One consequence of our memory model is that associated with int x=5 are two numbers. The first number is the address of the location in which x is stored. The second number is the value stored in that location; in this case that value is 5. The first number, the address, is often called the l-value; the second number, the contents, is often called the r-value. Why?

Consider x = x + 1; To evaluate the RHS we need to add 5 to 1. In particular, we need the value contained in the memory location assigned to x, i.e., we need 5. Since this is what is needed to evaluate a RHS it is called the r-value.

We compute 6=5+1. Where should we put it? We look at the LHS and see that we put it into x; that is, into the memory location assigned to x. Since it is the location that is needed when evaluating a LHS, that is called the l-value.

The Unary Operators & and *

As we have just seen, when a variable appears on the LHS, its l-value or address is used. What if we want the address of a variable that appears on the RHS; how do we get it?

In a language like Java the answer is simple; we don't.

In C we use the unary operator & and write p=&x; to assign the address of x to p. After executing this statement we say that p points to x or p is a pointer to x.

The unary operator * does the reverse action. When * is applied to a pointer, it gives the value of the object (object is used in the English, not the OO, sense) pointed to. The * operator is called the dereferencing or indirection operator.

Consider the code sequence below. The first line we have seen before; the second is new. Recall that in a C declaration, all the doodads around a variable name tell you what you must do to the variable to get the base type at the beginning of the line. Thus the second line says that if you dereference ip you get an integer. Common parlance is to say ip is an integer pointer (which is why we often use the name ip). Similarly, jp is another integer pointer.

  int x=1, y=2, z[10];
  int *ip, *jp;
  ip = &x;
  y = *ip;
  *ip = 0;
  ip = &z[0];
  *ip = 0;
  jp = ip;
  *jp = 1;

After these two declarations, both ip and jp are uninitialized. The next line sets ip to the l-value, i.e., the address, of x. Note that the types are correct: both ip and &x are pointers to an int.

The next line sets y equal to 1. Make sure you understand this. Slowly: ip at this time points to x, and * does the dereference, so *ip is x. Since we are on the RHS, we take the contents, not the address, of x and get 1.

The next line sets x=0;. The RHS is clearly 0. Where do we put this zero? Look at the LHS. Slowly: ip at this time points to x, and * does the dereference, so *ip is x. Since we are on the LHS, we take the address, not the contents, of x and hence we put 0 in the location whose address is that of x. That is, we put 0 in x.

The next line changes ip; it now points to z[0]. The next line sets z[0]=0;.

Pointers can be used without the dereferencing operator. The next line sets jp to ip. Since ip at this time points to z[0], jp does as well. Hence the next line sets z[0]=1;.