Computer Systems Org I - Prof. Grishman

Lecture 19 - Nov. 8, 2005

Memory allocation: malloc

(Note: the lectures in class used 'gettoken', which stopped at a blank, instead of getline, which stops at a newline, but the rest is identical; see the Programs directory for the actual programs written in class.)

In our last class, after getting getline to work, we were more ambitious and tried to read and write two lines with the following main program ...

int main () {
        char* p;
        char* q;
        p = getline();
        q = getline();
        printf ("First line is %s\n", p);
        printf ("Second line is %s\n", q);
}

However, when we typed in 'moo' followed by 'quack' it responded with two 'quack's.  We just had a single array 'line' which was being used for both lines.
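
The effect can be reproduced in a few lines.  The sketch below is only an illustration: store_shared is a hypothetical stand-in for the earlier getline (it takes its text as an argument instead of reading the keyboard), and it assumes, as described above, that every call writes into the same array 'line' and returns a pointer to it.

```c
#include <stdio.h>
#include <string.h>

/* One shared buffer, standing in for the single 'line' array. */
static char line[20];

/* Hypothetical stand-in for the buggy getline: copy s into the shared
   buffer and return a pointer to that buffer. */
char *store_shared(const char *s) {
        strncpy(line, s, sizeof line - 1);
        line[sizeof line - 1] = '\0';   /* make sure the string is terminated */
        return line;                    /* every caller gets the same address */
}
```

If we call p = store_shared("moo") and then q = store_shared("quack"), p and q hold the same address, so printing either one shows whatever was stored last.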

To fix this, we copy each string to the heap as it is read in.  We use the malloc function to get space on the heap.  This led to the following version (which worked):

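#include <stdio.h>
#include <stdlib.h>     /* malloc */
#include <string.h>     /* strcpy */

/* On POSIX systems stdio.h may declare its own getline;
   rename this function if the compiler reports a conflict. */
char* getline();


int main () {
        char* p;
        char* q;
        p = getline();
        q = getline();
        printf ("First line is %s\n", p);
        printf ("Second line is %s\n", q);
}

char* getline () {
        char c;
        char* m;
        char line[20];          /* temporary buffer for the line being read */
        int i = 0;
        while (i<19 && (c = getchar()) != '\n')
                line[i++] = c;
        line[i] = 0;            /* terminate the string */
        m = malloc(i+1);        /* i characters plus the terminating 0 */
        strcpy (m, line);
        return m;
}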

For more info on malloc, see Patt and Patel sec. 19.4 (formal definition, page 613).
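
One detail the program above skips: malloc returns NULL when it cannot supply the requested space.  A minimal sketch of the copy step with that check added might look like the following.  The name copy_to_heap is hypothetical (it is not one of the programs written in class), and what the caller should do on failure is left open.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy the string s to a newly allocated block on
   the heap.  Returns NULL if malloc cannot supply the space. */
char *copy_to_heap(const char *s) {
        size_t n = strlen(s) + 1;       /* +1 for the terminating 0 */
        char *m = malloc(n);
        if (m == NULL)
                return NULL;            /* out of heap space */
        memcpy(m, s, n);                /* copy the characters and the 0 */
        return m;
}
```

Because each call returns a separate block, the buffer used for reading can be overwritten by the next line without disturbing the copies made earlier.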

Arrays of strings

For our final version, we want to be able to read and then print multiple lines, up to an EOF.  We modify getline to return NULL (a constant defined in stdio.h) if we read an EOF, and modify the main procedure to call getline repeatedly until a NULL is returned.  The lines are stored in p, which is now an array of strings:

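#include <stdio.h>
#include <stdlib.h>     /* malloc */
#include <string.h>     /* strcpy */

/* On POSIX systems stdio.h may declare its own getline;
   rename this function if the compiler reports a conflict. */
char* getline();

char line[20];

int main () {
        char* p[10];
        char* r;
        int i = 0;
        int j;
        while (i < 10 && (r = getline()) != NULL)
                p[i++] = r;
        for (j = 0; j < i; j++)
                printf ("Line %d is %s\n", j, p[j]);
}

/* read next line, return NULL if no more characters */

char* getline () {
        int c = 0;      /* int, not char: getchar returns an int so that
                           EOF can be told apart from every character */
        char* m;
        int i = 0;
        while (i<19 && (c = getchar()) != '\n' && c != EOF)
                line[i++] = c;
        if (i == 0 && c == EOF)
                return NULL;
        line[i] = 0;
        m = malloc(i+1);
        strcpy (m, line);
        return m;
}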

Managing the heap

In a large program, we may be reading in lines, processing the lines, and then go on to do something else and no longer need those lines of input.  In such a case, we may want to free up the space occupied by those lines so that it can be used for something else.  C lets us do so with the free function (described in Patt and Patel p. 614 and discussed on p. 508).  free takes one argument, which should be a pointer to a block in the heap previously allocated by malloc.  free adds that block to the free (unallocated) part of the heap, so that that space can be reused by a subsequent call to malloc.  (The freed-up blocks are typically kept by the system as a linked list in the heap.)
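
As a sketch of how this might be applied to the array of strings from the previous section: once the lines have been printed, each block obtained from malloc can be handed back with free.  The helper name free_lines and its return value are assumptions made for illustration only.

```c
#include <stdlib.h>

/* Hypothetical helper: free the first n strings stored in p.
   Returns the number of blocks handed back to the heap.
   After the call the pointers in p must not be used again. */
int free_lines(char *p[], int n) {
        int j;
        int freed = 0;
        for (j = 0; j < n; j++) {
                if (p[j] != NULL) {
                        free(p[j]);     /* block came from malloc */
                        freed++;
                }
        }
        return freed;
}
```

In the program above, the call would be free_lines(p, i), placed after the loop that prints the lines.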

In this way, C places full responsibility on the programmer for managing the heap.  If the programmer makes a mistake and frees up a block and then subsequently tries to reference the freed block, the result will be unpredictable (it will depend on whether the space has been reallocated) and the bug may be hard to trace.
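
One common defensive habit, shown here only as a sketch, is to set a pointer to NULL as soon as its block has been freed, so that a later test against NULL can catch the mistake.  The helper name release is hypothetical.  This protects only the one pointer variable passed in; any other copies of the same address are still dangling.

```c
#include <stdlib.h>

/* Hypothetical helper: free the block that *pp points to, then set
   *pp to NULL so the stale address cannot be used by accident. */
void release(char **pp) {
        free(*pp);      /* free(NULL) is defined to do nothing */
        *pp = NULL;
}
```

After release(&p), the test p == NULL is true, and calling release(&p) a second time is harmless.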

Java takes a different approach to heap management:  it is done by the system through garbage collection.  When an object is created, it is allocated space on the heap.  When Java runs out of space, it goes through each object on the heap and determines whether there are any variables pointing to that object.  If there aren't, that object can no longer be referenced, so it is safe to free the space occupied by that object.  Garbage collection is safer (you can't accidentally free up something you will later use) and simpler (you don't have to call free) but it can add significant overhead (the time spent garbage collecting).