next up previous
Next: About this document

Programming Assignment 2 Computer Science 102

Spring 1996

First draft due: Fri, March, 15, 9:00 pm; but you should have done much of the work by the midterm, March 10.


Your project is to design a program to implement string operations. Recall that in Turbo Pascal there is already a string type. A string is an array of 255 characters, plus a length field that indicates how many characters are actually being used. There are two problems with this built-in implementation. First, you are prevented from having a string with more than 255 characters. Second, even if your string is very short, depending on your system, the system may still allocate space for 255 characters. Turbo Pascal allocates space for only the amount of characters in the string.

Your project is to implement an alternative string type, called string/_type, that avoids these two problems.

Your string type will use a linked list instead of an array. For example, to represent the string 'compute', you can use the following linked list.

Your project divides into two parts: a StrngADT unit that will >implement several string operations, and a main program that will use this unit to implement a calculator for strings.

The StrngADT unit

In the unit StrngADT, you will implement an abstract data type, called string_type, that will allow several useful string operations. The interface for this unit is given below.

unit StrngADT;


  pointer_type = ^node_type;

  node_type = 
      info : char;
      next : pointer_type

  string_type = 

procedure Concat(string1, string2 : string_type; 
                 var string3 : string_type);
{Precondition: None.
 Postcondition: string3 is the concatenation of string1 and string2;
                the nodes of string3 are separate from the nodes 
                of string1 and string2.}

procedure Copy(string1 : string_type; var string2 : string_type);
{Precondition: None.
 Postcondition: string2 is a copy of string1.  The nodes of string2
                are separate from the nodes of string1.}

procedure Repeats(string1 : string_type; n : integer; 
                 var string2 : string_type);
{Precondition: n >= 0.
 Postcondition: string2 is the concatenation of n copies of string1.  
 If n = 0, then string2 is the empty string represented by a linked list
 consisting of only the NIL pointer.}

procedure Assign(string1 : string; var string2 : string_type);
{Precondition: None.  (Notice that string1 is a Pascal string.)
 Postcondition: string2 represents the same string as string1.
 Use ord(string1[0] ) to obtain the string length since you will be
 using the function length as a method in this object.}

procedure Substring(string1 : string_type; x, y : integer; 
                    var string2 : string_type);
{Precondition: 1 <= x <= y <= length(string1).
 Postcondition: string2 is the substring of string1 that goes from 
                the xth character to the yth character.}

procedure delete(var string1 : string_type);
{Precondition: string1 is defined.
 Postcondition: All the nodes of string1 are returned to the heap
                and string1.head is assigned the nil pointer.

 Do not use delete in cases where x appears on both sides of the
 ``:='', like, x := x + y, only for cases like x := 'nyu' and then x
 is reassigned in for instance, x := 'hello}

function Length(string1 : string_type) : integer;
{Precondition: None.
 Postcondition: Returns the number of characters of string1.}

procedure Print(string1 : string_type);
{Precondition: None.
 Postcondition: Displays the characters of string1 on the terminal. 
                The display will have up to 75 characters per line.} 
        head : pointer_type;

The Main Program

Your main program will be a calculator of strings. In other words, the user will use a file with a series of string commands. After each command, your program will parse the command and then print out the result of the command. Each command will have one of the following forms:

  1. x := y + z

  2. x := y

  3. x := 17 * y

  4. x := 'Hello'

  5. x := y [10..20]

  6. length x

  7. x

  8. quit

Here x, y, and z represent string variables (possibly the same variables). The only variable names allowed will be a single character, either a lower-case letter (' a' to ' z') or an upper-case letter (' A' to ' Z'). As in Pascal, an upper-case letter refers to the same variable as its lower-case counterpart.

Each of the tokens in the input line will be separated by one or more blanks. Your lexical analyser should use this fact to separate the tokens,

Command 1 assigns x the concatenation of strings y and z.

Command 2 copies y into x.

Command 3 assigns x the concatenation of 17 copies of y. In place of 17, any integer between 0 and maxint may be used.

Command 4 assigns x the string 'Hello'. In place of 'Hello', and string of up to 75 characters may be used.

Command 5 assigns x the substring of y going from the 10th character to the 20th character. In place of 10 and 20, any integers between 1 and maxint may be used.

In all of the assignment commands above, your program should display the new value of x, in addition to performing the assignment.

Command 6 outputs the length of string x.

Command 7 outputs the value of string x.

Command 8 stops the program.

If the input line is not in one of the formats above, your program should print an error message, and then skip that input line. Similarly, if one of the variables on the right side of an assignment has not been initialized, your program should complain and skip that line. Use a symbol table to keep track of those variables assigned values. Also, the actual string type parameters should be elements of an array of string type.

Discussion and explanation
Project 2 requires you to write a string interpreter that will execute an input stream of string operations. For instance, if the input is:

        x := 'nyu'
        y:= 'hello'
        z := y + x
where z is the instruction to print the string stored in z, then the result would be "hellonyu". A variable in the input can only be a single letter. The interpreter is not case sensitive, so that "z" and "Z" represent the same variable.

For draft 1, you are required to first write an object called string_type which contains as methods the usual string operations: assignment (x := 'nyu'), concatenation (z := x + y), multiplication (w := 12 * z), substring (w := x [2..5]), copying (u := x), printing (x), deleting and quiting (quit). These operations are to be implemented with linked lists using dynamic pointers, so the object also contains a delete method to return the nodes of a reassigned string to the heap. The private field of the object is a pointer.

An example of a method heading is PROCEDURE ASSIGNMENT( ST1 :STRING; ST2: STRING_TYPE). So if ST1 is 'nyu', ST2 is the pointer to the first node of the three node linked list containing 'n', 'y', and 'u'. The object is defined and implemented in a unit strnADT. There is a program test1.pas (available on the homepage) that checks your strnADT.

Draft 2 requires the writing of the interpreter. I have written the scanner for you, and the initial stage of the parser (again, these are available on the homepage in analyse.pas). The scanner is elementary because the tokens in the input must be separated by at least one blank. The parser indicates which procedure further parses a statement. The parsing up to this point is done simply by determining how many tokens appear in the input. Thus x := 'hello' and x := y are parsed such that the same procedure is executed. It is up to you to further parse them by distinguishing between an assignment and a copy statement.

Use a symbol table to check whether a variable that must have a value in any operation, has in fact been assigned a value. This in x := y, you would check whether y has been assigned a value. Do this by using a type symbol = array['A'..'Z'] of boolean. If the input contains an undefined variable, the interpreter must print an error message.

How do you associate a variable with the pointer to the linked list? The natural way is to have an array of the object, i.e., table_type = array ['A'..'Z'] of string_type. Assume that Table : table_type and if x := 'nyu' and y := 'hello'; and Px points to the linked list for x; and Py, for y, then Table['X'] contains Px and Table['Y'] contains Py. This also facilitates deleting linked lists for redefined variable.

How is this implemented? For instance in x := 'nyu', stream[1] is 'x', stream[2] is ':=' and stream[3] is the string 'nyu'. First you must strip the quotes from 'nyu'. The quotes are the first and last elements of stream[3]. These are referred to as stream[3][1] and stream[3][5] respectively. The subscript of Table is a character, so if you wrote Table[ upcase( stream[1] ) there would be a type incompatability between the subscript stream[1] (a string) and the type that the subscript should be, a character. Thus you must write Table[ upcase( stream[1][1] ). This is second parameter of method ASSIGN. Remember that the header for the method is: PROCEDURE ASSIGN( ST1 :STRING; ST2: STRING_TYPE). This will result in the pointer for the linked list that stores 'nyu' being the contents of the element 'X' of Table. The data for draft 2 is also on the home page.