Lecture 14: Set Operations. Prefix and postfix expressions: 4/8

Set operations are not in DJW. Neither is prefix. Postfix is discussed in section 3.8, p. 214.

Set operations

Data Structures

Most of these we have already discussed; for the rest, the only techniques are the obvious ones.

Bit vectors

If the sets are all subsets of a fixed universal set U, then use a bit vector of length |U|. For instance, if U is the set of months then |U|=12.
The vector [1,0,1,1,0,0,0,0,0,0,0,0] represents the set { January, March, April }.
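The vector [0,0,1,0,0,0,1,1,0,0,0,1] represents the set { March, July, August, December }.
This is extremely efficient when the sets you are manipulating are comparable in size to |U|: each set takes only |U| bits, and union, intersection, and set difference are bitwise operations on the vectors.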
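To make the bit-vector idea concrete, here is a rough sketch in Python that packs the vector into a single integer, so that each set operation is one bitwise operation. The names MONTHS, to_bits, and from_bits are made up for this illustration and are not from the lecture.

```python
MONTHS = ["January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"]

def to_bits(months):
    """Pack a collection of month names into an int; bit i stands for MONTHS[i]."""
    bits = 0
    for m in months:
        bits |= 1 << MONTHS.index(m)
    return bits

def from_bits(bits):
    """Unpack an int bit vector back into a list of month names."""
    return [m for i, m in enumerate(MONTHS) if bits & (1 << i)]

a = to_bits(["January", "March", "April"])
b = to_bits(["March", "July", "August", "December"])

intersection = a & b             # elements in both sets
union = a | b                    # elements in either set
difference = a & ~b              # elements in a but not in b
a_subset_of_b = (a & ~b) == 0    # true when nothing in a is missing from b

print(from_bits(intersection))   # ['March']
```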

Intersect, union, subset, set difference for unordered lists

We'll use headed linked lists throughout, so that we can easily represent the null list.

Double loop

I illustrate with intersect; the others are similar.
List intersect(a,b) {
   c = new List();   // null list
   for (x in a)
      for (y in b)
          if (y.value == x.value) {
            add x.value to c;
            break; // exit inner loop
          }
   return c;
}
Running time: Each iteration of the outer loop runs the inner loop once, which takes time O(|b|) in the worst case, and there are |a| iterations of the outer loop. Worst case running time is therefore O(|a|*|b|) (quadratic).
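For comparison, here is how the other operations might look with the same double loop. This is only a sketch in Python, using plain Python lists in place of headed linked lists and assuming the lists contain no duplicates; the function names are made up for illustration.

```python
def contains(b, value):
    """The inner loop: scan b for value, O(|b|) in the worst case."""
    for y in b:
        if y == value:
            return True
    return False

def intersect(a, b):
    """Elements of a that also appear in b."""
    return [x for x in a if contains(b, x)]

def difference(a, b):
    """Elements of a that do not appear in b."""
    return [x for x in a if not contains(b, x)]

def union(a, b):
    """Everything in a, plus the elements of b not already in a."""
    return list(a) + difference(b, a)

def is_subset(a, b):
    """True if every element of a appears in b."""
    return all(contains(b, x) for x in a)
```

Each function calls contains once per element of one list, so all four are quadratic in the worst case, like intersect.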

With sorting

Sort the two lists and use the method for ordered lists below.

Running time: As we will see, fast sorting algorithms run in time O(n log(n)) and the time for the set operation on ordered lists is linear. Therefore this runs in time O(n log n), where n = max(|a|,|b|).

With a hash table

List intersect(a,b) {
   c = new List();   // null list
   h = new Hashtable();
   for (x in a)
      if (x is not in h) add x to h;
   for (y in b)
      if (y is in h) add y to c;
   return c;
}
Expected running time: O(|a| + |b|)
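A runnable version of the same idea, as a sketch: Python's built-in set is a hash table, so it stands in for the Hashtable above. Plain lists replace the linked lists, and the difference function is my own addition to show that the other operations follow the same pattern.

```python
def intersect(a, b):
    """Intersection of two duplicate-free lists in expected O(|a| + |b|) time."""
    h = set()              # Python's set is implemented as a hash table
    for x in a:
        h.add(x)
    c = []
    for y in b:
        if y in h:         # expected O(1) lookup
            c.append(y)
    return c

def difference(a, b):
    """Elements of a not in b: hash b, then filter a."""
    h = set(b)
    return [x for x in a if x not in h]

print(intersect([3, 1, 4], [4, 5, 3]))  # [4, 3]
```

The result follows the order of b, since b is the list being scanned in the second loop.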

Intersect, union, subset, set difference for ordered lists

I illustrate with intersect; the others are similar.

The "two finger" method for computing the intersection of sets A and B represented as ordered lists.

List intersect(a,b) {
   x = a.first();
   y = b.first();
   c = new List();
   while (x != null and y != null) {
       if (x.value == y.value) {
           add x.value to the end of c;
           x = x.next;
           y = y.next;
       }
       else if (x.value < y.value) x = x.next;
       else y = y.next;   // x.value > y.value
   }
   return c;
}
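Here is a sketch of the two-finger method in Python, with list indices i and j playing the role of the two fingers. It assumes both inputs are sorted and duplicate-free; the union version is my own addition, to show how one of the "similar" operations differs.

```python
def intersect_sorted(a, b):
    """Two-finger intersection of two sorted, duplicate-free lists."""
    i, j = 0, 0            # the two "fingers"
    c = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            c.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1         # a[i] cannot be in b: everything left in b is larger
        else:
            j += 1         # b[j] cannot be in a, by the same argument
    return c

def union_sorted(a, b):
    """Two-finger union; the result is again sorted and duplicate-free."""
    i, j = 0, 0
    c = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            c.append(a[i]); i += 1; j += 1
        elif a[i] < b[j]:
            c.append(a[i]); i += 1
        else:
            c.append(b[j]); j += 1
    c.extend(a[i:])        # at most one of these two tails is non-empty
    c.extend(b[j:])
    return c
```

Every pass through the loop advances at least one finger, so the running time is O(|a| + |b|). For unordered lists, calling intersect_sorted(sorted(a), sorted(b)) gives the "with sorting" method.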

Prefix and Postfix Expressions

If every operator has a fixed number of arguments, then any expression can be written unambiguously without parentheses in either prefix or postfix order. (Prefix is also known as "Polish notation" because it was invented by Jan Łukasiewicz; postfix is known as "reverse Polish notation".)

Prefix notation: Write the operator, then, recursively, each of the arguments in prefix notation.


The expression (2+x)*((1+y)/(x+y)) becomes

* + 2 x / + 1 y + x y
If function f(x) has one argument, g(x,y) has two, and h(x,y,z) has three, then the expression g(h(f(x),2,3),f(g(x,2))) becomes
g h f x 2 3 f g x 2
This is the same as the original expression with the parentheses and commas removed.

Postfix notation: Write each of the arguments, recursively, in postfix notation, then the operator.

The two examples become

2 x + 1 y + x y + / *
x f 2 3 h x 2 g f g

Expression tree to prefix/postfix

Traverse the tree in preorder (for prefix) or postorder (for postfix), printing each node as it is visited.
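As a sketch, the two traversals might look like this in Python. The Node class here is a made-up minimal tree node (a value plus an ordered list of children), not a class from the textbook.

```python
class Node:
    def __init__(self, value, children=()):
        self.value = value
        self.children = list(children)

def to_prefix(n):
    """Preorder: the node first, then each subtree left to right."""
    out = [n.value]
    for child in n.children:
        out.extend(to_prefix(child))
    return out

def to_postfix(n):
    """Postorder: each subtree left to right, then the node."""
    out = []
    for child in n.children:
        out.extend(to_postfix(child))
    out.append(n.value)
    return out

# The tree for (2+x)*((1+y)/(x+y))
tree = Node("*", [
    Node("+", [Node("2"), Node("x")]),
    Node("/", [Node("+", [Node("1"), Node("y")]),
               Node("+", [Node("x"), Node("y")])]),
])

print(" ".join(to_prefix(tree)))   # * + 2 x / + 1 y + x y
print(" ".join(to_postfix(tree)))  # 2 x + 1 y + x y + / *
```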

Evaluating postfix on a stack

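evaluatePostfix(expression E) {
   S = emptystack;
   for (symbol Y:E)
      if (Y is a number or variable) S.push(value of Y);
      else { // Y is an operator
         k = number of arguments of Y;
         for (i=k-1; i >= 0; i--)
            X[i] = S.pop();   // the last argument is on top of the stack
         V = apply Y to X[0] ... X[k-1];
         S.push(V);
      }
   return S.pop();
}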
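To evaluate prefix, do the same thing but scan E from right to left, and pop the arguments in forward order (X[0] first), since now the first argument is the one on top of the stack.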
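A runnable sketch of both evaluators in Python. The operator table OPS and the env dictionary of variable values are assumptions made for this example; tokens that are neither operators nor variables are treated as numbers.

```python
import operator

# Hypothetical operator table: symbol -> (number of arguments k, function).
OPS = {
    "+": (2, operator.add),
    "-": (2, operator.sub),
    "*": (2, operator.mul),
    "/": (2, operator.truediv),
    "neg": (1, operator.neg),
}

def evaluate_postfix(tokens, env=None):
    """Evaluate a list of postfix tokens; env maps variable names to values."""
    env = env or {}
    stack = []
    for y in tokens:
        if y in OPS:
            k, fn = OPS[y]
            args = [stack.pop() for _ in range(k)]
            args.reverse()             # the last argument was on top
            stack.append(fn(*args))
        elif y in env:
            stack.append(env[y])
        else:
            stack.append(float(y))
    return stack.pop()

def evaluate_prefix(tokens, env=None):
    """Scan right to left; the first argument is on top, so no reverse."""
    env = env or {}
    stack = []
    for y in reversed(tokens):
        if y in OPS:
            k, fn = OPS[y]
            args = [stack.pop() for _ in range(k)]
            stack.append(fn(*args))
        elif y in env:
            stack.append(env[y])
        else:
            stack.append(float(y))
    return stack.pop()

env = {"x": 3.0, "y": 1.0}
print(evaluate_postfix("2 x + 1 y + x y + / *".split(), env))  # 2.5
print(evaluate_prefix("* + 2 x / + 1 y + x y".split(), env))   # 2.5
```

With x = 3 and y = 1 the expression is (2+3)*((1+1)/(3+1)) = 5 * 0.5 = 2.5, and both notations give the same value.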

Prefix to expression tree

prefixToExpTree(S,K) takes as arguments an array of symbols S and an index K.

This reads the first complete expression off string S starting at K and returns (a) the root of a tree corresponding to that expression; (b) the starting index of the next expression in S. (Returning multiple values is a useful feature which is strangely rarely incorporated in programming languages.)

< Tree, int > prefixToExpTree(String S, int K) {
   N = new Node();
   N.value = S[K];
   K = K+1;   // move past the symbol just read
   if (N.value is an operator)
     for (i=0; i < N.value.numArguments; i++) {
         < T, K > = prefixToExpTree(S,K);
         make T the next child of N;
     }
   return < N,K >;
}
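Python is one language that does return multiple values (as a tuple), so the algorithm can be sketched almost line for line. The ARITY table below is an assumption for this example: it lists how many arguments each operator takes, and any symbol not in it is treated as an operand.

```python
# Hypothetical arity table: operator symbol -> number of arguments.
ARITY = {"+": 2, "*": 2, "/": 2, "f": 1, "g": 2, "h": 3}

class Node:
    def __init__(self, value):
        self.value = value
        self.children = []

def prefix_to_exp_tree(s, k):
    """Read one complete prefix expression from s starting at index k.

    Returns (root of the tree, index of the first symbol after the expression).
    """
    n = Node(s[k])
    k += 1                                   # move past the symbol just read
    for _ in range(ARITY.get(n.value, 0)):   # operands take no arguments
        child, k = prefix_to_exp_tree(s, k)
        n.children.append(child)
    return n, k

def show(n):
    """Render a tree with parentheses and commas, to check the parse."""
    if not n.children:
        return n.value
    return n.value + "(" + ",".join(show(c) for c in n.children) + ")"

tokens = "g h f x 2 3 f g x 2".split()
root, end = prefix_to_exp_tree(tokens, 0)
print(show(root), end)  # g(h(f(x),2,3),f(g(x,2))) 10
```

The returned index 10 equals the number of tokens, which confirms that the whole input was one complete expression.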