Real-time Simulation of SETL2 Cycle Decision Procedure On A RAM

The SETL2 cycle decision procedure below is defined in terms of primitive SETL2 operations in which each variable identifier has only one scope (which is achieved by variable renaming), and all set and map update operations can be performed destructively.

--
-- graph e is a binary edge relation
--
program fastcycle;
read(e);	-- read the set e of edges
refcount := {};	-- code to establish invariant refcount
for x1 in domain e loop
  for c1 in e{x1} loop
    if refcount(c1) = om then
	refcount(c1) := 1;
    else
	refcount(c1) +:= 1;
    end if;
  end loop;
end loop;
zeroset := {};
for x2 in domain e loop
  if x2 notin domain refcount then
    zeroset with:= x2;  -- refcount(x2) = om <-> refcount(x2) = 0
  end if;
end loop;
while exists u in zeroset | true loop
  for c2 in e{u} loop	   -- code to reestablish refcount and zeroset
	if c2 in domain e and refcount(c2) = 1 then
		  zeroset with:= c2;
	end if;
	refcount(c2) -:= 1;
  end loop;
  zeroset less := u;	    -- reestablish zeroset
  e(u) := om;
end loop;
print(domain e /= {});
end;

A method is sketched for simulating each primitive set and map operation in realtime on a RAM (i.e., each set operation is implemented in unit worst case time on a RAM). The method will only work on a highly restrictive class of SETL2 programs that read all of their input at the beginning of execution. To use the method on any program outside of this class, we need to transform that program into an equivalent program inside the class.

The simulation method depends on the assumption that each data value in a SETL2 program must belong to a set denoted by a type. For convenience, we require the much stronger restriction that each distinct program variable v must store values in a set denoted by some type tau throughout the execution of the program. Such a program invariant is specified by a type assignment v: tau.

Consider the following simple type system, where each type TAU represents a set val(TAU). Type INT stands for the set of integers. If TAU is a type, then type SET(TAU) represents the set of finite subsets of val(TAU). If TAU1 and TAU2 are types, then smap(TAU1,TAU2) and mmap(TAU1,TAU2) represent the set of finite partial single-valued (resp. multi-valued) maps from domain val(TAU1) to range val(TAU2). Variables B are called base types, each of which is uniquely associated with a subtype constraint of the form B < TAU (where TAU is any type except for a base type), which indicates that val(B) is a subset of val(TAU).

For any program, each of its variables v must have a type assignment of the form v: TAU, and every base type B must be associated with a subtype constraint of the form B < TAU1. The subtype constraints for a given program are restricted so that the graph

{[B1,B2]: B1 < TAU2 is a subtype constraint and base type B2 occurs in TAU2}

is acyclic. Later, we will see how to calculate val(B) for any base type B as a finite set of values obtained from the input and from program constants.

Besides representing specific sets, each type is also used to model specific data structures needed to implement set and map operations efficiently. For this purpose, we augment the type system slightly by allowing a base type B to appear in two forms: (1) (weak) B, and (2) (strong) B-strong. Three kinds of data structures are considered - unbased, weakly based, and strongly based - see figures for unbased, weakly based, and strongly based representations(best viewed in printed form, because the ghostview images of these files are not accurate).

If TAU and TAU1 are any types except for base variables, and if TAU2 is any type except for strong base variables, then unbased sets are denoted by type set(TAU), and unbased single- and multi- valued maps are denoted by smap(TAU1,TAU2) and mmap(TAU1,TAU2) respectively (where TAU1 is the domain element type and TAU2 is the range element type). If B is a base type with subtype constraint B < TAU, then weakly based sets are denoted by set(B), and weakly based maps are denoted by smap(B,TAU2) and mmap(B,TAU2). Strongly based sets are denoted by set(B-strong); strongly based maps are denoted by smap(B-strong,TAU2) and mmap(B-strong,TAU2). Strongly based types are restricted to program variables, and can appear only in type assignments (and not in subtype constraints).

One final restriction must be made so that the sets represented by base types only contain values obtained directly from the input. First we itemize every program variable (and the domain and range of every map) together with the operations on it. These operations are categorized into INITIALIZE (for assigning input or constants to variables), RETRIEVE, ACCESS (or SELFACCESS in the case where the search argument is always retrieved from the same set that is subsequently accessed), ADD, and DELETE. We represent these operations as edges that appear as comments just after each set operation in the program below.

x ACCESS> y means that x is used to access set y; e.g. x in y

x SELFACCESS> y is a simplified form of access that requires no search, because x was originally retrieved from y

y RETRIEVE> x means that x is retrieved from set y; e.g., exists x in y

x ADD> y means that x is added to set y; e.g., y with:= x

x DELETE> y means that x is delected from y; e.g., y less:= x

x INITIALIZE> y means that y is assigned to x, where x is either input or a program constant; e.g., read(y), where x = INPUT

Sufficient conditions for real-time simulation are given by the following rules:

  1. (ACCESS) For each edge of the form x ACCESS> y, the element type of y is B-strong for some finite set B. This rule supports unit time associative access.
  2. (COMPATIBILITY) For every edge of the form x ACCESS> y, x ADD> y, or x DELETE> y, where the element type of y is B (or B-strong), then x is of type B; for every edge of the form y RETRIEVE> x, where x is of type B, then the element type of y must also be of type B. This rule is a consequence of rule 1.
  3. (SIMPLICITY) Any based representation that isn't the strong representation is the weak representation. This rule prefers the simpler weakly based structure to the strongly based one.
  4. (DESTRUCTIVE UPDATE) If there are operations of either form y ADD> x or y DELETE> x, then there can be no operation of the form x ADD> w, x DELETE> w, or x ACCESS> w. This rule allows destructive updates without violating copy/value semantics.
  5. (STATIC BASE) No value with base type can result from any arithmetic operation. Together with the other rules, this rule ensures that base types represent finite sets obtained directly from the input and from program constants.

When all of the preceding conditions are satisfied, then for each base type B, val(B) is defined to be the smallest set that satisfies the program type assignments (restricted to input variables), the program subtype constraints, and the defined correspondences between types and the sets they denote.


For example, let us calculate the sets represented by base types A and I in a program with type assignments

   X:set(A), F: mmap(I,A)
and subtype constraints
  A < INT, I < set(A)
From the type assignments we get the constraints,
1. {X} subset val(set(A)), and
2. {F} subset val(mmap(I,A))
where X and F are the runtime input values. From the subtype constraints we obtain two more constraints,
3.  val(A) subset val(INT)
4.  val(I) subset val(set(A))
From the definition of val, val(set(A)) = {S: S subset val(A) | A is finite}, we can replace constraint (1.) by the simpler constraint,
1.  X subset val(A)
Likewise, constraint (2.) can be replaced by two constraints
2a. domain F subset val(I), and
2b. range F subset val(A)
Finally, constraint (4.) can be simplified to,
4.  +/val(I) subset val(A)
where +/val(I) computes the union of all the set valued elements of val(I).

The solution is now clear. The smallest value of val(I) is domain F at input time; the smallest value of val(A) is the union of range F, X, and +/(domain F) for X and F at input time.


Below we apply the method to simulate a low level SETL2 cycle testing program in real-time.

program fastcycle;
read(e);
--input initialize> e (domain e, range e)
refcount := {};
--{} initialize> refcount (domain refcount, range refcount)
for x1 in domain e loop
--domain e retrieve> x1
  for c1 in e{x1} loop
--x1 selfaccess> domain e
--range e retrieve> c1
    if refcount(c1) = om then
--c1 access> domain refcount
	refcount(c1) := 1;
--c1 selfaccess> domain refcount
--c1 add> domain refcount
    else
	refcount(c1) +:= 1;
--c1 selfaccess> domain refcount
    end if;
  end loop;
end loop;
zeroset := {};
--{} initialize> zeroset
for x2 in domain e loop
--domain e retrieve> x2
  if x2 notin domain refcount then
--x2 access> domain refcount
    zeroset with:= x2;
--x2 add> zeroset
  end if;
end loop;
while exists u in zeroset | true loop
--zeroset retrieve> u
  for c2 in e{u} loop	
--u access> domain e
--range e retrieve> c2
	if c2 in domain e and refcount(c2) = 1 then
--c2 access> domain e
--c2 access> domain refcount
		  zeroset with:= c2;
--c2 add> zeroset
	end if;
	refcount(c2) -:= 1;
--c2 selfaccess> domain refcount
  end loop;
  zeroset less := u;	
--u selfaccess> zeroset
--u delete> zeroset
  e(u) := om;
--u selfaccess> domain e
--u delete> domain e
end loop;
print(domain e /= {});
end;
With the following type assignments,

zerocount: set(B-strong)
e: map(B-strong,B)
refcount: map(B-strong,int)
x1,x2,c1,c2,u: B

the subtype constraint

B < INT   (which means that nodes are represented by integers)
the preceding program satisfies the five sufficient conditions for realtime simulation, and B = domain e U range e.

More efficient data structures than the general ones modeled by our type system can be obtained by removing linked structures that are never used by the program. This approach yields the following data structures, which are drawn here.

zeroset is a one-way list with a first but not a last finger;

domain refcount will not be a linked list;

domain e is a doubly linked list, and each image e{x} is a one way linked list for x in domain e; each of the lists in e has a first but not a last finger

HINT: In your homework, linked lists will have to be represented in terms of arrays (dynamically allocated but static tuples).