Next: Adding non-strict comparisons Up: Extensions and Consequences Previous: Extensions and Consequences

## An efficient implementation of solve_constraints

It is possible to implement algorithm solve_constraints somewhat more efficiently than the naive encoding of the above description. The key is to observe that the graph H of connected components does not have to be computed explicitly; it suffices to compute it implicitly using merge-find sets (union-find sets). Combining this with suitable back pointers from edges to constraints, we can formulate a more efficient version of the algorithm.

We use the following data structures and subroutines:

• Each node N of the cluster tree contains pointers to its parents and children; a field N.label, holding the integer label; a field N.symbols, holding the list of symbols in the leaves of N; and a field N.mfsets, holding a list of the connected components of the symbols in N. As described below, each connected component is implemented as an merge-find set (MFSET).
• An edge E in the graph over symbols contains its two endpoints, each of which is a symbol; a field E.shorts, a list of the constraints in which E appears as a short; and a field E.longs, a list of the constraints in which E appears as a long.
• A constraint C has two fields, C.short and C.long, both of them edges. It also has pointers into the lists C.short.shorts and C.long.longs, enabling C to be removed in constant time from the constraint lists associated with the individual edges.
• We will use the disjoint-set forest implementation of MFSETs (Cormen, Leiserson, and Rivest, 1990, p. 448) with merging smaller sets into larger and path-compression. Thus, each MFSET is a upward-pointing tree of symbols, each node of the tree being a symbol. The tree as a whole is represented by the symbol at the root. A symbol A has then the following fields:
• A.parent is a pointer to the parent in the MFSET tree.
• A.cluster_leaf is a pointer to the leaf in the cluster tree containing A.
• If A is the root of the MFSET then A.size holds the size of the MFSET.
• If A is the root of the MFSET, then A.symbols holds all the elements of the MFSET.
• If A is the root of the MFSET then A.leaf_ptr holds a pointer to the pointer to A in N.mfsets where N = A.cluster_leaf.

We can now describe the algorithm.

```
procedure solve_constraints1(
in S: a system of constraints of the form od(a,b) << od(c,d)).
return either a cluster tree T satisfying S if S is consistent;
or false if S is inconsistent.

variables: m is an integer;
a,b are symbols;
C is a constraint in S;
H is an undirected graph;
E,F are edges;
P is an MFSET;
N,M are nodes of T;

0. begin if S contains any constraint of the form, ``od(a,b) << od(c,c)'' then return false;
1.       H := emptyset;
2.       for each constraint C in S with short E and long Fdo
3.           add E and F to H;
4.           add C to E.shorts and to F.longs endfor;
5.       m := the number of variables in S;
6. 	 initialize T to contain the root N;
7. 	 N.symbols := the variables in S;

8.	 repeat for each leaf N of T, INITIALIZE_MFSETS(N);
9. 		for each edge E=ab in H do
10.   		    if E.shorts is non-empty and FIND(a) <> FIND(b)then
11. 		      MERGE(FIND(a), FIND(b)) endif endfor
12. 		if every edge E=ab in H satisfies FIND(a) = FIND(b)
13. 	 	    then return(false) endif
14. 		for each current leaf N of T do
15. 		    if N.mfsets has more than one element then
16. 		      for each mfset P in N.mfsets do
17. 	 		  construct node M as a new child of N in T;
18. 		 	  M.symbols:= P.symbols;
19. 		      endfor endif endfor
20. 		for each edge E = ab in H do
21. 		    if FIND(a) <> FIND(b) then
22. 		      for each constraint C in E.longs do
23. 		 	  delete C from S;
24. 		 	  delete C from E.longs;
25. 		 	  delete C from C.short.shorts endfor
26. 		      delete E from H endif endfor
27. 		m := m-1;
28.		until S is empty;

29.      for each leaf N of T
30. 		N.label := 0;
31.		if N.symbols has more than one symbol
32. 		    then create a leaf of N with label 0for each symbol in N.symbols;
33. 		    endif endfor end solve_constraints1.

procedure INITIALIZE_MFSETS(N : node)
var A : symbol;
N.mfsets := emptyset;
for A in N.symbols do
A.parent := null;
A.cluster_leaf := N;
A.symbols := {A};
A.size := 1;
N.mfsets := cons(A,N.mfsets);
A.leaf_ptr := N.mfsets;
endfor  end INITIALIZE_MFSETS.

procedure MERGE(in A,B : symbol)
if A.size > B.size then swap(A,B);
A.parent := B;
B.size := B.size + A.size;
B.symbols := B.symbols union A.symbols;
Using A.leaf_ptr, delete A from A.cluster_leaf.mfsets;
end MERGE.

procedure FIND(in A : symbol) return symbol;
var R : symbol;
if A.parent = null then return A
else R := FIND(A.parent);
A.parent := R;    /* Path compression */
return(R)
end FIND.
```

Let n be the number of symbols in S; let e be the number of edges; and let s be the number of constraints. Note that n/2 <= e <= n(n-1)/2 and that e/2 <= s <= e(e-1)/2. The running time of solve_constraints1 can be computed as follows. As each iteration of the main loop 8-28 splits at least one of the connected components of H, there can be at most n-1 iterations. The MERGE-FIND operations in the for loop 9-11 take together time at most O(max (n a(n), e)) where a(n) is the inverse Ackermann's function. Each iteration of the inner for loop lines 16-18 creates one node M of the tree. Therefore, there are only O(n) iterations of this loop over the entire algorithm. Lines 14, 15 of the outer for loop require at most n iterations in each iteration of the main loop. The for loop 22-26 is executed exactly once in the course of the entire execution of the algorithm for each constraint C, and hence takes at most time O(s) over the entire algorithm. Steps 20-21 require time O(e) in each iteration of the main loop. It is easily verified that the remaining operations in the algorithm take no more time than these. Hence the overall running time is O(max (n2a(n), ne, s)).

Next: Adding non-strict comparisons Up: Extensions and Consequences Previous: Extensions and Consequences