# SchedMatch: An efficient algorithm to match schedules and other partial orders

Dennis Shasha
Courant Institute of Mathematical Sciences
Department of Computer Science
New York University
shasha@cs.nyu.edu
http://cs.nyu.edu/cs/faculty/shasha/index.html

Jason Wang
Department of Computer Science
New Jersey Institute of Technology
jason@cis.njit.edu

Kaizhong Zhang
Department of Computer Science
University of Western Ontario
kzhang@csd.uwo.ca

# Motivation

Suppose you have two schedules S1 and S2. Each is represented by a set of dependency edges with redundant transitive edges removed. (Note however that if there are edges from task A to task B and from B to C, there may still be an edge from A to C if that edge presents a separate constraint from what is implied by the path through B.) The software assumes that each schedule by itself is acyclic (i.e. if there is a dependency from A to B either directly or transitively, then there is no dependency from B to A).

A schedule is a special case of a partial order. This software applies to all partial orders, but we continue to use the word schedule for the sake of concreteness. Other possible applications include ontology matching and in general hierarchy comparison.

The problem is to reconcile the two schedules. Our definition of reconcilation is to find as few edges as possible to remove from either S1 or S2 to render the two schedules consistent. Informally, consistency means that if one schedule insists on ordering task A before B then the other schedule does not insist on ordering B before A. Formally, consistency means that there is some total order of the tasks that both can agree on. An equivalent formal definition is that two schedules are consistent if combining them introduces no cycles.

This problem was inspired by listening to the presentation of "Behavioral matchmaking for service retrieval: application to conversation protocols," by Juan Carlos Corrales, Daniela Grigori, Mokrane Bouzeghoub and published in BDA 2006. The authors use a different matching criterion.

Whereas this software is a form of graph matching, it makes strong use of transitivity. If you are interested in graph comparison without that assumption, please see our package for graph comparison.

You can also find a software package (GraphGrep) to search for a query graph in a database of graphs here.

# Installation

The approximate schedule matcher runs in a high performance interpreted environment called K.

• To begin with, therefore, please download trial K from here for a sun version. If you are on a PC, then download k.exe and k20.dll . (if this doesn't work, then go to k20dll and then rename the file to k20.dll) If you're on, linux then download the linux version . K and our program run equally well on linux and windows. The schedule matcher, our sample file, and all the K files fit comfortably in 0.5 megabytes of disk space.
• schedmatch.k -- the software
• schedin -- a sample database of two schedules, also the name of the input file.
• You can then run the program by typing
k schedmatch [-v] [SCHEDNAME] , where SCHEDNAME is an optional argument denoting one of the schedules of the input file schedin. (Schedule names in schedin should consist of numbers and letters only.)
• The output will give you: a consistent total ordering resulting from the combined scehdules after a set of dependencies D are removed followed by a list of those dependencies D. The set D is the smallest set found in our heuristic algorithm. In addition, there will be a file called schedout which will contain the combined graph in the same format as the input.

Often, having the option to remove dependencies from either schedule will reduce the total number of dependencies it is necessary to remove. For example suppose S1 has edges from A to B, B to C, A to D, and D to C, whereas S2 has an edge from C to A. If one restricts removals to come from S1 as in the call k schedmatch S1, then one will have to remove two edges from S1 to get compatibility (e.g. A to B and A to D). By contrast, if one calls k schedmatch then one needs to remove only the C to A edge from S2.

• The optional -v flag will give you most results in which a minimal number of dependencies are removed.

# Semantics of the Input and of the Result

The dependency distance is defined as the minimum number of dependency edges to remove from one or two schedules to make them consistent. For example, consider the following two schedules denoted bob and alice. (Recall that schedule names should consist of alphabetic characters and numbers only.)

```bob|A|B
bob|B|C
bob|C|D
bob|B|E
bob|E|D
bob|A|D
bob|F|D
alice|A|B
alice|B|C
alice|D|C
alice|B|E
alice|D|E
alice|G|D
```
Schedule bob requires event A to precede B, B to precede C, and so on up to F precedes D. Schedule alice requires A to precede B, B to precede C, and so on up to G precedes D. These are in the file schedin. We will illustrate the result from four different calls
• k schedmatch
```Non-verbose mode gives just one solution. Add -v to commandline for verbose.
Combined graph can be found in schedout
Consistent ordering: ABFGEDC
Deleted edges: C D bob|D E alice
```

This means that by removing the C to D dependency from Bob and the D to E dependency from Alice, we get at least one consistent ordering of tasks. An example is given: ABFGEDC.

The file schedout has the combined graph with these edges removed:

```Consistent ordering: ABFGEDC
Deleted edges: C D bob|D E alice
bob_alice|A|B
bob_alice|B|C
bob_alice|B|E
bob|E|D
bob|A|D
bob|F|D
alice|D|C
alice|G|D
```
The label on each edge in the combined graph says where the edge came from.
• k schedmatch -v
```Verbose output
Combined graphs can be found in schedout
Consistent ordering: FGABECD
Deleted edges: D C alice|D E alice
Consistent ordering: AGBFCDE
Deleted edges: D C alice|E D bob
Consistent ordering: FGABDCE
Deleted edges: C D bob|E D bob
Consistent ordering: AGFBEDC
Deleted edges: C D bob|D E alice
```

This gives different alternative edge removals that all consist of the smallest number of edges that need to be removed, as far as we have found. This is a heuristic so may not be exhaustive and may not yield the smallest number of edges, though it normally does. The file schedout has a new set of edges for each possibility.

```Consistent ordering: ABFGEDC
Deleted edges: C D bob|D E alice
bob_alice|A|B
bob_alice|B|C
bob_alice|B|E
bob|E|D
bob|A|D
bob|F|D
alice|D|C
alice|G|D

Consistent ordering: ABFGECD
Deleted edges: D C alice|D E alice
bob_alice|A|B
bob_alice|B|C
bob_alice|B|E
bob|C|D
bob|E|D
bob|A|D
bob|F|D
alice|G|D

Consistent ordering: ABFGCDE
Deleted edges: D C alice|E D bob
bob_alice|A|B
bob_alice|B|C
bob_alice|B|E
bob|C|D
bob|A|D
bob|F|D
alice|D|E
alice|G|D

...
```
• k schedmatch bob
This constrains the edges to be removed to come only from the bob schedule.
```Non-verbose mode gives just one solution. Add -v to commandline for verbose.
Combined graph can be found in schedout
Consistent ordering: FGABDEC
Deleted edges: C D bob|E D bob
```
• k schedmatch -v alice
```Verbose output
Combined graphs can be found in schedout
Consistent ordering: FGABECD
Deleted edges: D C alice|D E alice
```
These are the smallest number of edges that can be removed from the alice schedule alone to make the two graphs consistent. Even though a verbose output was requested, the program found only one minimal output.

# Advanced Use: three or more schedules

Suppose one has three or more schedules and one wants to find the smallest number of edges to remove in order to find a consistent schedule. For example, let the input be this:

```bob|A|B
bob|B|C
bob|C|D
bob|B|E
bob|E|D
bob|A|D
bob|F|D
alice|A|B
alice|B|C
alice|D|C
alice|B|E
alice|D|E
alice|G|D
carol|B|D
carol|D|C
carol|G|C
carol|A|B
ted|A|B
ted|C|B
ted|D|B
ted|H|I
```

For the more-than-two schedule case, the only two possible calls are:

• k schedmatch
```Non-verbose mode gives just one solution. Add -v to commandline for verbose.
Combined graph can be found in schedout
Consistent ordering: AFGHIBEDC
Deleted edges: C B ted|C D bob|D B ted|D E alice
```
Here is schedout:
```Consistent ordering: AFGHIBEDC
Deleted edges: C B ted|C D bob|D B ted|D E alice
bob_alice_carol_ted|A|B
bob_alice|B|C
bob_alice|B|E
bob|E|D
bob|A|D
bob|F|D
alice_carol|D|C
alice|G|D
carol|B|D
carol|G|C
ted|H|I
```
• k schedmatch -v
```Verbose output
Combined graphs can be found in schedout
Consistent ordering: AFGHIBEDC
Deleted edges: C B ted|C D bob|D B ted|D E alice
Consistent ordering: HFGAIBDCE
Deleted edges: C B ted|C D bob|D B ted|E D bob
Consistent ordering: GFHAIDBCE
Deleted edges: B D carol|C B ted|C D bob|E D bob
```

# Support

This work has been partly supported by the U.S. National Science Foundation under grants IIS-0414763, DBI-0445666, N2010 IOB-0519985, N2010 DBI-0519984, DBI-0421604, and MCB-0209754. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation. This support is greatly appreciated.