Assigned: Feb. 8

Due: Mar. 8

The following problem is known as EXACT SET COVER: You are given a universal set U of elements and a collection W of subsets of U. The object is to find a subcollection Y of W such that every element of U is in exactly one set in Y.

For instance, suppose that U is the set {a,b,c,d,e,f,g,h,i} and that W is the following collection:

W1 = {a,d,f,h} W2 = {b,e,g} W3 = {e,g} W4= {d,e,i} W5 = {b,d,f,h} W6 = {a,c,i} W7 = {c,d,e}Then one solution would be Y={W3,W5,W6}

This assignment has two parts. In Part I, you are to implement the Davis-Putnam algorithm and use it to solve EXACT-SET-COVER. Part II is experimentation.

In part 1, you will write three programs.

- 1. An implementation of the Davis-Putnam procedure, which takes as input a set of clauses and outputs either a satisfying valuation, or a statement that the clauses cannot be satisfied.
- 2. A front end, which takes as input an instance of the EXACT SET COVER problem and outputs a set of clauses that can be input to (1).
- 3. A back end, which takes as input the output of (1) and translates it into a solution to the original problem.

- 1. Assert W1 V W2 V ... V Wk. At least one of these must be in Y in order to cover E.
- 2. For each pair Wi, Wj both containing E, assert ~Wi V ~Wj. It is not possible that Wi and Wj are both in Y, since they both contain E.

W1 V W4 V W5 V W7. ~W1 V ~W4. ~W1 V ~W5. ~W1 V ~W7. ~W4 V ~W5. ~W4 V ~W7. ~W5 V ~W7.and similarly for all the other elements in U.

The output from the Davis-Putnam procedure has the following form: First, a list of pairs of atom (a natural number) and truth value (either T or F). Second, a line containing the single value 0. Third, the back matter from the input file, reproduced.

Example: Given the input

1 2 3 -2 3 -3 0 This is a simple example with 3 clauses and 3 atoms.Davis-Putnam will generate the output

1 T 2 F 3 F 0 This is a simple example with 3 clauses and 3 atoms.This corresponds to the clauses

P V Q V R. ~Q V R. ~R.

If the clauses have no solution, then Davis-Putnam outputs a single line containing a 0, followed by the back-matter in the input file.

Note: Your implementation of Davis-Putnam must work on _any_ set of clauses, not just those that are generated by the EXACT SET COVER program.

An empty line represents the null clause. Davis-Putnam should immediately reject any set of clauses containing the null clause.

You may assume that there are no more than 1000 atoms and no more than 10,000 clauses.

The format of the input contains the following elements:

- A list of the elements of U, in a single line. An element is a character string of up to 2 characters. Elements are separated by white space.
- A list of the sets in W. There is one set per line. Each line contains the line number followed by the elements separated by white space.

a b c d e f g h i 1 a d f h 2 b e g 3 e g 4 d e i 5 b d f h 6 a c i 7 c d eYou may assume that the line numbers are consecutive and that the input is correctly formatted. You do not have to do any error checking on the input. (The point of including the line number is to make it more readable for humans.) The output consists of

- 1. A set of clauses suitable for inputting to Davis-Putnam as described above. Note that these constraints are already in clausal form (CNF) and therefore you do not have to implement a program to translate arbitrary Boolean formulas to CNF.
- 2. A key for translating the numbers used for propositional atoms in the clauses into sets of elements. This is to allow the front end to communicate to the back end. The format of this is up to you. My suggestion would be a sequence of lines of the form "number, set of elements."

3 e g 5 b d f h 6 a c iIf the input indicates that the clauses have no solution, the back end should output the message "NO SOLUTION".

Second, try to map out the phase-transition behavior as a function of K where N=20 and S=4. That is: when K < 5 there are no solutions, so these are not worth bothering with. When K = 5 and not much larger than 5, few random instances will have solutions, and presumably your program will run quickly. If K is sufficiently large (I don't know how large is "sufficiently") then almost all instances will have solutions, and again presumably the program will generally run quickly. At some value of K in between, about half the instances will have solution; at that value and values close to it, the problem will probably run much more slowly.

So begin by trying small values of K and large values of K and then squeeze toward the middle to hone in on the transition value of K as close as you can. At each value of K that you test, you should run at least 10 instances. If this becomes impossibly time consuming for some value of K, then you can report that as a transitional value, and stop search at that point. For each value of K that you test, you should report: (a) the percentage of instances that have a solution; (b) the average and median time required for Davis-Putnam to execute.