Inference in Datalog

Datalog: The subset of the predicate calculus consisting of sentences with Since all variables are universally quantified, with a scope of the entire sentence, there is no need to actually write down the quantifiers, as long as there is some convention about which symbols are variables and which are constants. Here, I will use upper case for variables and lower case for constants.

Thus, all sentences have either the form.

1. PRED(TERM ... TERM); or
2. [PRED(TERM ... TERM) ^ PRED(TERM ... TERM) ^ ... ^ PRED(TERM ... TERM)] =>
PRED(TERM ... TERM)
where all the TERM's are either variables or constants.

Sentences of form (1) are called "facts"; sentences of form (2) are called "rules". The formula on the right hand of the implication of a rule is called the "head" of the rule; the formulas on the left hand are the "tail" of the rule.

Example

The following is a Datalog knowledge base describing a family tree and rules for relations:
Rules
1. parent(X,Y) ^ male(X) => father(X,Y).
2. father(X,Y) => parent(X,Y).
3. father(X,Y) => male(X).
4. parent(X,Y) ^ female(X) => mother(X,Y).
5. mother(X,Y) => parent(X,Y).
6. mother(X,Y) => female(X).
7. child(X,Y,Z) => parent(Y,X).
8. child(X,Y,Z) => parent(Z,X).
9. son(X,Y,Z) => child(X,Y,Z).
10. son(X,Y,Z) => male(X).
11. male(X) ^ child(X,Y,Z) => son(X,Y,Z).
12. daughter(X,Y,Z) => child(X,Y,Z).
13. daughter(X,Y,Z) => female(X).
14. female(X) ^ child(X,Y,Z) => daughter(X,Y,Z).

Facts
15. male(philip).
16. female(elizabeth).
17. son(charles,philip,elizabeth).
18. daughter(anne,philip,elizabeth).
19. son(andrew,philip,elizabeth).
20. son(edward,philip,elizabeth).
21. son(william,charles,diana).
22. son(harry,charles,diana).
The following sentences in predicate calculus cannot be expressed in Datalog:

Inference

The method of inference we will discuss is a mixture of two ideas:

I. Modus Ponens. From the sentences

[P^Q^R^S] => W.
P.
Q.
R.
S.
you can infer W.

II. (Universal instantiation) From the sentence "forall(X) P(X)" you may infer P(T) for any term T.

Now, suppose we have the sentences

1. forall(X) man(X) => mortal(X). (All men are mortal)
2. man(socrates).
Then we can proceed as follows: The question is, how did we know to replace X in (1) by "socrates" rather than, say "eiffel_tower" or anything else? The answer is that we chose to replace X by socrates in order to make possible the use of modus ponens. This is done by unifying the atomic formula "man(X)" in (1) with the formula "man(socrates)" in (2).

Substitution and Unification in Datalog

A substitution in Datalog is a mapping of variable symbols to variable or constant symbols. That is, it is a set of pairs of the form V -> W where:

Example: The set { X -> a, Y -> a, Z -> b, P -> Q, M -> Q} is a substitution. (Again, variables are upper-case; constants are lower case.)

Non-examples. The set {X -> a, X -> b, Y -> z} is not valid because X is on the left hand of two different pairs. The set {X -> a, b -> a, Y -> b} is not valid because the constant b appears on the left hand of a pair. The set { X -> a, Y -> X, Z -> b } is not valid because X appears on both the left and right side of a pair. appears in two sets). The collection {{X,a,b}, {Y}} is not a valid

Applying substitution U to sentence S = PRED(T1 ... Tk)

If V -> W occurs in U, then replace every occurrence of V in S by W.

Example Let U be the substitution { X -> a, Y -> a, Z -> b, P -> Q, M -> Q}.
Let S be the sentence f(X,Y,c) ^ g(Y,Z,W) ^ h(P,c,X) => h(X,c,Q)

The result of applying U to S is then
f(a,a,c) ^ g(a,b,W) ^ h(Q,c,a) => h(a,c,Q)

Unifying two atomic formulas

Let F1 and F2 be two atomic formulas with no variables in common. A substitution B unifies formulas F1 and F2 if the result of apply(B,F1) = apply(B,F2).

Let F1 be the formula "P(T1 ... Tk)" and let F2 be the formula "Q(W1 ... Wm)"

unify(F1,F2) : returns either a unifying substitution, or NIL if there is no
               unifying substitution

{ if (P != Q or k != m) then return(NIL);
  C = collection containing each of the symbols in different sets;
  for i = 1 to k do { 
    let E(Ti) and E(Vi) be the sets in C containing Ti and Vi;
    if E(Ti) and E(Vi) each contain a constant (necessarily different)
       then return(NIL);
    if (Ti != Vi) then merge Ti with Vi in C;
    } 
  B := emptyset;
  for (S in C) {
    if (S has more than one element) {
      if (S contains constant symbol X) /* S contains at most one constant */
        for (each variable Q in S) add Q -> X to B;
      else { X = some variable in S;
             for (Q != X in S) add Q -> X to B;
    } }
}
This can be made very efficient using the UNION-FIND or MERGE-FIND algorithm.

Example: To unify the formula F1="p(a,X,b,X,Y,W)" with F2="p(U,U,T,V,c,T)":

Since both formulas have predicate "p" and 6 arguments,

Initialize C to be {{a}, {b}, {c}, {T}, {U}, {V}, {W}, {X}, {Y}}

First loop:
Step 1: T1="a", V1="U". Merge {a} with {X}.
C  = {{a,U}, {b}, {c}, {T}, {V}, {W}, {X}, {Y}}
Step 2: T2="X", V2="U". Merge {X} with {a,U} 
C  = {{a,U,X}, {b}, {c}, {T}, {V}, {W}, {Y}}
Step 3: T3="b", V3="T". Merge {b} with {T} 
C  = {{a,U,X}, {b,T}, {c}, {V}, {W}, {Y}}
Step 4: T4="X", V4="V". Merge {a,U,X} with {V} 
C  = {{a,U,X,V}, {b,T}, {c}, {W}, {Y}}
Step 5: T5="Y", V5="c". Merge {Y} with {c} 
C  = {{a,U,X,V}, {b,T}, {c,Y}, {W}} 
Step 6: T6="W", V6="T". Merge {W} with {b,T}
C = {{a,U,X,V}, {b,T,W}, {c,Y} }

Second loop:
B = { U -> a, X -> a, V -> a, T -> b, W -> b, Y -> c}.
Note that apply(C,F1) = apply(C,F2) = "p(a,a,b,a,c,b)"

Forward Chaining

The basic idea in forward chaining is this: You look for unifications that will match each of the formulas in the left hand side of a known rule to known facts. You can then infer the right hand side of the rule under the specified substitution. This is a fact that you can add to the knowledge base. You keep doing this until there is nothing new to be derived, then stop.

Example: Suppose you start with the following knowledge base (a subset of the knowledge base at the start of these notes: hence the gappy numbering).

Rules
1. parent(X,Y) ^ male(X) => father(X,Y).
2. father(X,Y) => parent(X,Y).
3. father(X,Y) => male(X).
4. parent(X,Y) ^ female(X) => mother(X,Y).
5. mother(X,Y) => parent(X,Y).
6. mother(X,Y) => female(X).
7. child(X,Y,Z) => parent(Y,X).
8. child(X,Y,Z) => parent(Z,X).
9. son(X,Y,Z) => child(X,Y,Z).
10. son(X,Y,Z) => male(X).
11. male(X) ^ child(X,Y,Z) => son(X,Y,Z).
Facts
15. male(philip).
16. female(elizabeth).
17. son(charles,philip,elizabeth).
We can proceed as follows:
Combining rule (9) with fact (17) with substitution {X -> charles, Y -> philip, Z -> elizabeth}, infer (23) child(charles,philip,elizabeth)

Combining rule (10) with fact (17) with substitution {X -> charles, Y -> philip, Z -> elizabeth}, infer (24) male(charles).

Combining rule (7) with fact (23) with substitution {X -> charles, Y -> philip, Z -> elizabeth}, infer (25) parent(philip,charles)

Combining rule (8) with fact (23) with substitution {X -> charles, Y -> philip, Z -> elizabeth}, infer (26) parent(elizabeth,charles)

Combining rule (1) with facts (15) and (25) with substitution {X -> philip, Y -> charles} infer (27) father(philip,charles).

Combining rule (4) with facts (16) and (26) with with substitution {X -> elizabeth, Y -> charles} infer (28) mother(elizabeth,charles).

No further facts can be derived.

IMPORTANT NOTE: Each time you are trying to unify a fact with a set of rules, you must rename the variables so that no two of the sentences use the same variable names.

Here is some pseudo-code to do forward chaining exhaustively. I am not going to go through this in detail.

fc(RR : set of rules; FF: set of facts) return set of facts. {
repeat { NEW_FACT := false;
         for each rule R = "T1 ^ T2 ^ ... ^ Tk => H" do {
           for each fact F1 in consequences(R,FF) do {
             if (F1 is not in FF)  then
               { NEW_FACT := true;
                 FF := add F1 to FF;
                 }
            }
         }
        }
  until not(NEW_FACT);
}

/* "consequences" recurs down the formulas in the tail of rule R */
consequences(R : rule or fact; FF : set of facts) return set of facts;
{ if (R is a fact) then return(R); /* Base case of the recursion */
  T1 := the first literal in the tail of R;
  NEW_FACTS := empty set;
  for each fact (F in FF) {
     rename the variables in F so as not to conflict with the variables in R;
     U = unify(T1,F);
     if (U != NIL) then {
       R1 := drop T1 from R;
       R1 := apply(U,R1);
       NEW_FACTS := NEW_FACTS union consequences(R1,FF)
      }
    }
  return(NEW_FACTS);
A trace of the subroutine "consequences" for a reasonably complicated example is given here

Backward Chaining

In backward chaining you start with a goal , a statement that you want to prove, in addition to the knowledge base, and you look for derivations that allow you to prove the goal. You look for a rule R or a fact F. whose head unifies with the goal, and then recursively try to prove all the formulas in the tail of R.

For instance, suppose that you start with the following knowledge base,

Rules
1. parent(X,Y) ^ male(X) => father(X,Y).
2. parent(X,Y) ^ female(X) => mother(X,Y).
3. child(X,Y,Z) => parent(Y,X).
4. child(X,Y,Z) => parent(Z,X).
5. male(X) ^ child(X,Y,Z) => son(X,Y,Z).
6. female(X) ^ child(X,Y,Z) => daughter(X,Y,Z).

Facts
7. male(philip).
8. female(elizabeth).
9. child(charles,philip,elizabeth).
10. child(anne,philip,elizabeth).
11. male(charles).
12. female(anne).
and the goal is "mother(elizabeth,charles)". Backward chaining proceeds as follows:
Goal G0: mother(elizabeth,charles).
  Unify G0 with the head of rule (1) with substitution {X1->elizabeth, Y1->charles}}
  Subgoals: G1 = "parent(elizabeth,charles)", G2="female(elizabeth)".
    Goal G1: parent(elizabeth,charles)
      Unify G1 with the head of rule (3) with substitution X2 -> charles, Y2 -> elizabeth
      Subgoal G3 = "child(charles,elizabeth,Z2")
         No fact or rule unifies with G3.
         Fail G3
      Try again with G1.
      Unify G1 with the head of rule (4) with substitution X2 -> charles, Y2 -> elizabeth
      Subgoal G3 = "child(charles,Y3,elizabeth)"
         G3 unifies with fact (9) with substitution Y3=elizabeth.
         G3 succeeds.
      G1 succeeds.
    Goal G2: female(elizabeth)
       G2 unifies with fact 8.
       G2 succeeds
G0 succeeds.
If you have a variable in the goal in backward chaining, then the goal is to find _some_ value of the variable that satisfies the formula. For example, if the goal is "child(charles,J,K)", that means "find J and K such that charles is the child of J and K. Backward chaining in that case proceeds as follows.
G0: child(charles,J,K)
  G0 unifies with fact 9, under substitution {J->philip, K->elizabeth}.
G0 succeeds with J=philip and K=elizabeth.
Another example: if the goal is "daughter(J,K,L)" (find J,K,L such that J is the daughter of K and L) then backward chaining proceeds as follows:
Goal G0: daughter(J,K,L)
   G0 unifies with rule 6 under the substitution {X1->J, Y1->K, Z1->L}
   Subgoals: G1="female(J)", G2="child(J,K,L)"

   Goal G1: female(J).
     Unify G1 with fact 8, with binding J->elizabeth.
     G1 succeeds with binding J->elizabeth

   Goal G2: child(elizabeth,K,L)
     G2 does not unify with any fact or rule.
     G2 fails

   Return to G1 (last point where there was a choice).
     Try unifying G1 with fact 12, with binding J=anne.
     G1 succeeds with binding J=anne.

   Goal G2: child(anne,K,L)
      G2 unifies with fact 10 with binding K->philip, L->elizabeth.
   
   G0 succeeds with J=anne, K=elizabeth, L=philip.
Unlike forward chaining, backward chaining can potentially go into an infinite loop. There are two ways to avoid this: In Datalog, repeated goal detection is sufficient to remove all infinite loops.

Reformulating Some Predicate Calculus Sentences into Datalog

In some cases a predicate calculus sentence which does not immediately appear to satisfy the conditions for Datalog can be reformulated into a logically equivalent form that does satisfy the conditions. The most important case involves existentially quantified variables on the left-hand side of an implication. Suppose you have the rule, ``X is a grandparent of Z if there exists a Y such that X is a parent of Y and Y is a parent of Z.'' The natural formulation of this in predicate calculus is,
(A) forall(X,Z) [exists(Y) parent(X,Y) ^ parent(Y,Z)] => grandparent(X,Z).
In that form, this does not satisfy the conditions on Datalog because of the existential quantifier. However, it is logically equivalent to
(B) forall(X,Y,Z) parent(X,Y) ^ parent(Y,Z) => grandparent(X,Z)
which is valid Datalog.

Let me give two arguments for this equivalence: an intuitive argument and a formal argument.

Intuitive argument:
Sentence (A) means "For any X and Z that we choose, if we can find a Y such that parent(X,Y) and parent(Y,Z) then grandparent(X,Z)." Now suppose that Elizabeth is a parent of Charles and Charles is a parent of William, thus satisfying the condition of (B). Then the condition of (A) is satisfied for X=Elizabeth and Z=William we can choose a Y such that parent(elizabeth,Y) and parent(Y,william); namely, Charles. Conversely, suppose that Jesse and Solomon satisfy the condition for (B). That is, there exists some particular Y1 such that Jesse is a parent of Y1 and Y1 is a parent of Solomon. Therefore, the condition in (A) is satisfied with X=Jesse, Y=Y1, and Z=Solomon. Thus, (A) and (B) are equivalent.

Formal argument:
(A) forall(X,Z) [exists(Y) parent(X,Y) ^ parent(Y,Z)] => grandparent(X,Z)
is equivalent to
(C) forall(X,Z) ~[exists(Y) parent(X,Y) ^ parent(Y,Z)] V grandparent(X,Z)
which is equivalent to
(D) forall(X,Z) [forall(Y) ~[parent(X,Y) ^ parent(Y,Z)]] V grandparent(X,Z)
which is equivalent to
(E) forall(X,Z) [forall(Y) ~[parent(X,Y) ^ parent(Y,Z)] V grandparent(X,Z)]
which is equivalent to
(B) forall(X,Z) [forall(Y) [parent(X,Y) ^ parent(Y,Z)] => grandparent(X,Z)

(A) is equivalent to (C) and (E) is equivalent to (B) because P=>Q is always equivalent to ~P V Q.
(C) is equivalent to (D) because ~exists(X) P(X) is always equivalent to forall(X) ~P(X). (If there does not exists anything satisfying P, then nothing satisfies P.)
(D) is equivalent to (C) because [forall(X) P(X)] V Q is equivalent to [forall(X) P(X) V Q] if variable X does not occur in Q.