** Definition: ** A * Horn clause * is a clause with
at most one positive literal.

Any Horn clause therefore belongs to one of four categories:

- A
*rule*: 1 positive literal, at least 1 negative literal. A rule has the form "~P1 V ~P2 V ... V ~Pk V Q". This is logically equivalent to "[P1^P2^ ... ^Pk] => Q"; thus, an if-then implication with any number of conditions but one conclusion. Examples: "~man(X) V mortal(X)" (All men are mortal); "~parent(X,Y) V ~ancestor(Y,Z) V ancestor(X,Z)" (If X is a parent of Y and Y is an ancestor of Z then X is an ancestor of Z.) - A
*fact*or*unit*: 1 positive literal, 0 negative literals. Examples: "man(socrates)", "parent(elizabeth,charles)", "ancestor(X,X)" (Everyone is an ancestor of themselves (in the trivial sense).) - A
*negated goal*: 0 positive literals, at least 1 negative literal. In virtually all implementations of Horn clause logic, the negated goal is the negation of the statement to be proved; the knowledge base consists entirely of facts and goals. The statement to be proven, therefore, called the goal, is therefore a single unit or the conjuction of units; an*existentially*quantified variable in the goal turns into a*free*variable in the negated goal. E.g. If the goal to be proven is "exists (X) male(X) ^ ancestor(elizabeth,X)" (show that there exists a male descendent of Elizabeth) the negated goal will be "~male(X) V ~ancestor(elizabeth,X)". - The null clause: 0 positive and 0 negative literals. Appears only as the end of a resolution proof.

Now, if resolution is restricted to Horn clauses, some interesting properties appear. Some of these are evident; others I will just state and you can take on faith.

I. If you resolve Horn clauses A and B to get clause C, then the positive literal of A will resolve against a negative literal in B, so the only positive literal left in C is the one from B (if any). Thus, the resolvent of two Horn clauses is a Horn clause.

II. If you resolve a negated goal G against a fact or rule A to get clause C, the positive literal in A resolves against a negative literal in G. Thus C has no positive literal, and thus is either a negated goal or the null clause.

III. Therefore: Suppose you are trying to prove Phi from Gamma, where ~Phi is a negated goal, and Gamma is a knowledge base of facts and rules. Suppose you use the set of support strategy, in which no resolution ever involves resolving two clauses from Gamma together. Then, inductively, every resolution combines a negated goal with a fact or rule from Gamma and generates a new negated goal. Moreover, if you take a resolution proof, and trace your way back from the null clause at the end to ~Phi at the beginning, since every resolution involves combining one negated goal with one clause from Gamma, it is clear that the sequence of negated goals involved can be linearly ordered. That is, the final proof, ignoring dead ends has the form

~Phi resolves with C1 from Gamma, generating negated goal P2 P2 resolves with C2 from Gamma, generating negated goal P3 ... Pk resolves with C2 from Gamma, generating the null clause.

IV. Therefore, the process of generating the null clause can be viewed as a state space search where:

- A state is a negated goal.
- A operator on negated goal P is to resolve it with a clause C from Gamma.
- The start state is ~Phi
- The goal state is the null clause.

V. Moreover, it turns out that it doesn't really matter which literal in P you choose to resolve. All the literals in P will have to be resolved away eventually, and the order doesn't really matter. (This takes a little work to prove or even to state precisely, but if you work through a few examples, it becomes reasonably evident.)

Putting all the above together, we formulate the following non-deterministic
algorithm for resolution in Horn theories. This is known as *
backward chaining.*

bc(in P0 : negated goal; GAMMA : set of facts and rules;) { if P0 = null then succeed; pick a literal L in P0; choose a clause C in GAMMA whose head resolves with L; P := resolve(P0,GAMMA); bc(P,GAMMA) }

If bc(~Phi,Gamma) succeeds, then Phi is a consequence of Gamma; if it fails, then Phi is not a consequence of Gamma.

Moreover: Suppose that Phi contains existentially quantified variables.
As remarked above, when ~Phi is Skolemized, these become free variables.
If you keep track of the successive bindings through the successful
path of resolution, then the final bindings of these variables gives you
a *value* for these variables; all proofs in Horn theories are
constructive (assuming that function symbols in Gamma are constructive.)
Thus the attempt to prove a statement like "exists(X,Y) p(X,Y)^q(X,Y)"
can be interpreted as "* Find * X and Y such that p(X,Y) and q(X,Y)."

The succcessive negated goals Pi can be viewed as negations of
*subgoals* of Phi. Thus, the operation of resolving
~P against C to get ~Q can be interpreted, "One way to
prove P would be to prove Q and then use C to infer P". For
instance, suppose P is "mortal(socrates)," C is "man(X) =>
mortal(X)" and Q is "man(socrates)." Then the step of resolving
~P against C to get ~Q can be viewed as, "One way to prove
mortal(socrates) would to prove man(socrates) and then combine that with
C."

Propositional Horn theories can be decided in polynomial time. First-order Horn theories are only semi-decidable, but in practice, resolution over Horn theories runs much more efficiently than resolution over general first-order theories, because of the much restricted search space used in the above algorithm.

Backward chaining is complete for Horn clauses. If Phi is a consequence of Gamma, then there is a backward-chaining proof of Phi from Gamma.

- The clauses in Gamma are ordered by the programmer, and the state space search above is carried out in depth-first search order.
- The negative literals in each clause are ordered by the programmer.
When negative goal P is resolved with clause C, the negative literals in
C are placed in order
*at the start*of P. - The "pick" operation picks the first literal in P. Thus, P is implemented as a LIFO stack.

A Prolog "program" is a knowledge base Gamma. The program is invoked by posing a query Phi. The value returned is the bindings of the variables in Phi, if the query succeeds, or failure. The interpreter returns one answer at a time; the user has the option to request it to continue and to return further answers.

(Actual Prolog differs from pure Prolog in three major respects:

- There are additional functionalities besides theorem proving, such as functions to assert statements, functions to do arithmetic, functions to do I/O.
- The "cut" operator allows the user to prune branches of the search tree.
- The unification routine is not quite correct, in that it does not check for circular bindings e.g. X -> Y, Y -> f(X).)

Notation: The clause "~p V ~q V r" is written in Prolog in the form "r :- p,q."

** Example: **
Let Gamma be the following knowledge base:

1. ancestor(X,X). 2. ancestor(X,Z) :- parent(X,Y), ancestor(Y,Z). 3. parent(george,sam). 4. parent(george,andy). 5. parent(andy,mary). 6. male(george). 7. male(sam). 8. male(andy). 9. female(mary).

Let Phi be the query "exists(Q) ancestor(george,Q) ^ female(Q)." (i.e. find a female descendant of george.) Then the Skolemization of Phi is "~ancestor(george,Q) V ~female(Q)." A Prolog search proceeds as follows: (The indentation indicates the subgoal structure. Note that the variables in clauses in Gamma are renamed each time.)

G0: ~ancestor(george,Q) V ~female(Q). Resolve with 1: Q=X1=george. G1: ~female(george) Fail. Return to G0. G0: ~ancestor(george,Q) V ~female(Q). Resolve with 2: X2=george. Z2=Q. G2: ~parent(george,Y2) V ~ancestor(Y2,Q) V ~female(Q). Resolve with 3: Y2=sam. G3: ~ancestor(sam,Q) V ~female(Q). Resolve with 1: Q=X3=sam. G4: ~female(sam). Fail. Return to G3. G3: ~ancestor(sam,Q) V ~female(Q). Resolve with 2: X4=sam, Z4=Q G5: ~parent(sam,Y2) V ~ancestor(Y2,Q) V ~female(Q). Fail. Return to G3. G3: ~ancestor(sam,Q) V ~female(Q). Fail. Return to G2. G2: ~parent(george,Y2) V ~ancestor(Y2,Q) V ~female(Q). Resolve with 4: Y2=andy. G6: ~ancestor(andy,Q) V ~female(Q). Resolve with 1: X5=Q=andy G7: ~female(andy). Fail. Return to G6. G6: ~parent(andy,Y6) V ~ancestor(Y6,Q) V ~female(Q). Resolve with 5: Y6=mary. G8: ~ancestor(mary,Q) V ~female(mary). Resolve with 1: X7=Q=mary. G9: ~female(mary) Resolve with 9. Null.Return the binding Q=mary.

Forward chaining is generally thought of as taking place in a dynamic knowledge base, where facts are gradually added to the knowledge base Gamma. In that case, forward chaining can be implemented in the following routines.

procedure add_fact(in F; in out GAMMA) /* adds fact F to knowledge base GAMMA and forward chains */ if F is not in GAMMA then { GAMMA := GAMMA union {F}; for each rule R in GAMMA do { let ~L be the first negative literal in R; if L unifies with F then then { resolve R with F to get C; if C is a fact then add_fact(C,GAMMA) else /* C is a rule */ add_rule(C,GAMMA) } } } end add_fact. procedure add_rule(in R; in out GAMMA) /* adds rule R to knowledge base GAMMA and forward chains */ if R is not in GAMMA then { GAMMA := GAMMA union {R}; let ~L be the first negative literal in R; for each fact F in GAMMA do if L unifies with F then { resolve R with F to get C; if C is a fact then add_fact(C,GAMMA) else /* C is a rule */ add_rule(C,GAMMA) } } end add_fact. procedure answer_query(in Q, GAMMA) return boolean /* Success or failure */ { QQ := {Q} /* A queue of queries while QQ is non-empty do { Q1 := pop(QQ); L1 := the first literal in Q1; for each fact F in GAMMA do if F unifies with L then { resolve F with Q1 to get Q2; if Q2=null then return(true) else add Q2 to QQ; } } return(false)

The forward chaining algorithm may not terminate if GAMMA contains recursive rules.

Forward chaining is complete for Horn clauses; if Phi is a consequence of Gamma, then there is a forward chaining proof of Phi from Gamma. To be sure of finding it if Gamma contains recursive rules, you have to modify the above routines to use an exhaustive search technique, such as a breadth-first search.

animal(X) :- sponge(X). animal(X) :- arthopod(X). animal(X) :- vertebrate(X). ... vertebrate(X) :- fish(X). vertebrate(X) :- mammal(X) ... mammal(X) :- carnivore(X) ... carnivore(X) :- dog(X). carnivore(X) :- cat(X). ...(I have skipped family and genus in the hierarchy.)

Now, suppose we have such a knowledge base of rules, we add the fact "dog(fido)" and we query whether "animal(fido)". In forward chaining, we will successively add "carnivore(fido)", "mammal(fido)", "vertebrate(fido)", and "animal(fido)". The query will then succeed immediately. The total work is proportional to the height of the hierarchy. By contast, if you use backward chaining, the query "~animal(fido)" will unify with the first rule above, and generate the subquery "~sponge(fido)", which will initiate a search for Fido through all the subdivisions of sponges, and so on. Ultimately, it searches the entire taxonomy of animals looking for Fido.

In some cases, it is desirable to * combine * forward and backward
chaining. For example, suppose we augment the above animal with features
of these various categories:

breathes(X) :- animal(X). ... backbone(X) :- vertebrate(X). has(X,brain) :- vertebrate(X). ... furry(X) :- mammal(X). warm_blooded(X) :- mammal(X). ...If all these rules are implemented as forward chaining, then as soon as we state that Fido is a dog, we have to add all his known properties to Gamma; that he breathes, is warm-blooded, has a liver and kidney, and so on. The solution is to mark these property rules as backward chaining and mark the hierarchy rules as forward chaining. You then implement the knowledge base with both the forward chaining algorithm, restricted to rules marked as forward chaining, and backward chaining rules, restricted to rules marked as backward chaining. However, it is hard to guarantee that such a mixed inference system will be complete.