Recursive Descent Parser for Natural Language

Input: A sentence Q and a grammar G.

Output: A parse tree for Q in G.

A recursive descent parser for Q in G is a depth-first search through the following state space:

State: A partial parse tree. Some of the words at the start of of the sentence may be leaves of the tree.

Operator: Let S be a state. The successors to S are formed as follows: Let N be the leftmost node of T that is not a word.

If N is a non-terminal, then choose a rule of the context-free rules in G with N on the left side and attach the right-hand side of the rule to N in the tree T as children.
If N is a terminal (part-of-speech) then let W be the leftmost unattached word in Q. If W=N, then attach W to N. If N is a possible part-of-speech for W, then attach W to N. Else fail.

If all of Q is attached to T but there are still leaves of T unattached to Q, then fail. If all of T is attached to Q but there are still parts of Q unattached to T, then fail.

Goal state: A tree T whose leaves are exactly the words in S.

Start state: A state S whose tree is just the root labelled S and whose sentence is Q.

Example

Grammar:
S -> NP VP
NP -> Name
NP -> Noun
VP -> VG
VP -> VG NP
VG -> Verb
VG -> Modal Verb

Lexicon:
ate -- Verb
can -- Modal, Verb, Noun
fish -- Noun, Verb
John -- Name.

Sentence: John ate fish.

State space