AI: Problem Set 6

Assigned: Apr. 13
Due: Apr. 27

Problem 1

Consider the same microworld as in problem set 5, with items, cases, and locations. Consider the following problem:
Start state: 
Item I1 is in case C1.
C1 is at location L1.
C2 is at location L2.


Goal:
C1 is at location L3.
I1 is inside C2.
NOTE: THE FOLLOWING IS CORRECTED FROM EARLIER VERSION. SEE CLASS EMAIL.

There are three categories of plans that accomplish this: one in which C2 ends up in L1, one in which C2 ends up in L2, and one in which C2 ends up in L3.

For any two of these categories of plans, show how the POP planner (Russell and Norvig, sec. 11.3) constructs the plan. Your answer should list each operation carried out by the planner on the successful path to the plan, namely:

Your answer should show the final state of the plan. (Note that in both cases, the final plan will be a partially ordered plan.) You need not show all the intermediate stages of plan construction, though you will probably find it useful to chart these out for yourself.

Problem 2

Consider this same microworld, with the modification that an attempt to carry out an action has only probability P of succeeding. If an action fails, it is a no-op.

Suppose you are playing the following game. Initially, item I is in case C which is at location L1. At each step, you have the option of attempting an action or of quitting the game. Each attempted action costs $2. When you quit, if I is still in C, you are fined $1; if I is out of C at location L1, you are paid $2; if I is out of C at location L2, you are paid $4. Assume that utility is equal to dollar amount, which is valid for small amounts.

A. Show how this can be construed as a Markov Decision Problem.

NOTE ADDED COMMENT HERE:
You should prune actions that are obviously useless or counterproductive and states that are not worth going to. For instance, there is obviously no point in ever moving C empty or in putting I into C, so you need not consider those actions. By the same token, there is no reason you would ever attain a state where C is at a different location than L, so you need not include those states.

B. If P=1/4, what is the optimal strategy? What is the expected value of the start state?

C. If P=3/4, what is the optimal strategy? What is the expected value of the start state?