TEACH PSTRIPS                             Allan Ramsay, October 17 1985

-- Contents -----------------------------------------------------------
-- Background
-- Representations
-- Variables and operator schemas
-- Implementation in PROLOG
-- Exercises
-- References

-- Background ---------------------------------------------------------

This file indicates how you might write a simple means-end (goal
reducing, backward chaining) problem solver in PROLOG. A lot of
difficult problems with such programs will be left open - see TEACH
STRIPS and TEACH SOLVER for a more sophisticated (but almost
incomprehensible) version written in POP-11.

The sort of problem solver described here is useful when you have a
repertoire of primitive actions which are described in terms of the
situations in which they may be applied and the changes they will bring
about in those situations when they are applied. The classic use for
such operations is the domain of robot planning in the blocks world,
where sequences of movements of a (simulated) robot hand and arm are
constructed for moving (simulated) toy blocks around in a (simulated)
world. These blocks world programs appear to have little to do with the
real world, but in fact any robot which is (really) going to move around
in the (real) world, picking (real) things up and moving them around
will have to construct its plans about what to do in some simulated
world first, so blocks world planners do have some serious applications.
Similar planning systems, with different basic operators but the same
planning algorithms, have been used in areas as diverse as speech act
understanding (where what a person says is interpreted in terms of what
effects on the world it might be designed to bring about [Allen &
Perrault]) and genetic engineering (where the effects of different
sequences of laboratory operations are investigated in order to come up
with some sequence which produces the desired substance [Stefik]). The
examples in this file will all be to do with blocks world manipulations.

The basic ideas behind most current means-end problem solvers were first
presented in [Fikes & Nilsson]. Since then a number of people have
proposed various elaborations, e.g. [Sacerdoti], [Waldinger], [Warren].
[Nilsson pp.275 - 361] gives a good (but very technical) overview of
nearly everything that has been done on planning - a program that
embodied everything he discusses there would be an achievement indeed.

-- Representations ----------------------------------------------------

We start by thinking about the structure of a fairly simple operation,
namely the act of picking up a single block. We have a rather primitive
robot, which can only pick blocks up if there is nothing on top of them.
Our description of the operations performed by this robot is fairly
coarse grained: it does not involve any specifications of absolute
positions, so the only things we know about the world are relations
between blocks, the table and the robot's hand. Everything in our world
has a name - blocks are called by single letter names , e.g. "a", "b",
"c", ..., the table is called "table" and the robot's hand is called
"hand". We can thus describe the action of picking up the block "a" in
English as follows -

    The block a may be picked up if it is clear and the robot's hand is
    empty. When it has been picked up, then the robot's hand will be
    holding it. The block a will no longer be on the surface it was on
    before the action was performed.

We see that the description may be split into three parts - things that
must be true for the action to be performed, things which were not true
before it was performed and are afterwards, and things that were true
before but are not after. Note that there are a large number of
statements about the world which are not mentioned in this description.
For some, such as "My first name is Allan", or "The first means-end
analysis program was written by Aristotle" the actions of the robot seem
to be irrelevant. For others, such as "A may now be lowered onto C", or
"The table is now clear", performing the action may well change their
truth value. What to do about these other statements is a difficult
problem, known as the "frame problem" - not to be confused with Minsky's
notion of "frames" as a means of chunking knowledge. One solution, which
is reasonably straightforward to implement in PROLOG, is to say that we
will regard a statement as true if either (i) we have it explicitly
stated, or (ii) we can infer it from things which are explicitly stated.
This will do as a first approximation, though there are a number of
problems with it.

We can try to rewrite the English description above in a more formal
way, grouping the three types of statement together and writing each
statement as a PROLOG fact.

    operation:      pickup(a)
    preconditions:  clear(a), empty(hand)
    newly_true:     holding(hand, a)
    newly_false:    empty(hand)

We have captured here the notion of an action call PICKUP(A), which may
be performed if A is CLEAR and HAND is EMPTY. When the action has been
performed HAND will be holding A, and HAND is EMPTY will have become
false. There are quite a number of things wrong with this
characterisation of the action of picking something up. We will note
here simply that if we rely on PROLOG facts, i.e. expressions without
any variables in, we will (i) have to have a different operation for
picking up A, picking up B, picking up C, ..., and (ii) that we will
have great difficulty in stating the other obvious, immediate
consequence of picking up A, namely that A is no longer on the surface
it used to be on (which may or not itself now be clear, depending on
what else it had on it when A was picked up - this is the frame problem
again).

For the moment we will see what we can do with the above description.
Suppose that HAND was EMPTY, and that (for some reason) we wanted it to
be holding A. It is clear that if we were to perform the operation
specified above the situation would be as we wanted it to be. This is
the basic idea behind means-end problem solving - if there is something
about the world that you would like changed, find some performable
action that will change it appropriately. The description of actions in
terms of facts that they will make true or false is exceptionally well
suited to this process - our model of the world consists of a set of
statements of facts which are currently true or false, so to change it
so that it is the way we want it, all (!!!) we have to do is find some
set of actions which adds the facts we want and removes the ones we
don't want.

But wait a minute. What if HAND was not EMPTY, if it was holding B ?
Then we can't apply our action, since its preconditions were not
satisfied. We now have a subsidiary problem - making HAND is EMPTY true.
This is exactly the same sort of problem as the first one, namely there
is a statement about the world which is not true but which we would like
to make true. Suppose we had another action, say

    operation:      putdown(b)
    preconditions:  holding(hand, b)
    newly_true:     on(b, table), empty(hand)
    newly_false:    holding(hand, b)

Then the same approach - find an action which makes the desired
statement true - will enable us to achieve this goal. Once we have
achieved it we can go back to our original goal. The action which
achieves that is now performable, so the job is done. We have a plan
{putdown(b), pickup(a)} which transforms a world in which [holding(hand,
b), clear(a), ...] are all true into one in which [on(b, table),
holding(a), clear(a), ...] are true. Note that both sets of statements
contain various things which were unaffected by the actions in the plan.
These include a statement about what A was on before we picked it up -
we still haven't worked out how to get rid of this one. We have
nonetheless got the outline of the basic planning algorithm, which is as
follows

(i)     Choose a goal statement which is not true in the current world
(ii)    Find an action which would make it true if performed
(iii)   Add the currently unsatisfied preconditions of this action to the
        list of goals
(iv)    Goto (i)

We can refine this basic algorithm in many, many ways, but it will
always be the basis of what we do.

-- Variables And Operator Schemas -------------------------------------

The first thing to do is to allow operators to contain variables. This
will enable us to replace our collection of operators for picking things
up (i.e. pickup(a), pickup(b), pickup(c), ...) by a single operator
schema pickup(X) (a,b,c are names of individuals, X, Y, Z are
variables). It will also enable us to give a reasonable specification of
facts such as "a" no longer being on the surface it started on. Here's
our new version of pickup

    operation:      pickup(X)
    preconditions:  empty(hand), clear(X), on(X,Y)
    newly_true:     holding(hand, X)
    newly_false:    empty(hand), on(X,Y)

This is a bit better. Now whenever we want the HAND to be holding some
individual, say "a", we simply instantiate the variable X to the
individual "a" throughout this schema to get

    operation:      pickup(a)
    preconditions:  empty(hand), clear(a), on(a,Y)
    newly_true:     holding(hand, a)
    newly_false:    empty(hand), on(a,Y)

We cannot actually perform this action as it stands, since we don't know
what Y is. But before we try to perform an action we have to check its
preconditions. If we have a complete description of the world, we are
(or should be) guaranteed to have a statement saying that "a" is on
something (since if we are not holding it and it is not on anything then
it must be suspended in mid-air), and examination of this statement will
tell us WHAT it is that "a" is on. Suppose it's on "b". Then we can
instantiate Y to "b" throughout the partially instantiated schema to get

    operation:      pickup(a)
    preconditions:  empty(hand), clear(a), on(a,b)
    newly_true:     holding(hand, a)
    newly_false:    empty(hand), on(a,b)

This is a perfectly performable action, which we can include in our
plan. It has been constructed from our single schema for picking things
up, and it includes a statement to the effect that after the action has
been performed "a" will not be on "b". It does not, however, enable us
to tell whether "b" will be clear - we will have to infer that (if we
want to know it) by looking to see if anything else is on "b". This
leads us to the idea that clearness is not a primitive notion - that it
might have been better if the original had been

    operation:      pickup(X)
    preconditions:  empty(hand), not(on(Z,X)), on(X,Y)
    newly_true:     holding(hand, X)
    newly_false:    empty(hand), on(X,Y)

or even, by similar reasoning,

    operation:      pickup(X)
    preconditions:  not(holding(hand,W)), not(on(Z,X)), on(X,Y)
    newly_true:     holding(hand, X)
    newly_false:    on(X,Y)

Finally we observe that, if we are prepared to allow negated statements
to appear in our operators we no longer really need to distinguish
between newly_true and newly_false, so we could rewrite our last version
of pickup as

    operation:      pickup(X)
    preconditions:  not(holding(hand,W)), not(on(Z,X)), on(X,Y)
    changes:        holding(hand, X), not(on(X,Y))

-- Implementation in PROLOG -------------------------------------------

How are we going to make use of these notions in a PROLOG problem
solver? We have three major questions to answer - how are we going to
represent the operations, how are we going to represent the world, and
how are we going to construct a plan?

The first of these is easy. We simply include in our program statements
like

operator(pickup(X),
        [not(holding(hand, _1)), not(on(_2, X)), on(X,Y)],
        [holding(hand, X), not(on(X,Y))]).

Try consulting a PROLOG file containing this, and then typing

?- listing(operator).

and

?- operator(Op, Preconditions, Changes).

Try typing

?- operator(Op, Preconditions, [holding(hand,a) | Other_effects]).

We see that PROLOG has not only retrieved the operator, it has also
instantiated quite a lot of its variables. If we tell PROLOG about the
operators that are available, we have a very simple mechanism for
retrieving and instantiating all the operators that might achieve a
given goal. This mechanism is not in fact good enough to be used in
practice, since it assumes that the first statement in the list of the
operator's effects is the only one that will ever be required as a goal,
but it is not difficult to upgrade so that it can be used to achieve any
of its effects.

How are we to represent the current state of the world? At first sight
it might seem convenient to use the ordinary PROLOG database, putting
statements like ON(A,B), HOLDING(HAND,C), ... in it. It turns out that
this can be a bit awkward, because we may want to do more complicated
things with our description of this state than just inspecting and
updating it. Most people prefer to use a list of known facts, with
changes being made by adding and deleting elements of the list. We would
thus expect to have some predicates like

    new_fact(not(Fact), Oldstate, Newstate) :-
        !, remove(Fact, Oldstate, Newstate).
    new_fact(Fact, Oldstate, [Fact | Oldstate]).

and

    true(not(Fact), State) :- !, not(member(Fact, State)).
    true(Fact, State) :- member(Fact, State).

for maintaining and inspecting the current state of the world. Note that
both these predicates have an initial clause for dealing with negative
facts, and that this clause is immediately cut - a negated statement is
TRUE if and only if the unnegated statement is not explicitly known. The
cut is essential, since without it we might attempt to backtrack through
a call of TRUE for a negated fact. It may be necessary in general to
elaborate these predicates - what, for instance, should we do about
facts that are inferrable from known facts, about doubly negated facts,
... ? We will not come back to this, but you should bear it in mind when
you write your implementation.

You may decide that it would in fact be easier to use the PROLOG
database for recording the current state of the world. It will certainly
make it easier to check things which can be inferred from facts which
are explicitly stated. If you do want to use the PROLOG database, you
will probably need to use "reversible" database operations, since you
may need to undo changes that you made while developing some plan which
subsequently turned out not to work. The following may be useful:

    soft_assert(Fact) :- asserta(Fact).
    soft_assert(Fact) :- retract(Fact).

    soft_retract(Fact) :- retract(Fact).
    soft_retract(Fact) :- asserta(Fact).

We now have some idea how to represent operators and world states. How
are we going to construct a plan to turn a given world state into one in
which some list of goals is satisfied ? Recalling our outline plan
algorithm above, we might start with something like

(i)     plan(World, [], []).
(ii)    plan(World, [Goal | Goals], Plan) :-
            true(Goal, World),
            plan(World, Goals, Plan).
(iii)   plan(World, [Goal | Goals], [Action | Plan]) :-
            achieves(Action, Goal),
            perform(Action, World, Newworld),
            plan(Newworld, Goals, Plan).

These three clauses say (i) if there's nothing left to do, you needn't
do anything, (ii) if your first goal is already true in the current
world, all you need to do is find some plan which achieves the others,
(iii) otherwise you'll have to find some action which achieves your
first goal, construct the world that results from performing that plan,
and then find some other plan to achieve the rest of your goals in this
new world.

The main problem with this simple-minded planner is that it doesn't
check that the action it has chosen to achieve the given goal is
actually performable in the current world. It doesn't check its
preconditions. If it did, we might find that we had to add them in to
our list of goals to be achieved before we could perform the chosen
action, so the program would look rather more like

(i)     plan(World, [], []).
(ii)    plan(World, [Goal | Goals], Plan) :-
            true(Goal, World),
            plan(World, Goals, Plan).
(iii)   plan(World, [Goal | Goals], [Op | Plan]) :-
            achieves(operator(Op, Preconditions, Changes), Goal),
            sub_plan(World, Preconditions, Sub_plan, Intermed_world),
            make_changes(Changes, Intermed_world, Newworld),
            plan(Newworld, Goals, Rest_plan),
            append(Sub_plan, Rest_plan, Plan).

The only changes are in clause (iii). Firstly (this is fairly minor) we
are now taking the chosen operation apart into its components as soon as
we get it (in the call on ACHIEVES). The big change is that we now
demand a whole new plan to achieve the preconditions of this chosen
operation. The call on SUB_PLAN is extremely similar to the original
call on PLAN, save that SUB_PLAN has to return the resulting
(intermediate) state of the world as well as the plan that produced it.
Finally we perform the action (with MAKE_CHANGES), which certainly OUGHT
to be possible, since we just constructed a plan to achieve all its
preconditions; construct a new plan to achieve the remainder of our
goals; and join our current chosen action, the intermediate plan that
made it performable, and the plan that achieved all the outstanding
goals, into a single plan which is the result of the planner.

The difference between the clauses PLAN and SUB_PLAN is irritating and
inessential - it would be better to add the extra parameter to PLAN as
well, which would lead us to

(i)     plan(World, [], [], World).
(ii)    plan(World, [Goal | Goals], Plan, New_world) :-
            true(Goal, World),
            plan(World, Goals, Plan, New_world).
(iii)   plan(World, [Goal | Goals], Plan, New_world) :-
            achieves(operator(Op, Preconditions, Changes), Goal),
            plan(World, Preconditions, Sub_plan, Intermed_world),
            make_changes(Changes, Intermed_world, Next_world),
            plan(Next_world, Goals, Rest_plan, New_world),
            append(Sub_plan, [Op | Rest_plan], Plan).

This is becoming more like something which would work. There are still
several important predicates to be defined - notably TRUE, ACHIEVES and
MAKE_CHANGES - but there is nothing terribly difficult about any of
them, so we will not develop them in detail here. The main immediate
problem with the program as it stands is that it is possible for later
actions to undo the effects of earlier ones, either in the top-level
call of PLAN or in the recursive calls which are trying to achieve
preconditions of required actions. Because of this we have to keep
checking ALL our goals, and replanning to achieve things which we have
already done if they are subsequently undone. Probably the easiest way
to do this is to use a predicate (say NEXT_UNTRUE_GOAL) which will
select the next goal to be achieved from our original list, rather than
assuming that we can get away with simply walking down the list goal by
goal. So the next version of our program is

(i)     plan(World, Goals, Plan, New_world) :-
            next_untrue_goal(Goals, Goal, World),
            achieves(operator(Op, Preconditions, Changes), Goal),
            plan(World, Preconditions, Sub_plan, Intermed_world),
            make_changes(Changes, Intermed_world, Next_world),
            plan(Next_world, Goals, Rest_plan, New_world),
            append(Sub_plan, [Op | Rest_plan], Plan).
(ii)    plan(World, Goals, [], World).

This is very close to working. There is still a rather subtle problem
with it. We are using NEXT_UNTRUE_GOAL in clause (i) to enumerate the
goals which are untrue in the current state, so that if we get stuck
trying to find a plan for the first untrue goal we can try again, from
the same position, with another one. With this use of NEXT_UNTRUE_GOAL,
we see that when we have enumerated all of them without successfully
completing our plan, we should conclude that there is no way of getting
from the current state to the desired state. So we should FAIL. But we
are also using NEXT_UNTRUE_GOAL to tell us when we have got a state in
which there are no untrue goals - if NEXT_UNTRUE_GOAL fails, we conclude
that everything has been achieved and we fall through to clause (ii),
which says that an empty plan will be sufficient if all our goals are
currently satisfied (which we are inferring from the fact that
NEXT_UNTRUE_GOAL failed). Unfortunately NEXT_UNTRUE_GOAL won't carry the
burden of performing both tasks at once. It can EITHER be used to tell
us that all our goals are currently satisfied, OR for enumerating all
the ones that aren't. Our final version of the program uses it in both
clauses, once to see if anything needs to be done, and once to enumerate
possible subgoals.

(i)     plan(World, Goals, [], World) :-
            not(next_untrue_goal(Goals, _, World)), !.
(ii)    plan(World, Goals, Plan, New_world) :-
            next_untrue_goal(Goals, Goal, World),
            achieves(operator(Op, Preconditions, Changes), Goal),
            plan(World, Preconditions, Sub_plan, Intermed_world),
            make_changes(Changes, Intermed_world, Next_world),
            plan(Next_world, Goals, Rest_plan, New_world),
            append(Sub_plan, [Op | Rest_plan], Plan).

The cut in clause (i) is for efficiency rather than to affect the
overall behaviour of the program. If we have checked that
NEXT_UNTRUE_GOAL is false, there is no possible point in going into
clause (i), which will immediately try to prove NEXT_UNTRUE_GOAL and
will inevitably fail.

This final version of the program works if you have simple operators,
simple initial worlds, and simple sets of goals. It needs to be
substantially extended in a variety of ways if it is to be any use in
more complex situations. The exercises below outline some of the things
that you might want to do to improve it. It would not be surprising if
some of these required radical reimplementation of the basic algorithm,
but some at least should be possible simply by extending it in
reasonably straightforward ways.

-- Exercises ----------------------------------------------------------

(1) Fill in the missing bits in the final version of the planner above
(i.e. the predicates NEXT_UNTRUE_GOAL, ACHIEVES, MAKE_CHANGES). Test
your program on a simple set of blocks world operations.

(2) As the program stands, operations are specified in terms of
preconditions and effects. It may try to find a plan to achieve any
unsatisfied precondition of any operation, even if it is clear (to us)
that there is no possibility of finding such a plan (e.g. if one of the
preconditions of the operation was that BLOCK(TABLE) should hold, and
there are no operations which can possibly turn something into a block
unless it is already known to be one). The program may also choose to
use some operation to achieve a goal when the goal of that operation is
more like a side-effect than a major reason for using it. See if
changing the basic representation of operations so that they have the
form

    operation(OP,
              UNACHIEVABLE_PRECONDITIONS,
              ACHIEVEABLE_PRECONDITIONS,
              PRIMARY_EFFECTS,
              SIDE_EFFECTS).

makes your program perform better (or worse?).

(3) Make your program check that its plans do not contain moves whose
effects are undone by later moves. You will have to distinguish between
undoing the effects of a move AFTER they have been used (i.e. if move M1
was performed to achieve some precondition of move M2, then it doesn't
matter if M2 undoes the effects of M1. But if both M1 and M2 are
intended to achieve preconditions of M3, then M2 should not undo the
effects of M1).

(4) Make your program check that it is not getting into an infinite
loop. It is quite possible for a means-end planner like this to do
something like decide it would like to perform action A1; in order to
perform action A1, it has to achieve precondition P1; it sees that A2
will achieve this precondition, so it decides to do this; A2 has a
precondition P2, which can be achieved by A1, so the program decides to
perform A1; in order to do this it needs to achieve P1, which leads it
to choose to do A2, which ... Dealing with loops of this sort requires
you to keep track not only of what goals are outstanding, but also why
you wanted to achieve them in the first place.

(5) Make your program plan hierarchically. A (fairly) simple way to do
this is to have levels of preconditions. You start by constructing a
plan which achieves all the initial goals using actions whose top level
preconditions are satisfied. Once you have this plan, you take each step
in it and treat this as a new problem and try to find a plan that
achieves this step with all the next level preconditions satisfied, and
so on until the lowest level preconditions of all steps in the plan are
satisfied.

If you do all the above, (i) feel pleased with yourself, (ii) think of
ways in which your program is unsatisfactory (if you can't to this bit,
have a look in [Nilsson]), (iii) solve them and go to (i).

-- References ---------------------------------------------------------

Allen JF & Perrault CR "Analyzing intention in utterances", AI 15 (1980)

Fikes RE & Nilsson NJ "STRIPS: a new approach to the application of theorem
proving to problem solving" AI 2 (1971)

Nilsson NJ "Principles of artifical intelligence" Springer-Verlag (1982)

Sacerdoti ED "Planning in a hierarchy of abstraction spaces" AI 5 (1974)

Stefik M "MOLGEN I & II" AI 16 (1980)

Waldinger RJ "Achieving several goals simultaneously"
Machine Intelligence 8 (1977)

Warren DHD "WARPLAN: a system for generating plans" Dept of Computational
Logic Memo 76, Univ. of Edinburgh (1974)
