The planning problem in Artificial Intelligence is about
the decision making performed by intelligent creatures like robots,
humans, or computer programs when trying to achieve some goal.
It involves choosing a sequence of actions that will (with a high likelihood)
transform the state of the world, step by step, so that it will
satisfy the goal. The world is typically viewed to consist of atomic
facts (state variables), and actions make some facts true and some facts false.
In the following we discuss a number of ways of formalizing planning,
and show how the planning problem can be solved automatically.
We will only focus on the simplest AI planning problem, characterized
by the restriction to one agent in a deterministic
environment that can be fully observed.
More complex forms of planning can be formalized e.g. in the framework
of Marvov decision processes, with uncertainty about
the effects of actions and therefore without the possibility to predict
the results of a plan with certainty.
The most basic planning problem is one instance of the general
s-t reachability problem for succinctly represented transition graphs,
which has other important applications in Computer Aided Verification
(reachability analysis, model-checking),
Intelligent Control,
discrete event-systems diagnosis, and so on.
All of the methods described below are equally applicable to all of these
other problems as well, and many of these methods were initially developed
and applied in the context of these other problems.
Further, more realistic planning and other problems can use the
basic problem as a subprocedure, or more general problems can be reduced
to it. For example, temporal planning can often be reduced to the
base case directly (Cushing et al. 2007) or the base case can be used
as a subprocedure (Rankooh & Ghassem-Sani, 2013)
The methods discussed in this tutorial have strengths in different types of problems.
Symbolic methods based on BDDs excel in problems with a relatively small number
of state variables (up to one or two hundred), with a complex but regular state space.
Explicit state-space search is generally limited to small state spaces, but the AI planning community has been successfully applying explicit state-space search also to very large state-spaces, when their structure is simple enough to allow useful heuristic distance estimates, and when there are plenty of plans to choose from.
Methods based on logic and constraints (SAT, constraint programming) are strong
on problems with relatively high numbers of state variables and not too long plans,
especially when constraints about the structure of the solution plans and the reachable
state-space are available.
Models of state transition systems
Most works on planning use a state-transition system model
(we often just write transition system), in
which the world/system consists of a (finite or infinite) number of
states, and actions/transitions change the current state to
a next state. (Exceptions to this are planning with Hierarchical Task Networks (HTN), and planning problems closer to Scheduling, in which the world states are not represented explicitly.)
There are several possible representations for state-transition systems.
A state-transition system can be represented as an arc-labeled directed
multi-graph.
Each node of the graph is a state, and the arcs
represent actions.
An "action" is usually something that can be taken in several states,
with the same or different effects. All the arcs corresponding to
one action are labeled with the name of the action.
There may be more than one action that moves us from state A to state B,
and this means that there are more than one arc from A to B (this is why we said that the graph is a multi-graph.)
Succinct representations of transition systems
In planning (similarly to other areas, including Discrete Event Systems and Computer-Aided Verification), transition systems are usually not represented
explicitly as a graph, simply because those graphs would be far too big,
with billions and trillions of nodes, or even far more.
Instead, a compact (succinct, factored) representation
is used.
These representations are typically based on state variables (corresponding to (atomic) facts).
A state, instead of being an atomic object, is a valuation
of a (finite) number of state variables.
If there are state variables A : {1,2,3}, B : {0,1} and C : {0,1}, then
there are 3*2*2* states 100, 200, 300, 101, 201, 301, 110, 210, 310, 111,
211, and 311.
If the domain of each state variable is finite, and there are finitely many state variables, then the number of states is finite as well.
When states are represented as valuations of state variables,
an action can be represented as a procedure
or a program for changing the values of the state variables
by making value assignments to them.
Below we explain a couple of possibilities how these procedures can be defined.
Each of these action representations describes a binary relation
on the set of all states: what are the possible pairs (s,s')
of current state s and its successor state s'.
STRIPS
The simplest language used for formalizing actions is the STRIPS
language.
In STRIPS, the state variables have the domain {0,1} (equivalently
{FALSE, TRUE}),
and an action consists of three sets of state variables,
the PRECONDITION, the ADD={a_{1},a_{2},...,a_{n}} list, and the DELETE={d_{1},d_{2},...,d_{m}} list (it is assumed that ADD and DELETE don't intersect.)
An action is possible in a state if all the variables in PRECONDITION
have the value 1.
Taking the action corresponds to executing the following program, consisting
of assignment statements only:
a_{1} := 1
a_{2} := 1
...
a_{n} := 1
d_{1} := 0
d_{2} := 0
...
d_{m} := 0.
All the assignments are instantaneous and take place simultaneously.
In STRIPS planning, a goal is usually expressed as a set of
state variables. A state is a goal state if all of the goals
have the value 1 in it.
Petri nets
Petri nets are a model of state transition
systems in which several transitions may take place simultaneously in the sense
that they are independent.
Some of the restricted Petri net models (e.g. 1-safe state-transition nets)
are equivalent to the propositional STRIPS model.
For Petri nets, the state variables are the places.
In state-transition nets, each place can hold 0 or more tokens.
Essentially, a place is a state variable with the set of natural numbers
as its domain.
The class of 1-safe Petri nets adds the condition that each place can
only hold 0 or 1 tokens. This way each state variable has the domain {0,1},
and there will be only a finite number of states.
Transitions (actions) in Petri nets are described slightly differently.
Each transition has a set of predecessor places and a set of
successor places.
The transition is possible if all predecessor places N have a token
(this corresponds to the PRECONDITION in STRIPS).
When the transition is fired, all the successor places will
receive an additional token.
STRIPS actions can be relatively easily translated into Petri nets with
the 1-safety property (see Hickmott et al., IJCAI'07),
but there are some small complications because in STRIPS the assignment
a_{1} := 1 has the same result no matter what the old value of a_{1} was,
whereas in Petri nets if a_{1} is a successor place, the number of tokens
is increased by one, and may become something else than 1
if the place already had one or more tokens.
Petri nets in general cannot be translated into STRIPS, because in general there may be any number of tokens in a place, and consequently an infinite number of states. Petri nets with the 1-safety property however can be translated.
Both STRIPS and Petri nets can be viewed in the general framework of simple programs.
While STRIPS is programs with assignment := 0 and := 1 only, Petri nets can be more complex.
General place-transition nets correspond to the following.
Effects can be of the form: x := x + c
Effects can be of the form: y := y - c
The preconditions can be of the form y >= c.
If there is effect y := y - c, then there must be precondition y > c' for some c' >= c.
1-safe Petri nets are ones in which each state variable has either value 0 or 1. They correspond to the following.
All effects are of the form x := 0 or x := 1.
Preconditions are of the form x = 1.
If there is effect x := 0, then there must be precondition x = 1.
Clearly, there is no essential difference between Petri nets and STRIPS, and both
can be seen as special cases of a slightly more general language, in which
state variables can be Boolean or integer-valued, preconditions are Boolean
combinations of Booleans and integer inequalities, and effects are assignments
of Boolean or integer-valued expressions to the state variables.
A simple generalization is to have if-then-else in the effect, as in PDDL, discussed below.
C. A. Petri, Kommunikation mit Automaten, Institut für instrumentelle Mathematik, Universität Bonn, Schriften des IIM, No. 2, Germany, 1962.
Murata, T., Petri nets: properties, analysis and applications, Proceedings of IEEE, 77(3), pp. 541-580, 1989.
PDDL/ADL
PDDL/ADL is a generalization of STRIPS.
Here we describe a propositional variant of PDDL/ADL.
(PDDL actually has a Lisp-like syntax, but we don't use it here.)
The differences of PDDL in comparison to STRIPS are
The PRECONDITION may be an arbitrary Boolean combination of atomic facts
about the state variables.
Atomic facts say something about one state variable,
for example a=0 or b=1.
A precondition could for example be (a=1)∨¬(b=1∧c=0).
Instead of the unconditional assignments represented by ADD and DELETE,
the effects may be conditional. This means that the effects are
of the form,
IF condition THEN a := v
where the condition is a Boolean combination of facts.
STRIPS corresponds to PDDL with trivial conditions that are always true (the condition is the constant TRUE).
Goals may be Boolean combinations of atomic facts (formulas).
PDDL can be reduced to STRIPS, but
there is NO one to one correspondence between the PDDL actions and
the STRIPS actions, because several STRIPS actions may be needed for
one PDDL action, and
the set of STRIPS actions may have a size that is exponential
in the size of the set of PDDL actions. This is because
a PDDL action may need to be reduced to an exponential number of STRIPS
actions. For example, a PDDL action with effects
IF a_{1}=0 THEN a_{1} := 1
IF a_{1}=1 THEN a_{1} := 0
IF a_{2}=0 THEN a_{2} := 1
IF a_{2}=1 THEN a_{2} := 0
...
IF a_{n}=0 THEN a_{n} := 1
IF a_{n}=1 THEN a_{n} := 0
corresponds to 2^{n} STRIPS actions because the effects of
the actions are different in 2^{n} different classes of states,
and for each class of state a different STRIPS action is needed.
Representation of actions in the propositional logic
All possible actions can be represented as formulas in the propositional logic.
This includes all STRIPS and PDDL actions, and also further actions that cannot
be represented in (deterministic variants) of PDDL.
The logic representation of actions is useful for various powerful logic-based
search methods which we will be describing later in detail.
Here we restrict to Boolean state variables, and will denote the atomic facts a=0 and a=1 respectively by a and ¬a.
In the representation of actions and change in the propositional logic
we need two sets of propositional variables: one expressing the values
of the state variables in the predecessor state, and one for the successor
state. We distinguish between the two by the ' sign so that the old
value of x is denoted by the propositional variable x and the new value by
the propositional variable x'.
All the above action representations (STRIPS, 1-safe Petri nets, PDDL) can
be compactly represented in the propositional logic.
Given an action, its logic representation consists of two parts:
the precondition formula P (exactly like in PDDL)
description of what happens to each state variable,
which is a conjunction of formulas, one conjunct for each state variable.
Each state variable a_{i} is represented as an equivalence
F(a_{1},a_{2},...,a_{n}) ↔ a'_{i}
where F is a Boolean function on the state variables (a formula).
For STRIPS the functions F are trivial: for a state variable x they are
either x, the constant TRUE, or the constant FALSE. This means that
in STRIPS either the value of a state variable remains the same, or the
variable unconditionally becomes TRUE or FALSE.
The reduction from 1-safe Petri nets to the propositional logic goes
through the STRIPS representation.
For PDDL the functions F may be arbitrarily complex. We don't give
a systematic mapping from PDDL to the propositional logic here, although
the mapping is straightforward.
An action consisting of the precondition d and
the conditional effect IF a∧b THEN c is represented by the propositional
formula
d∧(((a∧b)∨c)↔c').
The formula says that the new value of c will be TRUE if and only if
a and b are both true or c was true already.
Explicit state-space search
Explicit state-space search,
meaning the generation of states reachable from the initial state
one by one,
is the earliest and most straightforward method for solving some of the
most important problems about transition systems, including model-checking
(verification), planning, and others, widely used since at least the 1980s.
All the current main techniques for it were fully developed by 1990s,
including symmetry and partial order methods, informed search algorithms,
and optimal search algorithms (A* already in the 1960s).
It is relatively easy to implement efficiently, but, when
the number of states is high, its applicability is limited by
the necessity to do the search only one state at a time.
However, when the number of states is less than some tens of millions,
this approach is efficient and can give guarantees of finding solutions
in a limited amount of time.
Search algorithms
There are several types of search algorithms that are routinely
applied in planning. These include the following.
well-known uninformed search algorithms like depth-first search,
breadth-first search
systematic heuristic search algorithms with optimality guarantees, for example
A* and its variants like IDA*, WA*
systematic heuristic search algorithms without optimality guarantees,
for example the standard "best-first" search algorithm which is like
A* but ignores the cost-so-far component of the valuation function
incomplete unsystematic search algorithms, most notably stochastic
search algorithms
Incomplete search algorithms can be useful for specific problems when good
heuristics are available, or as a part of a portfolio of search algorithms
(see portfolios below.)
References
P. E. Hart, N. J. Nilsson and B. Raphael,
A formal basis for the heuristic determination of minimum-cost paths,
IEEE Transactions on System Sciences and Cybernetics,
SSC-4(2), pp. 100-107, 1968.
J. Pearl, Heuristics: Intelligent Search Strategies for Computer Problem Solving, Addison-Wesley, 1984.
S. Russell and P. Norvig, Artificial Intelligence: A Modern Approach.
Symmetry reduction
Symmetry reduction methods try to decrease the effective
search space by recognizing symmetries in the state-space
graph.
In planning, symmetries are typically caused by the interchangeability
of objects. If there is a plan that involves some interchangable objects A and B, there is a symmetric plan with the roles of A and B interchanged.
The standard works on symmetry reduction for state-space search are
from early 1990s, and they are directly applicable to all of the
explicit state-space search algorithms.
References
Starke, P. H., Reachability analysis of Petri nets using symmetries,
Journal of Mathematical Modelling and Simulation in Systems Analysis,
8(4/5), pp. 293-303, 1991.
Clarke, Edmund M., Reinhard Enders, Thomas Filkorn, and Somesh Jha,
Exploiting symmetry in temporal logic model checking.
Formal methods in system design 9(1-2), 1996.
Partial-order reduction
Partial-order reduction methods try to decrease the number of
search steps by recognizing different forms of independence of
actions/transitions. The basic idea is that if actions/transitions A and B
are independent in the sense that they can be taken in either order
A;B or B;A, one only needs to consider one of these orderings.
Many of the existing partial-order reduction methods address deadlock
detection, but there is a simple reduction from the problem of
reaching a goal state to the problem of reaching a deadlock state,
and therefore these methods are easily applicable to the planning problem, too.
Antti Valmari, Stubborn Sets for Reduced State Space Generation, Advances in Petri Nets 1990. 10th International Conference on Applications and Theory of Petri Nets, Bonn, Germany, Lecture Notes in Computer Science 483, pp. 491-515, Springer-Verlag, 1991.
Heuristics
There are several ways of systematically (automatically) creating heuristics.
The main approaches are different forms of relaxation, in which either
the original problem instance is simplified, and then the simplified
problem instance is solved, or
the problem instance is unmodified, but the definition of a solution
is simplified to be efficiently solvable. (In many cases the same
thing can be achieved in either of these ways.)
Well-known examples of this are the following.
Computing the max-heuristic of Bonet and Geffner for STRIPS
proceeds by ignoring all effects x := 0 and retaining all effects x := 1, and
then finding a plan for this simplified problem instance.
(This is easy to rephrase more generally for multi-valued and even real-valued
variables.)
This will give a lower bound
for the plan length for the original instance, and can be used as a heuristic
in a heuristic search algorithm such as A* to find shortest plans.
Another, general method for deriving heuristics is
pattern databases (Culberson & Schaeffer, 1998, with application
to planning by Edelkamp in 2001).
The idea is to relax the original problem instance in alternative ways,
by eliminating all state variables except for a small number n
of them (with n typically 10 or less, so that the problem
becomes almost trivial), and then solving the simplified problem instances.
The length of any of the optimal solutions is a lower bound for
the length of the shortest plan of the original problem instance.
generalized abstractions of Dräger et al. (2006), which
can be viewed as a generalization of pattern databases that allows
abstraction by arbitrary aggregation of states, not only by ignoring
some of the state variables (this was called combine-and-abstract by
Dräger et al., and planning researchers have renamed this to
merge-and-shrink).
There are methods for deriving composite heuristics from several individual heuristics.
The simplest one, used in connection with pattern databases,
is to take the maximum of different heuristics. Since all
of the constituent heuristics are lower bounds for the true shortest plan
length, their maximum is a lower bound as well.
There are variations of the "taking the maximum" scheme that try
to be more accurate: when two heuristics respectively give lower bounds
n and m, and their calculation (e.g. through the max-heuristics or pattern databases) involved completely disjoint sets of actions,
then also the sum of the heuristics is still a lower bound
for the length of the shortest plan of the original plan.
The use of this strategy is more complicated, because it may be difficult
to find a partition of the original planning problem in a way that the disjointness
property can be guaranteed in a useful way.
The above discussion is about admissible heuristics, which represent
true lower bounds for the actual shortest plan length.
Admissible heuristics are needed when trying to find optimal
plans with algorithms such as A*.
When optimality is not required, heuristics don't have to be admissible,
and then basically anything goes. Work on finding good non-admissible
heuristics is typically driven by their performance w.r.t. standard
benchmark problems, as they don't have to satisfy any special properties.
References
B. Bonet and H. Geffner, Planning as Heuristic Search, Artificial Intelligence Journal, 129(1-2), pp. 5-33, 2001.
K. Dräger, B. Finkbeiner and A. Podelski, Directed Model Checking with Distance-Preserving Abstractions, in Model Checking Software, LNCS 3925, pages 19-34, 2006.
J. C. Culberson and J. Schaeffer, Pattern Databases, Computational
Intelligence, 14(4), pp. 318-334, 1998.
S. Edelkamp, Planning with Pattern Databases, Proceedings of
the 6th European Conference on Planning (ECP-01).
Planning by SAT and constraint satisfaction
Many transition systems have too many states to consider them explicitly
one by one, and then factored representations that allow representing
large numbers of states and state sequences may be a more efficient
alternative.
Such representations and search methods are known as symbolic
or factored.
The earliest symbolic search methods were based on Binary Decision Diagrams
(BDDs, discussed below), pursued in state-space research since late 1980s.
The currently most scalable method (since the late 1990s)
is based on reduction to the propositional satisfiability problem SAT.
SAT methods are less prone to excessive memory consumption than BDDs.
Unlike explicit state-space search, their memory consumption can be (far) less than
linear in the number of visited (stored) states, and they can be implemented
with a memory consumption that is linear in the length of a plan or
transition sequence (as opposed to the in general exponential memory
consumption of explicit state-space search.) However, the currently
leading algorithms for the SAT problem gain much of their efficiency
by consuming more memory than theoretically optimal algorithms, and in practice
consume as much memory as explicit state-space search.
All the main improvements to explicit state-space search
(symmetry reduction,
partial-order reduction, heuristics) have counterparts in the SAT
setting, for example in the form of symmetry-breaking constraints
and parallel or partially-ordered plans.
We will not be discussing these in more detail here.
(For a representative planning system (including downloads), see this page.)
SAT algorithms
For references to the technology underlying the current generation of
efficient SAT algorithms see the following.
References
D. Mitchell, A SAT Solver Primer, EATCS Bulletin, 85, pp. 112-133, February 2005.
P. Beame, H. Kautz, and A. Sabharwal, Towards Understanding and Harnessing the Potential of Clause Learning, Journal of AI Research, 22, pp. 319-351, 2004.
Reduction to the SAT problem
Plans can be found by solving a satisfiability (SAT) problem, or solving
a constraint satisfaction problem. We describe the SAT case; CSP and SAT frameworks are very much similar, and the ideas expressed in terms of SAT are directly applicable in many other constraint-based formalisms, including Constraint Programming, CSP, Integer Programming.
The idea is to have formulas P_{T} (where T is a positive integer)
such that P_{T} is satisfiable if and only if there is a plan
with a horizon 0,1,...,T.
Depending on the definition of plans, at each time point 0,1,...,T-1
there may be one or more actions.
The formulas P_{T} can be constructed from the logic representation
of actions we described earlier.
Instead of using the propositional variables A∪A'
where A = { a_{1},..., a_{n} } and A' = { a'_{1},..., a'_{n} }
for the values of the state variables in the current and in the next time
point, we have the propositional variables
A@i = { a_{1}@i,..., a_{n}@i } for different time points
0≤i≤T.
Given a formula F for some action expressed in terms of
the variables in A∪A',
we obtain F@i by replacing all these variables by
a_{1}@i,..., a_{n}@i,a_{1}@(i+1),..., a_{n}@(i+1).
When we have formulas F_{1},...,F_{n} representing n different
actions, we can represent the transition relation between states
at time i as the formula
R@i = F_{1}@i∨F_{2}@i∨...∨F_{n}@i.
Now a sequence of T actions can be represented as
R@0∧R@1∧...∧R@(T-1).
Finally,finding a plan (of length T) from any state satisfying
a formula I to any state satisfying formula G can be expressed
as the formula
P_{T} = I@0∧R@0∧R@1∧...∧R@(T-1)∧G@T, where
I@0 and G@T are obtained from I and G by replacing each state variable a
in these formulas respectively by a@0 and a@T.
The time it takes to test the satisfiability of the formulas
P_{T} strongly depends on the number of propositional variables
in the formulas, which, for a given planning problem, is determined
by the parameter T. Most works that improve the efficiency
of the satisfiability tests do this by decreasing T by allowing more
than one action at each time point. See the references and my planning as satisfiability page for more details, including high-performance implementations.
References
H. Kautz and B. Selman, Pushing the envelope: planning, propositional logic, and stochastic search, AAAI'96, pp. 1194-1201, AAAI Press, 1996.
Jussi Rintanen.
Engineering efficient planners with SAT,
In ECAI 2012. Proceedings of the 20th European Conference on Artificial Intelligence, IOS Press, 2012.
Invariants (state constraints, mutexes)
Explicitly representing interdependencies of state variables is often critical
for efficient planning with SAT, and without them a SAT solver will
be spending a lot of effort inferring the implicit interdependencies.
Any dependency that holds in all reachable states is
an invariant of the problem instance (a given initial state and
actions).
The most primitive
non-trivial invariant has the form ¬a∨¬b, saying that a
and b cannot be true simultaneously.
The first method that produced invariants for pruning the search space
was the planning graph construction of GraphPlan (Blum and Furst, 1997).
The invariants were a by-product of an approximate reachability computation,
with invariants obtained as the fixpoint of the computation.
Later algorithms have first made the computation more explicit (Rintanen KR'98)
and then generalized (Rintanen, ECAI'08) to much more general forms of actions.
The general fixpoint construction, though
conceptually simple and natural, can be moderately expensive to implement,
and many implementations have used simpler methods that find specific
classes of invariants more efficiently.
References
A. L. Blum and M. L. Furst.
Fast planning through planning graph analysis.
Artificial Intelligence Journal 90(1-2),
pages 281-300, 1997.
J. Rintanen.
A planning algorithm not based on directional search.
in Principles of Knowledge Representation and Reasoning: Proceedings of the Sixth International Conference (KR '98),
A. G. Cohn, L. K. Schubert, and S. C. Shapiro, eds.,
pages 617-624, Trento, Italy,
June 1998. Morgan Kaufmann Publishers, San Francisco, California.
J. Rintanen.
Regression for classical and nondeterministic planning.
In Malik Ghallab, Constantine D. Spyropoulos, and Nikos Fakotakis,
editors, ECAI 2008. Proceedings of the 18th European Conference on
Artificial Intelligence. pages 568-571, IOS Press, 2008.
Planning by "symbolic" search methods such as Binary Decision Diagrams
Symbolic search methods here means the use of Binary Decision Diagrams (BDD)
and similar normal forms (such as DNNF) for propositional formulas for
representing sets of states and transition relations compactly.
"Symbolic" planning uses exactly the same formulas we used
above with SAT-based planning. The difference is that instead of
constructing the formulas
P_{T} = I@0∧R@0∧R@1∧...∧R@(T-1)∧G@T
for different values of T>0 and testing their satisfiability,
we incrementally and implicitly compute from the formulas
I@0
I@0∧R@0
I@0∧R@0∧R@1
I@0∧R@0∧R@1∧R@2
the sets of states that are respectively reachable by 0, 1, 2 or more actions,
and see if some of the goal states are reachable.
Instead of the timed variables a@i we only use the variables a and a' for
the current and the next state.
I represents the initial state(s).
Let R = F_{1}∨F_{2}∨...∨F_{n} where
the F_{i} formulas are as defined earlier.
Now I∧R represents the set S of state pairs (s,s') where state s' follows s
when one of the actions is taken and s is an initial state.
To compute the set { s' | (s,s') ∈S } we eliminate from I∧R all
propositional variables in A.
For this we use the existential abstraction operation defined
by ∃x.F = F[0/x]∨F[1/x].
For sets X = { x_{1},...,x_{k} } of propositional variables
we define ∃X.F = ∃x_{1}.∃x_{2}....∃x_{k}.F.
We also define a renaming operation: F[A'/A] means that all occurrences
of variables in A' are replaced by the corresponding variable in A
(for example a' is replaced by a.)
Now sets of states reachable by 0,1,2 and more steps are represented by
the following formulas.
I
(∃A.(I∧R))[A'/A]
(∃A.(((∃A.(I∧R))[A'/A])∧R))[A'/A]
(∃A.((∃A.(((∃A.(I∧R))[A'/A])∧R))[A'/A])∧R)[A'/A]
This can be more clearly written as a small program which computes the sets
of states reachable with 0,1,2 and more steps until a set that intersects
the goals G is reached.
states[0] := I;
t := 0;
while states[t]∧G is unsatisfiable do
t := t+1;
states[t] := (∃A.(states[t-1]∧R))[A'/A];
end while
The above algorithm terminates at iteration t iff t is the length
of the shortest action sequence from a state in I to a state in G.
At this point a plan can be output by the following algorithm.
goal := any valuation that satisfies states[t]∧G;
while goal does not equal I do
i := any action index i such that states[t]∧F_{i}↓(goal[A/A']) is satisfiable;
output action i;
goal := any valuation (on A) that satisfies states[t]∧F_{i}↓(goal[A/A']);
end while
Above, goal[A/A'] is a valuation like goal, except that it is defined on
A' instead of A, that is. goal[A/A'](a') = goal(a) for all a∈A.
And R↓v for a valuation v on A is a formula obtained from R by replacing
each occurrence of a∈A by TRUE if v(a)=true and by FALSE if v(a)=false.
The scalability of BDD-based traversal methods can be dramatically improved
by doing away with the optimality of the solution length. The saturation
method of Ciardo et al. (2001, 2007) replaces the breadth-first style search with a far more efficient traversal order.
Adnan Darwiche, Decomposable negation normal form, Journal of the ACM, 48(4), 2001.
This paper proposed another normal form that is often more compact than BDDs.
Coudert, Berthet and Madre, Verification of Synchronous Sequential Machines Based on Symbolic Execution, in Automatic Verification Methods for Finite State Systems, LNCS 407, Springer-Verlag, 1990. This paper and the next are the first ones to use BDDs for solving
state-space reachability problems.
Burch, Clarke, Long, MacMillan and Dill, Symbolic Model Checking for Sequential Circuit Verification, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 13(4), 1994.
Yu, Andy Jinqing, Gianfranco Ciardo, and Gerald Lüttgen. Decision-diagram-based techniques for bounded reachability checking of asynchronous systems. International Journal on Software Tools for Technology Transfer 11.2 (2009): 117-131. Dramatically improved BDD-based state-space search, with scalability sometimes matching SAT-based methods.
Meta-level search strategies
When solving a planning problem, there are some meta-level strategies
that can be applied to any type of search algorithm (or their combinations).
Algorithm portfolios
The idea of using a combination of several techniques has already been
mentioned in the context of heuristics, where aggregates of two or
more unrelated heuristics can be formed.
Algorithm portfolios do the same at the highest level of a program
that solves a planning problem.
The general idea is to utilize the complementarities between different
search methods.
If the strengths of algorithm 1 are sufficiently complementary to
the strengths of algorithm 2, the combination of these two algorithms
may be much stronger than either of the components.
A portfolio can be used in different ways.
Selection: Choose one algorithm from the portfolio, based on the properties
of the problem instance, and run it.
Sequential composition:
Run two or more algorithms according to a given schedule, for example
the first algorithm for n seconds or until some termination
criterion is satisfied, and if no solution has been found,
terminate the run and continue with the next algorithm.
Parallel composition:
Run several algorithms in parallel (in one or several CPUs/cores), with
the same or different rates, until one of them finds a solution.
The advantage of parallel composition is that it finds a solution quickly if one of the component algorithms finds a solution quickly.
Its disadvantage is that all the algorithms are run simultaneously, requiring more memory.
The first planner that extensively used portfolios was Kautz and Selman's Blackbox planner from 1999 (others, with fixed portfolios, include LPG and FF).
In the planning area, the construction of portfolios has been more of
an art than a science, typically constructing a fixed portfolio by
selecting a suitable combination of algorithms that have been experimentally (or trial and error) found to perform well with the target problem set.
References
Gomes & Selman, Algorithm portfolio design: theory vs. practice,
Uncertainty in AI, 1997.