
Classical Planning
Building Heuristics from RPG


6CCS3AIP Artificial Intelligence Planning
Dr Tommy Thompson

FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS

Hello and welcome to this chapter on classical planning. In this chapter we're going to look at how we can expand on the RPG heuristic to come up with two new heuristics that help explore the search space in different ways.

Domain Independent Heuristics
Building heuristics that can adapt to different domains/problems.

Not reliant on specific information about the problem.

We analyse aspects of the search and planning process to find potential heuristics.

As we've discussed throughout this segment, we're dealing with the idea of domain-independent heuristics. We want to find a way to calculate quick and useful pieces of information in our heuristics, such that we can move through the search space effectively.

Delete Relaxation
Estimate cost to the goal by removing negative effects of actions.
i.e. any PDDL effect that removes a fact is no longer considered.

Example: FreeCell Solitaire
Free cells, tableau positions remain available after moving cards onto them.
Cards remain movable and remain valid targets for other cards after moving cards on top of them.

To this end, we've looked at delete relaxation, whereby we remove the negative effects from our actions. By removing the negative effects, we now create a simplified and somewhat unrealistic version of the problem, known as the relaxed problem. We know that solving the relaxed problem, while usually much easier, is actually a good metric for judging whether we're moving towards the solution.
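
To make the idea concrete, here is a minimal sketch of delete relaxation over a simple STRIPS-style action representation. This is my own illustration, not code from the lecture: the Action class and the FreeCell-flavoured facts are hypothetical.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Action:
    """A simple STRIPS-style action: preconditions, add effects, delete effects."""
    name: str
    preconditions: frozenset
    add_effects: frozenset
    delete_effects: frozenset
    cost: float = 1.0

def relax(action: Action) -> Action:
    """Delete relaxation: drop every effect that removes a fact."""
    return replace(action, delete_effects=frozenset())

# Hypothetical FreeCell-flavoured action: normally moving a card to a free cell
# occupies that cell and clears the tableau position; in the relaxed version
# nothing is ever deleted, so the cell and the card stay available.
move = Action(
    name="move-card-to-freecell",
    preconditions=frozenset({"card-on-tableau", "freecell-empty"}),
    add_effects=frozenset({"card-in-freecell"}),
    delete_effects=frozenset({"card-on-tableau", "freecell-empty"}),
)
relaxed_move = relax(move)  # same action, with an empty set of delete effects
```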

Relaxed Planning Graph Heuristic
Relaxed Planning Graph (RPG) heuristic creates a simple layered approach to exploring the state space.

Inspired by the GraphPlan system
GraphPlan covered later in this module.
For reference, see (Blum & Furst, 1995)

The relaxed solution plan <O_0, O_1, ..., O_(m-1)>, where each O_i is the set of actions selected in parallel at time step i, and m is the number of the first fact layer containing all goals:
h(s) := Σ_{i=0}^{m-1} |O_i|
Blum, A. L., & Furst, M. L. (1995). Fast planning through planning graph analysis. In Proceedings of the 14th International Joint Conference on Artificial Intelligence (IJCAI-95), pp. 1636.

And this was of course the basis of our RPG heuristic, given we calculate the relaxed plan solution by enumerating fact and action layers until we find the goal, then iterate backwards from that point to see how many sets of actions exist at each time step. The relaxed planning layers give us an admissible heuristic, even if the number of actions itself isn't always admissible.
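
As a rough illustration of that forward-then-backward process, here is a minimal sketch of the RPG heuristic, reusing the simple Action representation from the delete-relaxation sketch above. The layer bookkeeping and achiever selection are simplifications of my own, not the exact extraction procedure used by FF.

```python
def rpg_heuristic(state, goals, actions):
    """Sketch of the RPG heuristic: h(s) := |O_0| + ... + |O_(m-1)|."""
    # Forward pass: grow fact layers, ignoring delete effects entirely.
    fact_layers = [frozenset(state)]
    action_layers = []
    while not set(goals) <= fact_layers[-1]:
        applicable = [a for a in actions if a.preconditions <= fact_layers[-1]]
        new_facts = fact_layers[-1] | {f for a in applicable for f in a.add_effects}
        if new_facts == fact_layers[-1]:
            return float("inf")            # goals unreachable even in the relaxation
        action_layers.append(applicable)
        fact_layers.append(new_facts)

    # Backward pass: at each time step, pick achievers for the facts still needed.
    h, needed = 0, set(goals)
    for i in range(len(action_layers) - 1, -1, -1):
        # O_i: one achieving action for every needed fact not yet true at layer i.
        chosen = {next(a for a in action_layers[i] if f in a.add_effects)
                  for f in needed if f not in fact_layers[i]}
        achieved = {f for a in chosen for f in a.add_effects}
        required = {p for a in chosen for p in a.preconditions}
        needed = (needed - achieved) | required
        h += len(chosen)                   # add |O_i| for this time step
    return h
```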

What Else Can We Gain From Relaxation?
Relaxed Planning Graph (RPG) heuristic gives us useful information from the relaxed problem.

The relaxed solution is one useful piece of information we can grab from the planning graph.

But is there any other useful information we can gather here?

But what else can we do with this information? The relaxed planning graphs visualise this breakdown of information as we go from the initial set of facts to the goal facts. At present we're only interested in the number of action layers and the number of actions within those layers. But there is a lot of information in here that we could still use, so let's look at some additional heuristics that we can pull out of a relaxed planning problem.

What Else Can We Gain From Relaxation?
One thing we have ignored throughout is the cost of actions involved.

If we factor the costs of actions into the planning graph, we can learn more about how close a given state is to the goal.

This works even in state spaces with uniform action costs.

Let's consider two heuristics: hAdd and hMax.

So one thing we have completely ignored to date is the notion of the costs of actions. If you recall from earlier work in areas such as A* search, we're interested in how expensive an action is.
A* calculates the g value, which is the cost of reaching this state based on the path taken, as well as the heuristic value.

But everything we've looked at since is only interested in the heuristic value. But if we can still use those costs inside the heuristic, then they will still prove useful to us.
Now you might be wondering why we don't just use the costs as they are, but they're no longer accurate in the relaxed problem. Still, we can gain some useful information from them in a relaxed planning graph, even in circumstances where the action costs are uniform, meaning they all have the same value.

To this end, we'll take a look at two specific heuristics, hAdd and hMax: two heuristics that aim to calculate how expensive it is to reach a given state variable.

Understanding Costs
Consider the cost of the actions required to satisfy each fact in our fact layers.
If the facts are already achieved, the value is 0.
If the facts are not achieved, the value is ∞.

So let's look at our planning graph in a slightly different way, and start thinking about the actions we are required to take in order for the facts to become true, as well as the cost of those actions along the way.
If we're in a situation where all the facts are currently achieved, such as in the initial fact layer, the value is 0, given we didn't have to execute any actions to achieve those facts.
But if there is a fact that isn't satisfied yet, it has a value of infinity, because we haven't reached it yet and don't know how expensive it will be to do so.
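
Written out as an equation (my notation, not the slide's), the initial cost assigned to each fact f in a state s is:

```latex
\[
  \mathrm{cost}(f) \;=\;
  \begin{cases}
    0      & \text{if } f \in s \quad \text{(already achieved)}\\
    \infty & \text{otherwise} \quad \text{(not yet reached)}
  \end{cases}
\]
```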

Understanding Costs
Consider the cost of the actions required to satisfy each fact in our fact layers.
If the facts are already achieved, the value is 0.
If the facts are not achieved, the value is ∞.

Image Credit: Heuristics for Planning, Keyder & Bonet, ICAPS 2009
[a = 0, b = ∞, c = ∞, d = ∞, e = ∞, G = ∞]

So let's consider this different approach, whereby we visualise all of the facts that can become true during the relaxed planning process. Between each of these facts are the costs of the actions that achieve them.
Our goal G is for both d and e to be true. So it's important to factor in how expensive each of those actions is, and this can tell us how close we are to the goal based on the facts that are emerging and the costs it takes us to achieve them.

hAdd (additive) heuristic
hAdd = We sum the value of all costs to reach a given fact.
Inadmissible, given its assumption that goals are independent.
[a = 0, b = 2, c = 7, d = 6, e = 6, G = 12]
Image Credit: Heuristics for Planning, Keyder & Bonet, ICAPS 2009

So let's consider hAdd, sometimes referred to as the additive heuristic, where we add up the costs of all actions to get us to the point that we are in now. This can be computed with a Dijkstra-style algorithm, whereby we sum the costs of actions and maintain the lowest total cost for each fact. Then for the final goal set, we simply add up the total costs of reaching each goal.
Hence while a is 0, b is 2, given that in order to reach this fact there was an action with a cost of 2. c has a value of 7, given it's the sum of the original cost to achieve b, which was 2, plus the extra action with a cost of 5. Note that both d and e are lower here, given that the other actions we could execute after reaching b enable us to achieve d and e with a value of 6, instead of 8 if we went via c. And the goal facts G are satisfied when both d and e are satisfied, hence it takes the sum of those two values as the overall value of 12.

In essence, what we're doing here is adding up the heuristic values of all the preconditions required to achieve the goal set. But, and this is the main part, we're not adding them up admissibly.
This approach is really more reflective of the cost of satisfying each individual goal and all of its preconditions, but the knock-on effect is that it is now inadmissible.
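
As a concrete sketch of how these values can be computed, here is a simple cost-propagation loop over the delete-relaxed problem, reusing the Action representation from the earlier sketch. The function, its name, and the example action costs below are my own reconstruction chosen to reproduce the slide's numbers; they are not the lecture's code or the original figure. Passing combine=sum gives hAdd.

```python
import math

def cost_propagation(state, goals, actions, combine):
    """Propagate fact costs over the delete-relaxed problem to a fixed point.

    combine=sum yields hAdd; combine=max yields hMax (see the next section).
    """
    cost = {f: 0.0 for f in state}             # achieved facts start at 0
    get = lambda f: cost.get(f, math.inf)      # unreached facts start at infinity

    changed = True
    while changed:
        changed = False
        for a in actions:
            pre = [get(p) for p in a.preconditions]
            if any(math.isinf(c) for c in pre):
                continue                        # action not yet reachable
            through_a = a.cost + (combine(pre) if pre else 0.0)
            for f in a.add_effects:
                if through_a < get(f):          # keep the cheapest way to reach f
                    cost[f] = through_a
                    changed = True

    goal_costs = [get(g) for g in goals]
    return combine(goal_costs) if goal_costs else 0.0

# Hypothetical actions consistent with the slide's numbers: a->b costs 2,
# b->c costs 5, and b->d, b->e each cost 4.
example_actions = [
    Action("a-to-b", frozenset({"a"}), frozenset({"b"}), frozenset(), cost=2),
    Action("b-to-c", frozenset({"b"}), frozenset({"c"}), frozenset(), cost=5),
    Action("b-to-d", frozenset({"b"}), frozenset({"d"}), frozenset(), cost=4),
    Action("b-to-e", frozenset({"b"}), frozenset({"e"}), frozenset(), cost=4),
]
h_add = cost_propagation({"a"}, {"d", "e"}, example_actions, combine=sum)  # 12
```

In this reconstructed example, both d and e rely on the action that achieves b, which is exactly the double counting discussed below: hAdd reports 12 even though the relaxed plan itself only costs 10.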

hAdd (additive) heuristic
hAdd = We sum the value of all costs to reach a given fact.
Inadmissible, given its assumption that goals are independent.
[G = (d, e)] the relaxed plan cost is 10
Image Credit: Heuristics for Planning, Keyder & Bonet, ICAPS 2009

Now we know this is inadmissible, because if we were to let FF solve it, it would generate the plan that we see here, where the total cost is actually 10.

Given the nature of the heuristic, it means that we're counting some actions several times over when we calculate the heuristic value.
Now this works fine if we run on the assumption that the goals are completely independent of one another.

But in most planning situations, they're not. You will typically work towards satisfying one goal while working on another, so there are a lot of shared actions. Even in the example you see here, the action that achieves fact b, moving from fact a, is shared by both d and e. Hence you're counting this cost twice. This is why, in this particular example, hAdd proves to be inadmissible.
Ultimately, it's no longer admissible because actions are being counted several times over when computing our heuristic value.

hMax heuristic
hMax = The most expensive path to reach a given fact.
Admissible, given it is based on the most expensive path to achieve G.
When costs are uniform (i.e. = 1), hMax is simply the number of RPG layers.
[a = 0, b = 2, c = 7, d = 6, e = 6, G = 6]
Image Credit: Heuristics for Planning, Keyder & Bonet, ICAPS 2009

Now let's consider this another way: what if we only calculate the most expensive path to each fact? This is the hMax heuristic, where we're only interested in just how expensive it was to reach that goal state. Or, in other words, we're simply taking the max of all the heuristic values of the goal preconditions.
So we calculate as before, only this time, when we reach the goal set and the goals are satisfied, we simply pick the most expensive of the paths that reached our satisfied goal set. Hence in this case, the value is 6.

Now this means that hMax is an admissible heuristic, because we're assuming that by paying for the most expensive part, we're also largely solving the rest of the planning problem. But the thing is, it's not terribly informative. It doesn't tell us all that much about how far we are from the goal. And one of the reasons for this is that there is an implicit assumption here that the goals, or rather the actions required to satisfy them, interweave to the point that, by basing it on the most expensive path, it is a safe bet that this is how long it will take us to satisfy all of the goals.

And that's a problem, as it is not all that accurate, especially once you have independent goals. What do we mean by that? Well, if we have independent goals, it means that the actions we need to achieve all of the goals don't really overlap with one another. So even in this trivial case, it's relying on the assumption that all of our facts lie along the same path, which does largely hold here, hence it's still reasonably informative for us in this example.
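
Using the same cost_propagation sketch and the same hypothetical example actions as in the hAdd section, hMax only swaps the combination function:

```python
# Reuses cost_propagation and example_actions from the hAdd sketch above.
h_max = cost_propagation({"a"}, {"d", "e"}, example_actions, combine=max)  # 6
# With uniform unit costs this collapses to the number of fact layers needed
# before all goal facts appear, i.e. the number of RPG layers.
```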

Classical Planning
Building Heuristics from RPG
6CCS3AIP Artificial Intelligence Planning
Dr Tommy Thompson

And that wraps up this video on additional heuristics that we can derive from our relaxed planning problem. Thanks for watching, and we'll see you in the next one.
