A* Path Finding Algorithm

Fastest path between two points is a straight line… unless there's an obstacle!

Have you ever wondered how in the game PacMan, the ghosts would chase, or run-away from, PacMan? Or how about how Google Maps can give you the shortest route to the next gas station? How about, how networks discover the fastest route between them? In the world of programming this is accomplished through a Path Finding Algorithm, which is a graphing problem. In this article we'll explore one of the most popular and commonly used path finding algorithms in the majority of general purpose applications (especially games)! It's called A* or A-star search.

Point A to Point B

Getting from point A to point B is easy if you don't actually care about the shortest path. You could randomly change direction with each move and you would ultimately get there, but who on earth wants to do that?

In the popular game PacMan the ghosts used a very primitive way of searching for PacMan based on a simple heuristic commonly known as The Manhattan Distance and performed look-ahead search in order to create un-deterministic routes. This was by no means entirely effective, in case you haven't played PacMan before, but it does work efficiently in a small enough graph.

Now, the Manhattan Distance is an optimistic heuristic, because it assumes we have no obstacles in our way, however, it's very easy to apply and can still prove quite effective in A* search. We'll be using it in our implementation since it's the easiest to implement and grasp.

Learning Opportunity

It's called the Manhattan Distance, because it assumes our graph is laid out much like the street grid of the city of Manhattan. That is to say that our goal distance is equal to the summed differences between our current position and end position relative to the X and Y axis. So our heuristic would be something like abs(current.X - goal.X) + abs(current.Y - goal.Y), where all we care about is the total absolute value of distance.

First, let's take a look at how A* search actually works in an unweighted graph where all nodes have equal distance, rate-of-travel, and collision factors.

In the example above, the white squares represent unexplored areas of our graph, the red squares represent the areas we have explored, the green square represents the starting node (or Point A), and the yellow square represents the ending node (or Point B). The blue squares represent the plotted shortest path discovered by the search algorithm.

What you can immediately recognize is that the algorithm can actually exit early if it reaches the end node. Meaning, it doesn't actually need to explore the entire graph. However, it does fan out in the shape of a star or diamond, if you will. This is because we search all connecting nodes, likely all four primary cardinal directions and the four inter cardinal directions (N, S, W, E, NW, NE, SW, SE), recursively.

Of course, this is way too simplistic. Let's make it a little more complicated by putting some obstacle in the path of the starting and ending nodes and see how the algorithm will be able to plot our shortest path.

In the example above, black squares represent nodes in our graph that are obstacles, or that we can't visit, because they are blocked off. This means we can only travel around them, but not through them.

Distance from Start, Heuristic, and Cost

The easiest way to understand how this algorithm works is to calculate three basic factors for every square on our graph and observe the smallest value from those calculations in relation to the start and end nodes. These three factors are defined as follows.

Factor Description Example
G Distance from starting node node.X = parent.X + 10
H Our heuristic (Manhattan distance) (abs(current.X - end.X) + abs(current.Y - end.Y)) * 10
F Heuristic + Distance (G + H)

So just moving along the X and Y axis of the grid we can determine paths that are longer or shorter to the end node from the starting node. This isn't guaranteed to find the fastest path, because the heuristic is greedy and doesn't take a lot of other factors into account like time. Imagine you are building a game where some nodes in your graph represent water and others represent solid land. It could be the case that it takes longer to swim through water than it does to walk on land. You could tweak this heuristic to account for the added time difference between traveling each nodes and find a faster path, but for the sake of simplicity we'll be avoiding such complexities in our example here. Also, notice that our example may require exploring a lot of extra nodes, unnecessarily in many cases, since it doesn't use bounded relaxation in order to speed up the search. However, this is the good enough approach.

It's important to note that only the F factor is really considered when examining which path to follow, for the purposes of this example.

Here's the same example we saw earlier, but with the F factor plotted on the graph so that we can observe how the algorithm reached its conclusion.

Implementation

In order to implement this in javascript we'll need three basic objects. First, we'll need a Node object, which we'll use to create the graph. Each node should contain a link to its children (8 children in all), which can be used to traverse the graph. We'll name these properties by their cardinal direction abbreviations and initialize them to null.

var Node = function() {
    this.N      =
    this.S      =
    this.W      =
    this.E      =
    this.NW     =
    this.SW     =
    this.NE     =
    this.SE     =
    this.x      =
    this.y      =
                  null;
    this.G        = 0;
    this.H        = 0;
    this.F        = 0;
    this.obstacle = false;
    this.visited  = false;
}

We'll also need a Graph object which can be used to populate the nodes based on the desired size of our graph and perform other tasks like find a particular node.

var Graph = function(x, y, start, end, obs) {
    this.x     = x;
    this.y     = y;
    this.grid  = []; // our x * y grid as a vector
    this.start = null; // start node
    this.end   = null; // end node
    this.obs   = !(obs instanceof Array) ? [] : obs;
    this.init = function(start, end, obs) {
        /* Instantiate the Node objects here */
    }
    this.init(start, end, this.obs); // constructor call
}

Once you populate the graph with nodes you'll need to do two things. First, you must explore the graph, which is the discovery phase depicted by the red colored nodes in our earlier animation above. Second, you must retrace your steps back from the end node to the start node following the smallest F cost along the path of discovered nodes. This is assuming you've reached the end.

To do this we'll need a PathFinder object, which will begin at the start node by calculating it's F cost and checking all of it's child nodes and calculating each of the F costs as well. This step should repeat until you have either reached the end node or all nodes in the graph, reachable from the start node, have been explored. If all nodes reachable from the start node have been explored and the end node has still not been reached it is safe to say that the end node is unreachable and that we should give up. You also have two choices during the discovery phase. You can either choose to exit early when you have reached the end node (because we assume the first path discovered to the end node is the most optimal path, or you can choose to explore until all reachable nodes have been discovered. The second option usually only makes sense in an implementation that applies some form of linear least square, or successive relaxation technique to the heuristic factor, in order to reduce search time.

A Little Math

Wrapping your head around greedy algorithms and discrete math problems is not always trivial. Especially when you're trying to translate that math algorithmically into working code. So it might help if we try plotting the numbers on paper to see how the math actually adds up.

Here we start by looking at the four cardinal directions from the starting node, (i.e. North, South, West, and East) and calculate each of their G, H, and F costs. We'll also jot down an arrow to indicate which direction we're coming from as a reminder. This will come in handy later when we expand on the next iteration.

Notice that only the North and East nodes have the lowest F cost on the graph so far. So we can make a choice to randomly pick one in terms of which node to proceed to on our path or we can take additional factors into consideration if our implementation provides them (like the travel time I mentioned earlier). To keep things simple we're just using only the F cost and it doesn't matter if two nodes have the same F cost. We can pick one at random.

Looking at the inter-cardinal directions; NorthWest, NorthEast, SouthWest, and SouthEast, we can see that our G cost varies slightly. Remember that the G cost is just the successive Euclidean distance from the starting node. Meaning that inter-cardinal directions are the square root of the two intersecting line segments, which is more accurately 14.14213, but we're rounding down for simplicity to 14. Remember that each node should add its distance factor to it's parent node's distance factor. Since our starting node should have all factors set to null or 0, we're merely adding either 0 + 10 for all cardinal directions or 0 + 14 for all inter-cardinal directions.

From here we have explored all of the unexplored nodes in all 8 directions from the starting node. All nodes marked as red have been explored. All nodes still in white remain unexplored. We'll need to repeat this step until we have either explored all explorable nodes form the starting node or reached the ending node. Which ever comes first.

So you simply start at the starting node all over again and look at each of it's eight directions checking for the next unexplored node. Of course you find none, so you continue to examine each of the eight directions for the child nodes until you find an unexplored node.

Notice that by now we have technically found our path depending on whether your implementation allows for inter-cardinal path finding. For example, in some games it may not be possible to move in any direction other than East, West, South, or North. So we wouldn't be able to move diagonally towards the ending node.

Finally, you'll notice we may actually have to overwrite the factors of some of our nodes depending on how we wish to traverse these nodes. That's the sort of greedy part of this algorithm where you can get different results depending on your path, but still end up at the same node in the end!

I think I'll let your imagination run wild with you from here on how you can efficiently implement this algorithm for your particular use case…