The fastest path between two points is a straight line… unless there's an obstacle!

Have you ever wondered how the ghosts in the game PacMan chase, or run away from, PacMan? Or how
Google Maps can give you the shortest route to the next gas station? Or how networks
discover the fastest route between each other? In the world of programming this is accomplished through a
*pathfinding algorithm*, which is a graph problem. In this article we'll explore one of the
most popular and commonly used pathfinding algorithms in general purpose applications
(*especially games*)! It's called **A*** or *A-star* search.

Getting from point A to point B is easy if you don't actually care about the shortest path. You could randomly change direction with each move and you would ultimately get there, but who on earth wants to do that?

In the popular game *PacMan* the ghosts used a very primitive way of searching for PacMan, based
on a simple heuristic commonly known as **the Manhattan Distance**, and performed look-ahead
search in order to create non-deterministic routes. As anyone who has played PacMan can attest, this was
by no means entirely effective, but it does work efficiently on a small enough graph.

Now, the Manhattan Distance is an *optimistic* heuristic, because it assumes we have no obstacles in our way. However, it's very easy to apply and can still prove quite effective in A* search. We'll be using it in our implementation since it's the easiest to implement and grasp.

## Learning Opportunity

It's called the Manhattan Distance because it assumes our graph is laid out much like the street grid of the city of Manhattan. That is to say, our goal distance is equal to the summed absolute differences between our current position and the end position along the X and Y axes. So our heuristic would be something like

`abs(current.X - goal.X) + abs(current.Y - goal.Y)`

where all we care about is the total absolute distance.
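As a quick sketch, the heuristic above translates almost directly into JavaScript (here using the lowercase `x`/`y` coordinates our `Node` objects will carry later):

```javascript
// Manhattan distance heuristic: the sum of the absolute
// differences along the X and Y axes.
function manhattan(current, goal) {
    return Math.abs(current.x - goal.x) + Math.abs(current.y - goal.y);
}
```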

First, let's take a look at how A* search actually works in an unweighted graph where all nodes have equal distance, rate-of-travel, and collision factors.

In the example above, the white squares represent unexplored areas of our graph, the red squares represent
the areas we have explored, the green square represents the starting node (or Point A), and the yellow square represents
the ending node (or Point B). The blue squares represent the plotted *shortest path* discovered by
the search algorithm.

What you can immediately recognize is that the algorithm can actually exit early if it reaches the end
node. Meaning, it doesn't actually need to explore the entire graph. However, it does fan out in the
shape of a *star*, or diamond if you will. This is because we search all connecting nodes, typically
all four cardinal directions and the four intercardinal directions (N, S, W, E, NW, NE, SW, SE),
recursively.
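For reference, those eight directions can be encoded as `(x, y)` offsets, a sketch that mirrors the cardinal abbreviations we'll use as property names later:

```javascript
// The eight neighbor offsets explored from every node,
// keyed by cardinal and intercardinal direction.
var DIRECTIONS = {
    N:  [ 0, -1], S:  [ 0, 1], W:  [-1,  0], E:  [1,  0],
    NW: [-1, -1], NE: [ 1, -1], SW: [-1,  1], SE: [1,  1]
};
```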

Of course, this is way too simplistic. Let's make it a little more complicated by putting some obstacles between the starting and ending nodes and see how the algorithm will be able to plot our shortest path.

In the example above, black squares represent nodes in our graph that are obstacles, or that we can't visit, because they are blocked off. This means we can only travel around them, but not through them.

The easiest way to understand how this algorithm works is to calculate three basic factors for every square on our graph and observe the smallest value from those calculations in relation to the start and end nodes. These three factors are defined as follows.

Factor | Description | Example |
---|---|---|
`G` | Distance from the starting node | `node.G = parent.G + 10` |
`H` | Our heuristic (Manhattan distance) | `(abs(current.X - end.X) + abs(current.Y - end.Y)) * 10` |
`F` | Heuristic + Distance | `G + H` |
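Put together, scoring a single square might look like the following sketch (the `score` helper name is mine, not part of the implementation below):

```javascript
// Hypothetical scoring of one square: G accumulates from the parent,
// H is the scaled Manhattan distance to the end node, and F = G + H.
function score(node, parent, end) {
    node.G = parent.G + 10; // a straight (cardinal) step costs 10
    node.H = (Math.abs(node.x - end.x) + Math.abs(node.y - end.y)) * 10;
    node.F = node.G + node.H;
    return node.F;
}
```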

So just by moving along the X and Y axes of the grid we can determine longer or shorter paths from the
starting node to the end node. This isn't guaranteed to find the fastest path, because the heuristic
is greedy and doesn't take a lot of other factors into account, like time. Imagine you are building a game
where some nodes in your graph represent water and others represent solid land. It could be the case that
it takes longer to swim through water than it does to walk on land. You could tweak this heuristic to
account for the added time difference of traveling through each node and find a faster path, but for the
sake of simplicity we'll be avoiding such complexities in our example here. Also, notice that our example
may require exploring a lot of extra nodes, unnecessarily in many cases, since it doesn't use **bounded
relaxation** in order to speed up the search. However, this is the *good enough* approach.
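As a rough sketch of that water-versus-land tweak, a per-terrain weight could scale each step's cost (the `WEIGHTS` values here are assumptions for illustration only):

```javascript
// Hypothetical terrain-aware step cost: multiply the base move cost
// by the destination node's travel weight (water is slower than land).
var WEIGHTS = { land: 1, water: 3 }; // assumed relative travel times

function stepCost(baseCost, terrain) {
    return baseCost * WEIGHTS[terrain];
}
```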

It's important to note that only the `F` factor is really considered when examining which path
to follow, for the purposes of this example.

Here's the same example we saw earlier, but with the `F` factor plotted on the graph so that
we can observe how the algorithm reached its conclusion.

In order to implement this in JavaScript we'll need three basic objects. First, we'll need a `Node`
object, which we'll use to create the graph. Each node should contain a link to each of its children
(*8 children in all*), which can be used to traverse the graph. We'll name these properties by their
cardinal direction abbreviations and initialize them to `null`.

```javascript
var Node = function() {
    this.N = this.S = this.W = this.E =
    this.NW = this.SW = this.NE = this.SE =
    this.x = this.y = null;
    this.G = 0;
    this.H = 0;
    this.F = 0;
    this.obstacle = false;
    this.visited = false;
}
```

We'll also need a `Graph` object, which can be used to populate the nodes based on the desired
size of our graph and perform other tasks, like finding a particular node.

```javascript
var Graph = function(x, y, start, end, obs) {
    this.x = x;
    this.y = y;
    this.grid = []; // our x * y grid as a vector
    this.start = null; // start node
    this.end = null; // end node
    this.obs = !(obs instanceof Array) ? [] : obs;
    this.init = function(start, end, obs) {
        /* Instantiate the Node objects here */
    }
    this.init(start, end, this.obs); // constructor call
}
```
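One possible way to flesh out that `init` step is sketched below. Plain objects stand in for the `Node` constructor, row-major indexing (`index = y * width + x`) is an assumption, and linking the eight neighbor pointers is omitted for brevity:

```javascript
// A sketch of graph population: build x * y nodes with coordinates,
// then flag the start, end, and obstacle nodes.
function initGrid(graph, start, end, obs) {
    for (var j = 0; j < graph.y; j++) {
        for (var i = 0; i < graph.x; i++) {
            graph.grid.push({
                x: i, y: j,
                G: 0, H: 0, F: 0,
                obstacle: false, visited: false
            });
        }
    }
    // look a node up by its (x, y) position, row-major
    var at = function(p) { return graph.grid[p.y * graph.x + p.x]; };
    graph.start = at(start);
    graph.end = at(end);
    obs.forEach(function(p) { at(p).obstacle = true; });
}
```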

Once you populate the graph with nodes you'll need to do two things. First, you must explore the graph,
which is the discovery phase depicted by the red colored nodes in our earlier animation above. Second,
you must retrace your steps from the end node back to the start node, following the smallest `F`
cost along the path of discovered nodes. This is assuming you've reached the end.
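The retracing step can be sketched in a few lines, assuming each node records the parent it was discovered from during the search (a `parent` property, which is an assumption of this sketch):

```javascript
// Walk parent links from the end node back to the start node,
// building the path in start-to-end order.
function retrace(end) {
    var path = [];
    for (var node = end; node; node = node.parent) {
        path.unshift(node); // prepend, so the start ends up first
    }
    return path;
}
```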

To do this we'll need a `PathFinder` object, which will begin at the start node by calculating
its `F` cost, then checking all of its child nodes and calculating each of their `F` costs as well.
This step should repeat until you have either reached the end node or explored all nodes in the
graph reachable from the start node. If all nodes reachable from the start node
have been explored and the end node has still not been reached, it is safe to say that the end node
is unreachable and that we should give up. You also have two choices during the discovery phase. You can
either choose to exit early when you have reached the end node (because we assume the first path discovered
to the end node is the most optimal path), or you can choose to explore until all reachable nodes have been
discovered. The second option usually only makes sense in an implementation that applies some form of
linear least squares or successive relaxation technique to the heuristic factor, in order to reduce
search time.
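The discovery phase described above can be sketched roughly as follows. This is a minimal illustration, not the article's `PathFinder` implementation: it works on a flat row-major array of plain `{ x, y, G, H, F, parent, obstacle }` objects rather than linked `Node` objects, and it only moves in the four cardinal directions:

```javascript
// A minimal A* discovery loop: repeatedly take the open node with the
// smallest F cost, stop early at the end node, and retrace via parents.
function findPath(grid, width, start, end) {
    var height = grid.length / width;
    var open = [start], closed = {};
    start.H = (Math.abs(start.x - end.x) + Math.abs(start.y - end.y)) * 10;
    start.F = start.G + start.H;
    while (open.length > 0) {
        // pick the open node with the smallest F cost
        var idx = 0;
        for (var i = 1; i < open.length; i++) {
            if (open[i].F < open[idx].F) idx = i;
        }
        var current = open.splice(idx, 1)[0];
        if (current === end) {
            // retrace from the end back to the start along parent links
            var path = [];
            for (var n = end; n; n = n.parent) path.unshift(n);
            return path;
        }
        closed[current.y * width + current.x] = true;
        [[0, -1], [0, 1], [-1, 0], [1, 0]].forEach(function(d) {
            var nx = current.x + d[0], ny = current.y + d[1];
            if (nx < 0 || ny < 0 || nx >= width || ny >= height) return;
            var key = ny * width + nx;
            var next = grid[key];
            if (next.obstacle || closed[key]) return;
            var g = current.G + 10; // cardinal step cost
            if (open.indexOf(next) === -1 || g < next.G) {
                next.G = g;
                next.H = (Math.abs(next.x - end.x) + Math.abs(next.y - end.y)) * 10;
                next.F = next.G + next.H;
                next.parent = current;
                if (open.indexOf(next) === -1) open.push(next);
            }
        });
    }
    return null; // end node unreachable: give up
}
```

Note the two exit conditions match the choices above: returning as soon as `current === end` is the early exit, and falling out of the `while` loop with an empty open list is the "explored everything reachable, give up" case.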

Wrapping your head around greedy algorithms and discrete math problems is not always trivial, especially when you're trying to translate that math into working code. So it might help if we plot the numbers on paper to see how the math actually adds up.

Here we start by looking at the four cardinal directions from the starting node (i.e. North, South,
West, and East) and calculate each of their `G`, `H`, and `F` costs.
We'll also jot down an arrow to indicate which direction we're coming from as a reminder. This will
come in handy later when we expand on the next iteration.

Notice that only the North and East nodes have the lowest `F` cost on the graph so far. So
we can make a choice to randomly pick one in terms of which node to proceed to on our path, or we can
take additional factors into consideration if our implementation provides them (like the travel time I
mentioned earlier). To keep things simple we're using only the `F` cost, and it doesn't
matter if two nodes have the same `F` cost. We can pick one at random.

Looking at the intercardinal directions (NorthWest, NorthEast, SouthWest, and SouthEast), we can see that
our `G` cost varies slightly. Remember that the `G` cost is just the successive
distance from the starting node. An intercardinal (diagonal) move is the Euclidean length of the
diagonal between two intersecting line segments, the square root of the sum of their squares, which is
more accurately `14.14213`, but we're rounding down to `14` for simplicity. Remember that each node
should add its distance factor to its parent node's distance factor. Since our starting node should
have all factors set to `null` or `0`, we're merely adding either `0 + 10` for all cardinal directions
or `0 + 14` for all intercardinal directions.
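The diagonal figure comes straight out of the Pythagorean theorem, which a two-line sketch makes concrete:

```javascript
// The diagonal of a square cell with sides of length 10 is
// sqrt(10^2 + 10^2) = 14.14213..., rounded down to 14.
var CARDINAL_COST = 10;
var DIAGONAL_COST = Math.floor(Math.sqrt(2 * CARDINAL_COST * CARDINAL_COST));
```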

From here we have explored all of the unexplored nodes in all 8 directions from the starting node. All nodes marked in red have been explored. All nodes still in white remain unexplored. We'll need to repeat this step until we have either explored all explorable nodes from the starting node or reached the ending node, whichever comes first.

So you simply start at the starting node all over again and look at each of its eight directions, checking for the next unexplored node. Of course you find none, so you continue to examine each of the eight directions of the child nodes until you find an unexplored node.

Notice that by now we have technically found our path, depending on whether your implementation allows for intercardinal pathfinding. For example, in some games it may not be possible to move in any direction other than East, West, South, or North. So we wouldn't be able to move diagonally towards the ending node.

Finally, you'll notice we may actually have to overwrite the factors of some of our nodes depending on
how we wish to traverse these nodes. That's the sort of *greedy* part of this algorithm where
you can get different results depending on your path, but still end up at the same node in the end!

I think I'll let your imagination run wild with you from here on how you can efficiently implement this algorithm for your particular use case…