Dazbo's Advent of Code solutions, written in Python
Graphs and Graph TheoryNetworkXA* AlgorithmHeapq
These algorithms help you find optimum paths through a graph.
Honestly, you won’t get far in Advent-of-Code if you don’t know them. Eric loves challenges like…
Here are my three top reasons why you should learn this:
First, let me provide a quick summary and comparison. Then we’ll get into the details after.
Algorithm | Summary | When To Use It |
---|---|---|
Breadth First Search (BFS) | Aka "flood fill". Explores equally in all directions. Uses a FIFO queue.
A list will work,
but a deque is best. |
For finding the shortest path where the _cost_ is constant for any move. Or, for finding every point in a graph. Or for finding ALL paths from a to b. |
Dijkstra’s Algorithm | Flood fill + a cost-based optimisation. I.e. prioritises paths to explore, favouring lower cost paths. Uses a _priority queue_, i.e. heapq . |
When movement costs may vary. |
A* | Optimised for a single destination. I.e. prioritises paths that _seem_ to be approaching the goal. Still uses a heapq , but uses some sort of cost heuristic. Depending on the value of the distance heuristic, it may not add much value over Dijkstra, and may actually be slower. |
When we have an idea of the direction we need to follow. |
All these algorithms utilise the concept of an expanding frontier. This expanding frontier is sometimes called a flood fill. It gives us a way to visit every allowed location on a map.
How the frontier works:
Here’s some code that will perform a flood fill to explore every valid location in the graph:
frontier = Queue() # For BFS, we just use a FIFO queue.
frontier.append(start)
explored = set() # To stop us exploring the same location more than once.
explored.add(start)
# keep going until the frontier is empty; i.e. when we've explored all the valid nodes
while frontier:
current = frontier.popleft() # pop the first item off the FIFO queue
for next in graph.neighbors(current): # get all the valid neighbours
if next not in explored:
frontier.append(next) # add it to the frontier
explored.add(next) # mark it as explored
BFS can be used to find a path from a to b. It can also be used to find paths from a to all locations in the graph.
It is guaranteed to find a solution if one exists. The frontier expands at an equal rate in all directions, and uses a FIFO (first-in, first-out) queue to implement the frontier.
distance=1
from the root. We add those to the frontier.distance=2
from the root.So the sequence of exploration looks like this:
The numbers represent the sequence of traversal. But note that it is possible for the same node to appear in the tree more than once.
When we find a node that matches our goal, we stop. Thus, a BFS is just a flood fill, with a goal.
We can implement the queue in many ways:
list
.deque
to implement a FIFO queue; it is much more efficient for appending and popping at each end.heapq
, then we end up popping based on a priority rather than FIFO. This is how we implement Dijkstra’s Algorithm or A*.stack
, then we end up with last-in, first-out (LIFO), which actually turns a BFS into a depth first search (DFS)!Here’s some pseudocode that shows how to implement the BFS:
function BFS(graph, root):
queue = new Queue()
label root as explored # we know about this node, but not its children
queue.enqueue(root)
while queue not empty:
current = queue.pop()
if current is goal:
return current
for neighbour in graph.adjacent_nodes(current):
if neighbour is not explored:
label neighbour as explored
queue.enqueue(neighbour)
There are a couple of convenient ways to label a node as explored:
set
.dictionary
If we use a set
:
frontier = deque() # For BFS, we need a FIFO queue. A deque is perfect.
frontier.append(start)
explored = set() # To stop us exploring the same location more than once.
explored.add(start)
# keep going until the frontier is empty; i.e. when we've explored all the valid nodes
while frontier:
current = frontier.popleft() # pop the first item off the FIFO queue
if current == goal:
break
for next in graph.neighbors(current): # get all the valid neighbours
if next not in explored:
frontier.append(next) # add it to the frontier
explored.add(next) # mark it as explored
if current != goal:
# there was no solution!
If we’re only interested in the shortest distance to a point, but we don’t need to create the actual path, a cool trick is to keep track of the distance in our queue. E.g.
frontier = deque() # For BFS, we need a FIFO queue. A deque is perfect.
frontier.append(start, 0) # Queue always includes (point, steps)
explored = set() # To stop us exploring the same location more than once.
explored.add(start)
# keep going until the frontier is empty; i.e. when we've explored all the valid nodes
while frontier:
current, steps = frontier.popleft() # pop the first item off the FIFO queue
if current == goal:
break
for next in graph.neighbors(current): # get all the valid neighbours
if next not in explored:
frontier.append(next, steps+1) # add it to the frontier
explored.add(next) # mark it as explored
if current != goal:
# there was no solution!
# distance from start to goal = steps
If we use a dictionary
instead of a set
, we can create a breadcrumb trail to each point we’ve explored.
Let’s explore every valid point in the graph:
frontier = deque() # For BFS, we just use a FIFO queue. A deque is perfect.
frontier.append(start)
came_from = {} # To stop us exploring the same location more than once; and for breadcrumbs
came_from[start] = None # This marks the last breadcrumb
# keep going until the frontier is empty; i.e. when we've explored all the valid nodes
while frontier:
current = frontier.popleft() # pop the first item off the FIFO queue
for next in graph.neighbors(current): # get all the valid neighbours
if next not in came_from:
frontier.append(next) # add it to the frontier
came_from[next] = current
We can then create the breadcrumb path, like this:
current = goal # start at the end
path = [] # to store every point in a specific path
while current != start:
path.append(current)
current = came_from[current] # get the next node in the trail
path.append(start) # optional - depends if we want the first node to be included or not
path.reverse() # optional - depends whether we want start->end or end->start
This algorithm is like BFS. But instead of expanding the frontier in all directions with equal weight - called an unweighted pathfinder algorithm - we instead prioritise popping the node that has the lowest overall cost.
Dijkstra’s algorithm is perfect for:
heapq
. The heapq
data structure allows us to efficiently pop the item with the lowest priotity.Check out the illustration below, that shows how the algorithm finds our goal, H:
The A* Algorithm is a lot like BFS. But instead of expanding frontier in all directions equally - called an unweighted pathfinder algorithm - we prioritise popping from the frontier in the direction that we believe takes us closer to our goal. Thus, a weighted pathfinder algorithm.
To do this, we need two key changes, compared to the BFS:
heapq
. Recall that a heapq
allows us to pop elements based on a priority.So, an A* implementation requires the combination of:
We need both, because if we only include the distance heuristic, then we end up with a greedy BFS which tries to get closer to the goal without considering the path cost so far.
Here’s a BFS that creates a breadcrumb path. The priority of the pop is simply the number of steps taken from the starting point.
def solve_with_bfs(start:Point, goal:Point) -> tuple[int, dict]:
queue = []
heapq.heappush(queue, (0, start)) # posn, steps
came_from = {} # use a dict so we can build a breadcrumb trail
came_from[start] = None
while queue:
step_count, curr_posn = heapq.heappop(queue)
# Check if the target is reached
if curr_posn == goal:
return step_count, came_from
# Explore adjacent hexagons
for dx, dy in Hex.hex_vectors.values():
adjacent = curr_posn + Point(dx, dy)
if adjacent not in came_from:
came_from[adjacent] = curr_posn
heapq.heappush(queue, (step_count + 1, adjacent))
assert False, "We should never get here."
Here I’ve changed the code to an A*. Now I’m popping from the heapq
, using a priority that is sum of the number of steps taken, and the estimated distance to the goal.
def solve_with_astar(start: Point, goal: Point) -> tuple[int, dict]:
""" Obtain the fastest path from origin to goal using A*.
Returns: tuple[int, dict]: steps required, breadcrumbs
"""
queue = []
came_from = {} # use a dict so we can build a breadcrumb trail
came_from[start] = None
heapq.heappush(queue, (0, 0, start)) # distance heuristic, steps, posn
while queue:
_, step_count, curr_posn = heapq.heappop(queue)
# Check if the target is reached
if curr_posn == goal:
return step_count, came_from
# Explore adjacent hexagons
for dx, dy in Hex.hex_vectors.values():
adjacent = curr_posn + Point(dx, dy)
if adjacent not in came_from:
came_from[adjacent] = curr_posn
heuristic_cost = Hex.distance(adjacent, goal)
new_cost = step_count + 1 + heuristic_cost # heuristic is combination of distance to goal, and step count
heapq.heappush(queue, (new_cost, step_count + 1, adjacent))
assert False, "We should never get here."
I compared these two solutions when solving 2017 day 11. The BFS took 30 seconds, and the A* took just 20ms!!
When I first learned these algorithms, the best guide I came across was this one. I wrote a lot of notes when I first read it, and in writing this page today, I’ve been referring to my previous notes. So, the content I’ve written here will likely have a lot in common with the amazing content from https://www.redblobgames.com.