Learning Python with Advent of Code Walkthroughs

Dazbo's Advent of Code solutions, written in Python

Chitons

Advent of Code 2021 - Day 15

Day 15: Chiton

Useful Links

Concepts and Packages Demonstrated

heapqDijkstra’s Algorithm

dataclassPriority queuevisualisation

Problem Intro

We’re in a cavern with a low ceiling just above us, so we can only navigate in two dimensions. I.e. we can’t go up or down. The cavern makes up a graph of connected locations that we need to navigate through. We’re told that the cave walls are covered in chitons, and don’t want to bump into them!

Our sub’s computer has mapped out the chiton density, and converted this into grid of risk level, in the two dimensions we can navigate.

The input data looks like this:

1163751742
1381373672
2136511328
3694931569
7463417111
1319128137
1359912421
3125421639
1293138521
2311944581

We start in the top left. We can navigate the maze by moving in any orthogonal direction; we can’t move diagonally.

Part 1

We need to get to the bottom right, and we need to compute the total risk of the lowest risk route. Total risk is given by adding up the risk of every location we enter; thus, the risk of our starting location is not counted.

Setup

The usual stuff…

from __future__ import annotations
from copy import deepcopy
from dataclasses import dataclass
import logging
import os
import time
import heapq

SCRIPT_DIR = os.path.dirname(__file__) 
INPUT_FILE = "input/input.txt"
# INPUT_FILE = "input/sample_input.txt"

logging.basicConfig(format="%(asctime)s.%(msecs)03d:%(levelname)s:%(name)s:\t%(message)s", 
                    datefmt='%H:%M:%S')
logger = logging.getLogger(__name__)
logger.setLevel(level=logging.DEBUG)

The only new addition worthy of mention is the heapq module. This allows us to implement a priority queue. More on this later.

Solution

This is simple enough. We’re going to use a bog-standard Dijkstra solution.

Recall that in Day 9 we used a Breadth First Search (BFS) to flood fill an area. And in Day 12 we again used the BFS to find all the paths between a starting point and a destination.

Well, Dijkstra builds on the BFS algorithm, by allowing us to prioritise the paths we want to explore first, favouring paths with a lower cost. Thus, Dijkstra is a great algorithm to allow us to find the best path through a graph, where best might mean shortest, fastest, lowest_cost, etc. It’s also particularly suited to where movement in the graph may have variable cost.

And this is exactly the scenario we have here: some locations have higher risk scores. So, we want to prioritise the path that has the overall lowest risk cost.

The Dijkstra works just like the BFS, in that it uses a frontier that expands in all valid directions. However, unlike a BFS where we typically pop items off the frontier in the order they were added (FIFO), Dijkstra instead uses a priority queue for the frontier, where we pop off items that have the lowest priority. (Here, lowest could actually be highest; it really depends what behaviour you want.)

For our solution, we want to prioritise based on lowest cumulative risk. So, we do this by keeping track of our path and the total risk for our path. And as we explore the next possible move for each path, we always explore the path that has the lowest cumulative risk, first.

We’ll start with a Point dataclass:

@dataclass(frozen=True, order=True)
class Point():
    """ Point class, which knows how to return a list of all adjacent coordinates """
    
    # Return all adjacent orthogonal (not diagonal) coordinates
    DELTAS = [(dx,dy) for dx in range(-1, 2) for dy in range(-1, 2) if abs(dy) != abs(dx)]
    
    x: int
    y: int
    
    def neighbours(self) -> list[Point]:
        """ Return all adjacent orthogonal (not diagonal) Points """
        return [Point(self.x+dx, self.y+dy) for dx,dy in Point.DELTAS]

Now let’s create a Grid class to represent our input of risks:

class Grid():
    """ 2D grid of point values. Knows how to:
       - Determine value at any point
       - Determine all neighbouring points of a given point """
       
    def __init__(self, grid_array: list[list[int]]) -> None:
        """ Generate Grid instance from 2D array. 
        This works on a deep copy of the input data, so as not to mutate the input. """                                         
        self._array = deepcopy(grid_array)  # Store a deep copy of input data
        self._x_size = len(self._array[0])
        self._y_size = len(self._array)
        
    @property
    def x_size(self):
        """ Array width (cols) """
        return self._x_size
    
    @property
    def y_size(self):
        """ Array height (rows) """
        return self._y_size
    
    @property
    def array(self):
        return self._array

    def all_points(self) -> list[Point]:
        points = [Point(x, y) for x in range(self.x_size) for y in range(self.y_size)]
        return points
    
    def set_value_at_point(self, point: Point, value: int):
        self._array[point.y][point.x] = value
        
    def value_at_point(self, point: Point) -> int:
        """ Value at this point """
        return self._array[point.y][point.x]
    
    def _valid_location(self, point: Point) -> bool:
        """ Check if a location is within the grid """
        if (0 <= point.x < self.x_size and 0 <= point.y < self.y_size):
            return True
        
        return False
    
    def valid_neighbours(self, point:Point):
        """ Yield adjacent neighbour points """
        for neighbour in point.neighbours():
            if self._valid_location(neighbour):
                yield neighbour
    
    def __repr__(self) -> str:
        return "\n".join("".join(map(str, row)) for row in self._array)

There’s not much to say about this. We’ve seen it all before.

Note the cool Python shorthand. This:

0 <= point.x < self.x_size

is equivalent to this:

0 <= point.x and point.x < self.x_size

Now we’ll write the code that performs our Dijkstra:

def navigate_grid(grid: Grid) -> list[tuple[Point, int]]:    
    """  A Dijkstra BFS to get from top left to bottom right 

    Args:
        grid (Grid): 2d grid of risk values

    Returns:
        list[tuple[Point, int]]: A path, from beginning to end, as a list of (Point, risk)
    """
    start: Point = (Point(0,0))
    current: Point = start
    end: Point = (Point(grid.x_size-1, grid.y_size-1))
    
    frontier = []
    heapq.heappush(frontier, (0, current))   # (priority, location)
    
    came_from = {}  # So we can rebuild winning path from breadcrumbs later
    came_from[current] = None
    
    risk_so_far: dict[Point, int] = {}    # Store cumulative risk from grid values
    risk_so_far[current] = 0
    
    while frontier:
        priority, current = heapq.heappop(frontier)
        if current == end:
            break   # Goal reached
        
        for neighbour in grid.valid_neighbours(current):
            new_risk = risk_so_far[current] + grid.value_at_point(neighbour)
            if neighbour not in risk_so_far or new_risk < risk_so_far[neighbour]:
                risk_so_far[neighbour] = new_risk
                heapq.heappush(frontier, (new_risk, neighbour))
                came_from[neighbour] = current
    
    # Now we've reached our goal, build the winning path from breadcrumbs
    path: list[tuple[Point, int]] = []   # (location, risk)
    while current != start:
        risk_at_current = grid.value_at_point(current)
        path.append((current, risk_at_current))
        current = came_from[current]
    
    path.reverse()
    return path

Explanation:

Finally, we build the path to get to the end from the start, by following the breadcrumbs, and then finally reversing the order of the path. The final path is a list of tuples, where each tuple is (Point, risk-at-point).

And we run it like this:

input_file = os.path.join(SCRIPT_DIR, INPUT_FILE)
with open(input_file, mode="rt") as f:
    data = [[int(posn) for posn in row] for row in f.read().splitlines()]

# Part 1
grid = Grid(data)
path = navigate_grid(grid)
total_risk = sum([location[1] for location in path])
logger.info("Part 1 total risk: %d", total_risk)

How About a Quick Visualisation

I added this function to visualise my path:

def visualise_path(grid: Grid, path: list[tuple[Point, int]]):
    """ Render this paper and its dots as a scatter plot """
    all_x = [point.x for point in grid.all_points()]
    all_y = [point.y for point in grid.all_points()]
    labels = [grid.value_at_point(point) for point in grid.all_points()]
    path_points = [Point(0,0)] + [path_item[0] for path_item in path]
    
    axes = plt.gca()
    axes.set_aspect('equal')
    plt.axis("off") # hide the border around the plot axes
    axes.set_xlim(min(all_x)-1, max(all_x)+1)
    axes.set_ylim(min(all_y)-1, max(all_y)+1)
    axes.invert_yaxis()
    
    for point, label in zip(grid.all_points(), labels):
        if point in path_points:
            plt.text(point.x, point.y, s=str(label), color="r")
        else:
            plt.text(point.x, point.y, s=str(label), color="b")
        
    # axes.scatter(all_x, all_y)
    plt.show()

This is how it works:

Here’s what the visualisation looks like with the sample data:

Chiton Maze Path

Part 2

Urgh.

Now we’re told the cave is five times larger than our input data, in both the x and y dimensions. Thus, a total area 25x larger. We’re told our input data is actually just the first tile in a 5x5 grid of contiguous tiles. With each tile in either the x or y direction, the location risk is 1 higher than the risk level at the same position in the tile to the left or above (respectively).

Uber Grid

As before, we need to find the path from the top left to bottom right.

Our approach to Part 2 is as follows:

In our Grid class, we only need to add the following:

    def increment_grid(self):
        """ Increment the value of every point in the array by 1.
        However, max is 9, and values wrap around to 1, NOT 0. """
        for point in self.all_points():
            value = self.value_at_point(point)
            if value < 9:
                self.set_value_at_point(point, value+1)
            else:
                self.set_value_at_point(point, 1)

    def append_grid(self, adjacent_grid: Grid) -> Grid:
        """ Append the new grid to the right of this grid.
        This creates a new grid which is horizontally bigger.
        Returns the new grid """

        this_array = self.array
        other_array = adjacent_grid.array
        
        new_rows = []
        for y in range(self.y_size):
            new_rows.append(this_array[y] + other_array[y])
            
        return Grid(new_rows)

Now we need to create our new 5x5 set of tiles, using our input data as the top left tile:

def build_uber_grid(start_grid: Grid, rows: int, cols:int) -> Grid:
    """ Build an uber grid, made up of y*x tiles, where each tile is the size of the start grid.
    With each tile to the right or down, all values increase by 1, according to increment rules. 
    """
    
    # Create the nine additional permutations of the starting tile 
    # (since each digit in the tile can only be from 1-9 inclusive)
    tile_permutations: dict[int, Grid] = {}
    tile_permutations[0] = start_grid
    for i in range(1, 9):
        tile = Grid(tile_permutations[i-1].array)
        tile.increment_grid()
        tile_permutations[i] = tile  # each tile is an increment of the grid before

    # Now stich each adjacent tile together to make an uber row            
    tile_rows: list[Grid] = []  # to hold y rows of very wide arrays
    for row in range(rows):
        tile_row = tile_permutations[row] # set the first tile in the row
        for col in range(1, cols):   # now add additional tiles to make the complete row
            tile_row = tile_row.append_grid(tile_permutations[row+col])
            
        tile_rows.append(tile_row)
    
    # now convert our five long grids into a single list of rows
    uber_rows = []
    for tile_row in tile_rows:
        for row in tile_row.array:
            uber_rows.append(row)
            
    return Grid(uber_rows)   

Finally, let’s run it:

# Part 2
# Build a new grid, which is 5x5 extension of our existing grid
uber_grid = build_uber_grid(grid, 5, 5)
path = navigate_grid(uber_grid) # Re-run our search
total_risk = sum([location[1] for location in path])
logger.info("Part 2 total risk: %d", total_risk)

And the output looks something like this:

23:44:02.345:INFO:__main__:     Part 1 total risk: 619
23:44:11.259:INFO:__main__:     Part 2 total risk: 2922
23:44:11.262:INFO:__main__:     Execution time: 2.2146 seconds

Onwards!