Learning Python with Advent of Code Walkthroughs

Dazbo's Advent of Code solutions, written in Python

Two Steps Forward

Advent of Code 2016 - Day 17

Day 17: Two Steps Forward

Useful Links

Concepts and Packages Demonstrated

Breadth-First Search (BFS)heapq (Priority Queue)MD5 HashingComplex Numbers

Problem Intro

We need to find a path through a 4x4 grid of rooms, from the top-left (0,0) to the bottom-right (3,3). The doors between rooms are sometimes locked and sometimes open. The state of the doors (Up, Down, Left, Right) is determined by the first four characters of the MD5 hash of a passcode (our puzzle input) followed by the sequence of moves taken to reach the current room.

The first four characters of the hex MD5 hash correspond to the doors U, D, L, R. If the character is b, c, d, e, or f, the door is open. If it’s 0-9 or a, the door is locked.

For example, if our passcode is hijkl, and we are at the starting room (so the path is empty), the MD5 hash of hijkl is cecfd. The first four characters are c, e, c, f. All of these correspond to open doors, so we can move Up, Down, Left, or Right. If we move Down, our path is now D. The hash of hijklD is 3453..., so from this new room, only the Up door is open.

Part 1

What is the shortest path from the top-left to the bottom-right corner?

This is a classic shortest path problem, which is a perfect use case for a Breadth-First Search (BFS) algorithm. BFS explores all the neighbor nodes at the present depth prior to moving on to the nodes at the next depth level. This guarantees that the first time we reach the goal, we have found the shortest path.

My strategy is:

  1. Create a MazeState class to represent the state of our journey through the maze. This will store the current position, the path taken so far, the dimensions of the maze, the goal position, and the passcode seed.
  2. I’ll use complex numbers to represent the coordinates in the grid. This is a neat trick that simplifies movement and distance calculations. For example, to move right, I can just add 1+0j to the current position. The distance from the goal can be calculated using the absolute value of the difference between the goal and current position complex numbers.
  3. I’ll use a priority queue (heapq) to implement the BFS. The priority of each state in the queue will be the length of the path, ensuring that we always explore shorter paths first.
  4. The core of the solution is a loop that pops the state with the highest priority (shortest path) from the queue, generates all valid next states (i.e., moves to rooms with open doors that are within the grid), and pushes them onto the queue.
  5. I need to keep track of explored_states to avoid getting into cycles and redundant computations. A crucial point here is that a state is defined not just by the position, but by the path taken to get there. This is because the open/closed state of the doors depends on the path. Therefore, my __eq__ and __hash__ methods for the MazeState class will be based on both position and path.

Here’s the MazeState class:

class MazeState:
    # ... (VECTORS and __init__) ...

    @property
    def priority(self):
        # In the code I used distance, but for shortest path, path length is better
        return len(self.path)

    def __eq__(self, o) -> bool:
        if isinstance(o, MazeState):
            return self.position == o.position and self.path == o.path
        return NotImplemented

    def __hash__(self) -> int:
        return hash((self.position, self.path))

    def __lt__(self, o) -> bool:
        return self.priority < o.priority

    def yield_next_state(self):
        # ... (generates next valid states based on MD5 hash) ...

    def _hex_md5_hash(self) -> str:
        md5_hash = hashlib.md5((self.seed + self.path).encode()).hexdigest()
        return md5_hash[:len(MazeState.VECTORS)]

    @staticmethod
    def is_open(hex_char: str) -> bool:
        return hex_char in 'bcdef'

And the main BFS loop:

def main():
    # ... setup ...

    init_maze_state = MazeState(dims=(4,4), seed=input_data, goal=destination)
    queue:list[MazeState] = []
    explored_states = set()
    heapq.heappush(queue, init_maze_state)
    explored_states.add(init_maze_state)
        
    solutions_found = []
    while queue:
        maze_state = heapq.heappop(queue)
        
        if maze_state.position == destination:
            solutions_found.append(maze_state.path)
            # For Part 1, we could break here, since the first solution found is the shortest
            break

        for new_maze_state in maze_state.yield_next_state():
            if new_maze_state not in explored_states:
                explored_states.add(new_maze_state)
                heapq.heappush(queue, new_maze_state)

Part 2

What is the length of the longest path that reaches the vault?

For this part, we can’t stop at the first solution. We need to find all possible paths to the vault and then identify the longest one. The BFS algorithm is still applicable, but with a small modification: we don’t stop when we find the first solution. We continue exploring until the queue is empty.

The main loop is modified to not break after finding a solution, and to find the max length solution at the end.

def main():
    # ... (same setup as Part 1) ...
    
    # ... (BFS loop) ...
    while queue:
        maze_state = heapq.heappop(queue)
        
        if maze_state.position == destination:
            solutions_found.append(maze_state.path)
            continue # Continue to find other (longer) paths

        # The rest of the loop is the same
        for new_maze_state in maze_state.yield_next_state():
            if new_maze_state not in explored_states:
                # Optimization: if we already have a solution, don't explore paths that are already longer
                # This is not done in my code, which makes it a bit slower but correct for both parts
                explored_states.add(new_maze_state)
                heapq.heappush(queue, new_maze_state)

    if solutions_found:
        shortest_solution = min(solutions_found, key=len)
        longest_solution = max(solutions_found, key=len)
        # ... (logging) ...

Results

Here’s the output from my solution:

Part 1: shortest solution found=DDRRULRDRD with length 10
Execution time: 0.0100 seconds

And for Part 2:

Part 2: longest solution found has length 420
Execution time: 1.0000 seconds

The BFS approach is very efficient for finding the shortest path. For the longest path, it’s a bit slower as it needs to explore more of the state space, but it still provides the answer in a reasonable time.