Learning Python with Advent of Code Walkthroughs

Dazbo's Advent of Code solutions, written in Python

hydrothermal vents

Advent of Code 2021 - Day 5

Day 5: Hydrothermal Venture

Useful Links

Concepts and Packages Demonstrated

NumPyDataclassRegular Expressions (regex)classesdecorator

NamedTupleCounterpropertymaxgeneratoryieldlambda

Problem Intro

I’ve written two solutions for this problem.

We’re told there’s a field of hydrothermal vents on the ocean floor. They produce dangerous plumes that we need to avoid. The vents occur in lines. Our input data describes the lines of vents, with each line in the format x1,y1 -> x2,y2. I.e. the (x,y) points that represent the two ends of each line. Here’s some sample input:

0,9 -> 5,9
8,0 -> 0,8
9,4 -> 3,4
2,2 -> 2,1
7,0 -> 7,4

Part 1

We’re told the vent lines can overlap. Overlapping points are more dangerious. Only considering orthogonal lines (i.e. horizontal and vertical), at how many points do at least two lines overlap?

Solution #1

Here we’re going to use NumPy.

Setup

import logging
import os
import time
import re
from dataclasses import dataclass
import numpy as np

SCRIPT_DIR = os.path.dirname(__file__) 
INPUT_FILE = "input/input.txt"
# INPUT_FILE = "input/sample_input.txt"

logging.basicConfig(format="%(asctime)s.%(msecs)03d:%(levelname)s:%(name)s:\t%(message)s", 
                    datefmt='%Y-%m-%d %H:%M:%S')
logger = logging.getLogger(__name__)
logger.setLevel(level=logging.INFO)

The notable inclusions here are the imports for regular expressions (re), dataclasses, and Numpy.

Solving the Problem

First, we create our Line dataclass:

@dataclass
class Line:
    """ A vertical, horizontal or diagonal line. """
    x1: int
    y1: int
    x2: int
    y2: int
    
    @property
    def is_orthogonal(self) -> bool:
        """ I.e. whether horizontal or vertical line.
        If both x are the same, then horizontal.
        If both y are the same, then vertical. """
        return self.x1 == self.x2 or self.y1 == self.y2

    @property
    def diagonal_down(self) -> bool:
        """ Determine if the diagonal line increases in both x and y axes.
        If it increases in one axis but decreases in the other, then it slopes up. """
        assert not self.is_orthogonal, "Must be diagonal"
        return self.x1 - self.x2 == self.y1 - self.y2

    @property    
    def min_x(self) -> int:
        return min(self.x1, self.x2) 
    
    @property
    def min_y(self) -> int:
        return min(self.y1, self.y2)

    @property
    def max_x(self) -> int:
        return max(self.x1, self.x2)

    @property
    def max_y(self) -> int:
        return max(self.y1, self.y2)

Here we’re using a class to represent Line objects. Python supports object-oriented programming, and consequently allows us create objects. An object is a way to describe something that has properties (attributes) and behviours (methods). A class is like a blueprint, allowing us to create instances of a given type of object. In this case, we’ve created a Line class, allowing us to create individual instances of Line. I.e. each Line is an instance.

We’re also using the dataclass decorator, which tells Python that we want the Line object to be treated as dataclass. This reduces the amount of basic implementation we need to write in the class. E.g. we can simply define the x1, y1, x2 and y2 properties, but we don’t need to write any code to initialise these variables. Note that when we create a Line, we should pass in the variables in the order we’ve defined in them in the dataclass. That’s how dataclasses work! For example:

a_line = Line(x1, y1, x2, y2)

Here’s things that instances of this class knows how to do:

Now we can process the input data:

input_file = os.path.join(SCRIPT_DIR, INPUT_FILE)
with open(input_file, mode="rt") as f:
    data = f.read().splitlines()
    
logger.debug("\n%s", "\n".join(str(vent) for vent in vents))

def process_data(data: list[str]) -> list[Line]:
    """ Parse data of format x1,y1 -> x2,y2 """
    lines = []
    for line in data:
        # get non-overlapping matching groups
        x1, y1, x2, y2 = map(int, re.findall(r"(\d+),(\d+) -> (\d+),(\d+)", line)[0])
        lines.append(Line(x1, y1, x2, y2))
            
    return lines

As usual, we read in the input file, and use splitlines() to split the data into a list, where each item in the list is one line of the input data. We then pass this list of str to our process_data() function. This function uses regular expression parsing to read each line of data.

In the regular express (\d+),(\d+) -> (\d+),(\d+), each (\d+) identifies one or more consecutive digits. Hence, the first pair of (\d+) are used to get one end (x1,y1) of any given line, and the second pair of (\d+) are used to the other end (x2,y2) of the line.

So, we retrieve these four ‘groups’, which get returned as strings. As we’ve done before, we then use map() to convert of these to an int. Finally, we use these four int values to create a new Line, for each row of the input data. We store these lines in a list, and return it.

This line just allows us to print the lines in a nice readable format, i.e. by adding a newline between line of output:

logger.debug("\n".join(str(vent) for vent in vents))

With the sample data, the output from the above line looks something like this:

2022-01-09 20:55:47.397:DEBUG:__main__:
Line(x1=0, y1=9, x2=5, y2=9)
Line(x1=8, y1=0, x2=0, y2=8)
Line(x1=9, y1=4, x2=3, y2=4)
Line(x1=2, y1=2, x2=2, y2=1)
Line(x1=7, y1=0, x2=7, y2=4)
...

The following code:

    # Get the bottom right coordinate of our x,y field
    max_x = max(vents, key=lambda line: line.max_x).max_x
    max_y = max(vents, key=lambda line: line.max_y).max_y

    field = np.zeros(shape=(max_y+1, max_x+1), dtype=np.int8)   # Initialise field(y, x)
    
    # Part 1: Count how many vents there are at each location, 
    # counting only horizontal and vertical vent lines
    for line in vents:
        if line.is_orthogonal:
            field[line.min_y:line.max_y+1, line.min_x:line.max_x+1] += 1
    
    logger.debug("\n%s", field)
    
    dangerous_vents = np.count_nonzero(field >= 2)
    logger.info("Part 1 dangerous vents: %d", dangerous_vents)

That’s part 1 done!

Part 2

Now we’re told we also need to consider diagonal lines.

# Part 2: Now add diagonal vent lines
for line in vents:
    if not line.is_orthogonal: # diagonal
        for i in range(line.max_y-line.min_y+1):    # length of the line (x len = y len)
            if line.diagonal_down:
                field[line.min_y+i, line.min_x+i] += 1
            else:   # diagonal up
                field[line.max_y-i, line.min_x+i] += 1

dangerous_vents = (field >= 2).sum()    # alternative to count_nonzero
logger.info("Part 2 dangerous vents: %d", dangerous_vents)    

If our line is not orthogonal, then we know it must be diagonal. Our class determines if the diagonal line is sloping down or sloping up by assessing if x1 - x2 == y1 - y2. With diagonal lines, the vertical distance will always be equal to the horizontal distance, i.e. this will always be true: abs(x1-x2) == abs(y1-y2)

But if the line is sloping down, then both sides will be of equal magnitude and both will be negative values. But if the line is sloping upwards, the left side will be negative, whilst the right will be positive.

Once we know this, we step through each coordinate in the line, incrementing the x by 1 each time, and either incrementing the y, if diagonal down, or decrementing y, if diagonal up. For each position position in the numpy array, we increment the vent counter.

And that’s part 2 done.

Output:

2022-01-09 21:15:05.769:INFO:__main__:  Part 1 dangerous vents: 6841
2022-01-09 21:15:06.175:INFO:__main__:  Part 2 dangerous vents: 19258
2022-01-09 21:15:06.176:INFO:__main__:  Execution time: 0.4127 seconds

Solution #2

I decided to write a solution that goes back to basics.

Setup

Similar to before:

from collections import Counter
import logging
import os
import time
import re
from dataclasses import dataclass
from typing import Iterator, NamedTuple

SCRIPT_DIR = os.path.dirname(__file__) 
INPUT_FILE = "input/input.txt"
# INPUT_FILE = "input/sample_input.txt"

logging.basicConfig(format="%(asctime)s.%(msecs)03d:%(levelname)s:%(name)s:\t%(message)s", 
                    datefmt='%Y-%m-%d %H:%M:%S')
logger = logging.getLogger(__name__)
logger.setLevel(level=logging.INFO)

In this case, we’re not using numpy, but we are including the Counter and NamedTuple.

Solving the Problem

First we create a NamedTuple called Point:

class Point(NamedTuple):
    x: int
    y: int

NamedTuple is quite similar to Dataclass. It creates named instances of a tuple, but has the advantage that we can refer to the tuple attributes by name, rather than simply by index.

some_point = (3,5) # a regular tuple
some_point[0] # x coord
some_point[1] # y coord

some_point = Point(3,5) # using our NamedTuple
some_point.x # x coord
some_point.y # y cord

The we create a Line dataclass as before, but with some minor tweaks:

@dataclass
class Line:
    """ A vertical, horizontal or diagonal line. 
    Able to yield all points between p1 and p2, inclusive. """
    p1: Point
    p2: Point
    
    @property
    def is_orthogonal(self) -> bool:
        """ I.e. whether horizontal or vertical line.
        If both x are the same, then horizontal.
        If both y are the same, then vertical. """
        return self.p1.x == self.p2.x or self.p1.y == self.p2.y

    @property
    def diagonal_down(self) -> bool:
        """ Determine if the diagonal line increases in both x and y axes.
        If it increases in one axis but decreases in the other, then it slopes up. """
        assert not self.is_orthogonal, "Must be diagonal"
        return self.p1.x - self.p2.x == self.p1.y - self.p2.y
    
    def points(self) -> Iterator[Point]:
        """ Yield every point from the start of the line (p1) to the end of the line (p2). """
        dx = 0 if self.p2.x == self.p1.x else 1 if self.p2.x > self.p1.x else -1
        dy = 0 if self.p2.y == self.p1.y else 1 if self.p2.y > self.p1.y else -1
        
        point = self.p1
        while point != self.p2:
            yield point
            point = Point(point.x + dx, point.y + dy)
            
        assert point == self.p2
        yield point # the end point

Firstly, since we have our NamedTuple for Point, we can just store two points for each line, rather than storing the x1, x2, y1 and y2 attributes. (It’s the same data… It’s just that using the Point attributes is easier to understand.)

The logic for checking if lines are orthogonal or diagonal down/up are basically the same as before, except using Point references.

The most significant change is that we now have a generator that knows how to yield every Point between the first point and the last point of the line. It does this by determining the delta in x and y for each successive point on the line, depending on whether the line is vertical, horizontal, or diagonal. E.g. if the line is vertical, then dx will always be 0 and dy will always be 1. It stores the latest point, and adds the dx and dy to this point, with every yielding of the next point.

Note that yield works a bit like a return statement, but the method retains its state between each yield execution.

We then read in the data exactly as before.

Part 1 is then solved using this code:

# Part 1: Count how many vents there are at each location
vents_counter = Counter()
for line in vents:
    if line.is_orthogonal:  # only include orthogonal lines
        for point in line.points():
            vents_counter[point] += 1

dangerous_vents = sum(1 for point, count in vents_counter.items() if count >= 2)
logger.info("Part 1 dangerous vents: %d", dangerous_vents)

The code works by:

For part 2, the code is identical, except we no longer need to check if the line is orthogonal.

# Part 2
vents_counter = Counter()    
for line in vents:
    for point in line.points():
        vents_counter[point] += 1

Onwards!