Dazbo's Advent of Code solutions, written in Python
NumPyDataclassRegular Expressions (regex)classesdecorator
I’ve written two solutions for this problem.
We’re told there’s a field of hydrothermal vents on the ocean floor. They produce dangerous plumes that we need to avoid. The vents occur in lines. Our input data describes the lines of vents, with each line in the format x1,y1 -> x2,y2
. I.e. the (x,y) points that represent the two ends of each line. Here’s some sample input:
0,9 -> 5,9
8,0 -> 0,8
9,4 -> 3,4
2,2 -> 2,1
7,0 -> 7,4
We’re told the vent lines can overlap. Overlapping points are more dangerious. Only considering orthogonal lines (i.e. horizontal and vertical), at how many points do at least two lines overlap?
Here we’re going to use NumPy.
import logging
import os
import time
import re
from dataclasses import dataclass
import numpy as np
SCRIPT_DIR = os.path.dirname(__file__)
INPUT_FILE = "input/input.txt"
# INPUT_FILE = "input/sample_input.txt"
logging.basicConfig(format="%(asctime)s.%(msecs)03d:%(levelname)s:%(name)s:\t%(message)s",
datefmt='%Y-%m-%d %H:%M:%S')
logger = logging.getLogger(__name__)
logger.setLevel(level=logging.INFO)
The notable inclusions here are the imports for regular expressions (re), dataclasses, and Numpy.
First, we create our Line dataclass
:
@dataclass
class Line:
""" A vertical, horizontal or diagonal line. """
x1: int
y1: int
x2: int
y2: int
@property
def is_orthogonal(self) -> bool:
""" I.e. whether horizontal or vertical line.
If both x are the same, then horizontal.
If both y are the same, then vertical. """
return self.x1 == self.x2 or self.y1 == self.y2
@property
def diagonal_down(self) -> bool:
""" Determine if the diagonal line increases in both x and y axes.
If it increases in one axis but decreases in the other, then it slopes up. """
assert not self.is_orthogonal, "Must be diagonal"
return self.x1 - self.x2 == self.y1 - self.y2
@property
def min_x(self) -> int:
return min(self.x1, self.x2)
@property
def min_y(self) -> int:
return min(self.y1, self.y2)
@property
def max_x(self) -> int:
return max(self.x1, self.x2)
@property
def max_y(self) -> int:
return max(self.y1, self.y2)
Here we’re using a class
to represent Line
objects. Python supports object-oriented programming, and consequently allows us create objects
. An object
is a way to describe something that has properties (attributes
) and behviours (methods
). A class
is like a blueprint, allowing us to create instances
of a given type of object. In this case, we’ve created a Line
class, allowing us to create individual instances of Line
. I.e. each Line
is an instance.
We’re also using the dataclass decorator
, which tells Python that we want the Line
object to be treated as dataclass
. This reduces the amount of basic implementation we need to write in the class. E.g. we can simply define the x1
, y1
, x2
and y2
properties, but we don’t need to write any code to initialise these variables. Note that when we create a Line
, we should pass in the variables in the order we’ve defined in them in the dataclass
. That’s how dataclasses work! For example:
a_line = Line(x1, y1, x2, y2)
Here’s things that instances of this class knows how to do:
Now we can process the input data:
input_file = os.path.join(SCRIPT_DIR, INPUT_FILE)
with open(input_file, mode="rt") as f:
data = f.read().splitlines()
logger.debug("\n%s", "\n".join(str(vent) for vent in vents))
def process_data(data: list[str]) -> list[Line]:
""" Parse data of format x1,y1 -> x2,y2 """
lines = []
for line in data:
# get non-overlapping matching groups
x1, y1, x2, y2 = map(int, re.findall(r"(\d+),(\d+) -> (\d+),(\d+)", line)[0])
lines.append(Line(x1, y1, x2, y2))
return lines
As usual, we read in the input file, and use splitlines()
to split the data into a list
, where each item in the list is one line of the input data. We then pass this list
of str
to our process_data()
function. This function uses regular expression
parsing to read each line of data.
In the regular express (\d+),(\d+) -> (\d+),(\d+)
, each (\d+)
identifies one or more consecutive digits. Hence, the first pair of (\d+)
are used to get one end (x1,y1) of any given line, and the second pair of (\d+)
are used to the other end (x2,y2) of the line.
So, we retrieve these four ‘groups’, which get returned as strings. As we’ve done before, we then use map()
to convert of these to an int
. Finally, we use these four int
values to create a new Line
, for each row of the input data. We store these lines in a list
, and return it.
This line just allows us to print the lines in a nice readable format, i.e. by adding a newline between line of output:
logger.debug("\n".join(str(vent) for vent in vents))
With the sample data, the output from the above line looks something like this:
2022-01-09 20:55:47.397:DEBUG:__main__:
Line(x1=0, y1=9, x2=5, y2=9)
Line(x1=8, y1=0, x2=0, y2=8)
Line(x1=9, y1=4, x2=3, y2=4)
Line(x1=2, y1=2, x2=2, y2=1)
Line(x1=7, y1=0, x2=7, y2=4)
...
The following code:
max_x
and max_y
respectively. # Get the bottom right coordinate of our x,y field
max_x = max(vents, key=lambda line: line.max_x).max_x
max_y = max(vents, key=lambda line: line.max_y).max_y
field = np.zeros(shape=(max_y+1, max_x+1), dtype=np.int8) # Initialise field(y, x)
# Part 1: Count how many vents there are at each location,
# counting only horizontal and vertical vent lines
for line in vents:
if line.is_orthogonal:
field[line.min_y:line.max_y+1, line.min_x:line.max_x+1] += 1
logger.debug("\n%s", field)
dangerous_vents = np.count_nonzero(field >= 2)
logger.info("Part 1 dangerous vents: %d", dangerous_vents)
That’s part 1 done!
Now we’re told we also need to consider diagonal lines.
# Part 2: Now add diagonal vent lines
for line in vents:
if not line.is_orthogonal: # diagonal
for i in range(line.max_y-line.min_y+1): # length of the line (x len = y len)
if line.diagonal_down:
field[line.min_y+i, line.min_x+i] += 1
else: # diagonal up
field[line.max_y-i, line.min_x+i] += 1
dangerous_vents = (field >= 2).sum() # alternative to count_nonzero
logger.info("Part 2 dangerous vents: %d", dangerous_vents)
If our line is not orthogonal, then we know it must be diagonal. Our class determines if the diagonal line is sloping down or sloping up by assessing if x1 - x2 == y1 - y2
. With diagonal lines, the vertical distance will always be equal to the horizontal distance, i.e. this will always be true:
abs(x1-x2) == abs(y1-y2)
But if the line is sloping down, then both sides will be of equal magnitude and both will be negative values. But if the line is sloping upwards, the left side will be negative, whilst the right will be positive.
Once we know this, we step through each coordinate in the line, incrementing the x by 1 each time, and either incrementing the y, if diagonal down, or decrementing y, if diagonal up. For each position position in the numpy array, we increment the vent counter.
And that’s part 2 done.
Output:
2022-01-09 21:15:05.769:INFO:__main__: Part 1 dangerous vents: 6841
2022-01-09 21:15:06.175:INFO:__main__: Part 2 dangerous vents: 19258
2022-01-09 21:15:06.176:INFO:__main__: Execution time: 0.4127 seconds
I decided to write a solution that goes back to basics.
Similar to before:
from collections import Counter
import logging
import os
import time
import re
from dataclasses import dataclass
from typing import Iterator, NamedTuple
SCRIPT_DIR = os.path.dirname(__file__)
INPUT_FILE = "input/input.txt"
# INPUT_FILE = "input/sample_input.txt"
logging.basicConfig(format="%(asctime)s.%(msecs)03d:%(levelname)s:%(name)s:\t%(message)s",
datefmt='%Y-%m-%d %H:%M:%S')
logger = logging.getLogger(__name__)
logger.setLevel(level=logging.INFO)
In this case, we’re not using numpy, but we are including the Counter
and NamedTuple
.
First we create a NamedTuple
called Point
:
class Point(NamedTuple):
x: int
y: int
NamedTuple
is quite similar to Dataclass
. It creates named instances of a tuple
, but has the advantage that we can refer to the tuple attributes by name, rather than simply by index.
some_point = (3,5) # a regular tuple
some_point[0] # x coord
some_point[1] # y coord
some_point = Point(3,5) # using our NamedTuple
some_point.x # x coord
some_point.y # y cord
The we create a Line dataclass
as before, but with some minor tweaks:
@dataclass
class Line:
""" A vertical, horizontal or diagonal line.
Able to yield all points between p1 and p2, inclusive. """
p1: Point
p2: Point
@property
def is_orthogonal(self) -> bool:
""" I.e. whether horizontal or vertical line.
If both x are the same, then horizontal.
If both y are the same, then vertical. """
return self.p1.x == self.p2.x or self.p1.y == self.p2.y
@property
def diagonal_down(self) -> bool:
""" Determine if the diagonal line increases in both x and y axes.
If it increases in one axis but decreases in the other, then it slopes up. """
assert not self.is_orthogonal, "Must be diagonal"
return self.p1.x - self.p2.x == self.p1.y - self.p2.y
def points(self) -> Iterator[Point]:
""" Yield every point from the start of the line (p1) to the end of the line (p2). """
dx = 0 if self.p2.x == self.p1.x else 1 if self.p2.x > self.p1.x else -1
dy = 0 if self.p2.y == self.p1.y else 1 if self.p2.y > self.p1.y else -1
point = self.p1
while point != self.p2:
yield point
point = Point(point.x + dx, point.y + dy)
assert point == self.p2
yield point # the end point
Firstly, since we have our NamedTuple
for Point
, we can just store two points for each line, rather than storing the x1, x2, y1 and y2 attributes. (It’s the same data… It’s just that using the Point
attributes is easier to understand.)
The logic for checking if lines are orthogonal or diagonal down/up are basically the same as before, except using Point
references.
The most significant change is that we now have a generator
that knows how to yield
every Point
between the first point and the last point of the line. It does this by determining the delta in x and y for each successive point on the line, depending on whether the line is vertical, horizontal, or diagonal. E.g. if the line is vertical, then dx will always be 0 and dy will always be 1. It stores the latest point, and adds the dx and dy to this point, with every yielding of the next point.
Note that yield
works a bit like a return
statement, but the method retains its state between each yield
execution.
We then read in the data exactly as before.
Part 1 is then solved using this code:
# Part 1: Count how many vents there are at each location
vents_counter = Counter()
for line in vents:
if line.is_orthogonal: # only include orthogonal lines
for point in line.points():
vents_counter[point] += 1
dangerous_vents = sum(1 for point, count in vents_counter.items() if count >= 2)
logger.info("Part 1 dangerous vents: %d", dangerous_vents)
The code works by:
Counter
.Line
. If the line is orthononal, we yield each point on the line, store the point in the counter, and increment the vent count at each point.For part 2, the code is identical, except we no longer need to check if the line is orthogonal.
# Part 2
vents_counter = Counter()
for line in vents:
for point in line.points():
vents_counter[point] += 1
Onwards!