Advent of Code 2015 - Day 6

Useful Links

Concepts and Packages Demonstrated

Nested Comprehensions Multi-sequence Comprehensions Regular Expressions (Regex)Regexr - Testing Regex Assert Tuple unpacking Map Sum NumPy

Problem Intro

We’re told we’re going to use one million Christmas lights, in a 1000x1000 grid!!

We’re given a set of instructions that describe which of the lights should be on, and which should be off. Instructions always correspond to a rectangle of lights, and the instructions are inclusive of each opposite corner of the rectangle.

Here’s some sample instructions:

toggle 1,6 through 2,6
turn off 1,7 through 2,9
turn off 6,1 through 6,3
turn off 8,2 through 11,6
turn on 19,2 through 21,3
turn on 17,4 through 26,8
turn on 10,10 through 19,15
turn off 4,14 through 6,16
toggle 5,15 through 15,25
toggle 20,1 through 29,10

I’ve solved this challenge using two different approaches:

Using a list of lists
Using numpy

List-of-Lists Solution

Part 1

There are three types of instructions:

turn on: all the lights in this rectangle should be turned on
turn off: all the lights in this rectangle should be turned off
toggle: all the lights in this rectangle should be toggled from their current state.

After following the instructions, how many lights are lit?

My approach here is to create a 1000x1000 grid like this:

A list of 1000 rows.
With each row containing a list of 1000 booleans.
All elements of the grid are initialised to False.

The initialisation can be done like this:

# Create a list of lists and initialise every light to a value of False
width = height = 1000
light_rows = []
for _ in range(height):
    light_row = []
    for _ in range(width):
        light_row.append(False)
    
    light_rows.append(light_row)

Sure, that works fine. But it’s not very Pythonic. A better way is to use a nested list comprehension, like this:

# Create a list of lists and initialise every light to a value of False
width = height = 1000
light_rows = [[False for light in range(width)] for row in range(height)]

Neat, right?

Now let’s create function to process the instructions:

def process_instructions(data, lights):
    p = re.compile(r"(\d+),(\d+) through (\d+),(\d+)")

    for line in data:
        match = p.search(line)
        assert match, "All instruction lines are expected to match"
        tl_x, tl_y, br_x, br_y = map(int, match.groups())

        for y in range(tl_y, br_y + 1):
            for x in range(tl_x, br_x + 1):
                if "toggle" in line:
                    lights[y][x] = not lights[y][x]
                elif "on" in line:
                    lights[y][x] = True
                elif "off" in line:
                    lights[y][x] = False

This function takes two parameters:

The instructions.
The lights grid.

The instructions are being processed using a regex pattern.

Consider the example line: turn off 1,7 through 2,9
The regex pattern contains four separate groups, and each group looks like: (\d+). This means: match any numeric digit, and there must be at least one digit in the number.

For each instruction line:

We match the regex pattern and return all four groups using the groups() method.
We then use the map() function, to apply the int() function against all four strings. This converts our four variables to integers. The variables are tl_x (top-left x), tl_y (top-left y), br_x (bottom-right x), and br_y (bottom-right y). I.e. two pairs of x,y coordinates: one pair for the top left corner, and one pair for the bottom right corner.
We then iterate through every grid position in the rectangle, with y being the current row, and x being the current position in the row.
- We check whether the current line contains the word toggle, off or on. These are all that is required to identify the instruction type.
  - If toggle, we use not lights[y][x] to invert the boolean variable of all the lights in the rectangle
  - If on, we set all lights[y][x] to True.
  - If off, we set all lights[y][x] to False.

Having processed every instruction, we now just need to count all the lights that are on.

I do it like this:

process_instructions(data, light_rows)
lights = [light_rows[y][x] for x in range(width) for y in range(height)]
assert len(lights) == width*height, "Verify number of lights"
print(f"Part 1, lights on: {sum(lights)}")

Again, we’re using a list comprehension. This is called a multi-sequence comprehension, because it creates a single list from two nested for loops. I.e. it loops through each x position in each row (y), generating a single list of boolean values. Thus, this comprehension returns a list of one million booleans.
To verify my logic, I’ve then added an assert statement.
- It is good practice to use assert to test for anything that you always expect to be true. It is a way to check that our code is doing what we think it should be doing.
- Here, my assert validates that the number of lights is indeed the product of width and height, i.e. that it is indeed 1000000.
- If the assertion were to fail, then the program would immediate exit with an AssertionError, and it would print my Verify number of lights message.
We then use the sum() function, and pass in our lights list. Recall that True has an integer value of 1, whilst False has an integer value of 0. This is perfect, because our goal is to count all the lights with a True value. Alternatively, we could have done: lights.count(True)

Part 2

Oh quelle surprise. The instructions don’t work quite how we originally thought. Instead:

Turn on means: increase brightness by one.
Turn off means: decrease brightness by one, to a minimum of 0.
Toggle means: increase the brightness by 2.

Er, of course.

The lights all start with a brightness value of 0.

What is the total brightness of all lights combined after following Santa’s instructions?

We don’t have to change much. First, we need to initialise all our lights. We could set them all to False like we did before. This will work, since False has a numeric value of 0 in Python. But to be more explicit, let’s initialise them all to 0:

light_rows = [[0 for light in range(width)] for row in range(height)]

Now we need a new function, which knows how to update brightness according to the new rules. (I don’t think these are thew new rules that Dua had in mind.)

def process_variable_brightness_instructions(data, lights):
    p = re.compile(r"(\d+),(\d+) through (\d+),(\d+)")

    for line in data:
        match = p.search(line)
        assert match, "All instruction lines are expected to match"
        tl_x, tl_y, br_x, br_y = map(int, match.groups())

        for y in range(tl_y, br_y + 1):
            for x in range(tl_x, br_x + 1):
                if "toggle" in line:
                    lights[y][x] += 2
                elif "on" in line:
                    lights[y][x] += 1
                elif "off" in line:
                    if lights[y][x] > 0:
                        lights[y][x] -= 1

Remember:

toggle adds 2.
on adds 1.
off subtracts 1, but only if we haven’t already reached our minimum value of 0.

Our final program looks like this:

from pathlib import Path
import time
import re

SCRIPT_DIR = Path(__file__).parent 
INPUT_FILE = Path(SCRIPT_DIR, "input/input.txt")
# INPUT_FILE = Path(SCRIPT_DIR, "input/sample_input.txt")

def main():
    with open(INPUT_FILE, mode="rt") as f:
        data = f.read().splitlines()

    width = height = 1000

    # Part 1
    # Create a list of lists
    light_rows = [[False for light in range(width)] for row in range(height)]
    process_instructions(data, light_rows)
    lights = [light_rows[y][x] for x in range(width) for y in range(height)]
    assert len(lights) == width*height
    print(f"Part 1, lights on: {sum(lights)}")

    # Part 2
    # Re-initialise our grid
    light_rows = [[0 for light in range(width)] for row in range(height)]
    process_variable_brightness_instructions(data, light_rows)
    lights = [light_rows[y][x] for x in range(width) for y in range(height)]
    print(f"Part 2, brightness: {sum(lights)}")

def process_variable_brightness_instructions(data, lights):
    p = re.compile(r"(\d+),(\d+) through (\d+),(\d+)")

    for line in data:
        match = p.search(line)
        assert match, "All instruction lines are expected to match"
        tl_x, tl_y, br_x, br_y = map(int, match.groups())

        for y in range(tl_y, br_y + 1):
            for x in range(tl_x, br_x + 1):
                if "toggle" in line:
                    lights[y][x] += 2
                elif "on" in line:
                    lights[y][x] += 1
                elif "off" in line:
                    if lights[y][x] > 0:
                        lights[y][x] -= 1

def process_instructions(data, lights):
    p = re.compile(r"(\d+),(\d+) through (\d+),(\d+)")

    for line in data:
        match = p.search(line)
        assert match, "All instruction lines are expected to match"
        tl_x, tl_y, br_x, br_y = map(int, match.groups())

        for y in range(tl_y, br_y + 1):
            for x in range(tl_x, br_x + 1):
                if "toggle" in line:
                    lights[y][x] = not lights[y][x]
                elif "on" in line:
                    lights[y][x] = True
                elif "off" in line:
                    lights[y][x] = False

if __name__ == "__main__":
    t1 = time.perf_counter()
    main()
    t2 = time.perf_counter()
    print(f"Execution time: {t2 - t1:0.4f} seconds")

And the output looks something like this:

Part 1, lights on: 543903
Part 2, brightness: 14687245
Execution time: 3.4142 seconds

It’s the right answer, and fairly quick. But we can do better…

Numpy Solution

Whenever a problem requires manipulation of grids of data, it’s worth considering whether NumPy might make our lives easier. For an introduction to NumPy, check out my guide.

Setup

from pathlib import Path
import time
import re
import numpy as np

SCRIPT_DIR = Path(__file__).parent 
INPUT_FILE = Path(SCRIPT_DIR, "input/input.txt")
# INPUT_FILE = Path(SCRIPT_DIR, "input/sample_input.txt")

INSTR_PATTERN = re.compile(r"(\d+),(\d+) through (\d+),(\d+)")

I’ve imported numpy, and named it np, as is standard convention.

Solution

with open(INPUT_FILE, mode="rt") as f:
    data = f.read().splitlines()

width = height = 1000

# Part 1
lights = np.full((width, height), False, dtype=np.bool8) # fill with False

Here’s what’s happening:

We read in the data, as before.
We crate a new NumPy ndarray using the full() method.
- Both width and height are set to 1000.
- We set the dtype to be np.bool8. I.e. the array expects every element to be a boolean.
- We set each element in a new array to False.

Now, our function to process the instructions:

# Part 1
def process_instructions(data, lights):
    for line in data:
        match = INSTR_PATTERN.search(line)
        assert match, "All instruction lines are expeted to match"
        tl_x, tl_y, br_x, br_y = map(int, match.groups())

        if "toggle" in line:
            lights[tl_x:br_x+1, tl_y:br_y+1] ^= True
        elif "on" in line:
            lights[tl_x:br_x+1, tl_y:br_y+1] = True
        elif "off" in line:
            lights[tl_x:br_x+1, tl_y:br_y+1] = False

The main difference with this version is that we no longer need iterate through each x and y location in the grid. Instead, we can just use numpy slicing.

We pass in a slice for the x range, and another slice for the y range. This allows us to select a rectangular region of the original 1000x1000 grid.
If the lights are being set to on, we set each element in this selection to True.
If the lights are being set to off, we set each element in this selection to False.
If the lights need to be toggled, we use ^= True to apply an XOR (exclusive OR) to each element in the selection. XOR works by comparing the two boolean values. If they are both the same, then it returns False. If they are different, it returns True. The practical result of ^= True is that it toggles the current boolean value. It’s a neat trick to remember!
Alternatively, we could have negated all the booleans in our selection like this:

    lights[tl_x:br_x+1, tl_y:br_y+1] = np.logical_not(lights[tl_x:br_x+1, tl_y:br_y+1])

For Part 2, first we need to re-initialise our array. Instead of setting each element to a boolean with a value of False, we initialise with integers of value zero.

lights = np.zeros((width, height), dtype=np.int8)   # Initialise with 0

Our function to process the variable brightness instructions looks like this:

def process_variable_brightness_instructions(data, lights):
    for line in data:
        match = INSTR_PATTERN.search(line)
        assert match, "All instruction lines are expeted to match"
        tl_x, tl_y, br_x, br_y = map(int, match.groups())

        if "toggle" in line:
            lights[tl_x:br_x+1, tl_y:br_y+1] += 2
        elif "on" in line:
            lights[tl_x:br_x+1, tl_y:br_y+1] += 1
        elif "off" in line:
            lights[tl_x:br_x+1, tl_y:br_y+1] -= 1

        lights[lights < 0] = 0

This function is super-easy to understand! The only line worth mentioning is the last one:

        lights[lights < 0] = 0

This line ensure that if any element in our array has a value less than 0, then that element is set to 0.

Our final code looks like this:

from pathlib import Path
import time
import re
import numpy as np

SCRIPT_DIR = Path(__file__).parent 
INPUT_FILE = Path(SCRIPT_DIR, "input/input.txt")
# INPUT_FILE = Path(SCRIPT_DIR, "input/sample_input.txt")

INSTR_PATTERN = re.compile(r"(\d+),(\d+) through (\d+),(\d+)")

def main():
    with open(INPUT_FILE, mode="rt") as f:
        data = f.read().splitlines()

    width = height = 1000

    # Part 1
    lights = np.full((width, height), False, dtype=np.bool8) # fill with False
    process_instructions(data, lights)
    print(f"Part 1, lights on: {lights.sum()}")

    # Part 2
    lights = np.zeros((width, height), dtype=np.int8)   # Initialise with 0
    process_variable_brightness_instructions(data, lights)
    print(f"Part 2, brightness: {lights.sum()}")

def process_instructions(data, lights):
    for line in data:
        match = INSTR_PATTERN.search(line)
        assert match, "All instruction lines are expeted to match"
        tl_x, tl_y, br_x, br_y = map(int, match.groups())

        if "toggle" in line:
            # lights[tl_x:br_x+1, tl_y:br_y+1] ^= True
            lights[tl_x:br_x+1, tl_y:br_y+1] = np.logical_not(lights[tl_x:br_x+1, tl_y:br_y+1])
        elif "on" in line:
            lights[tl_x:br_x+1, tl_y:br_y+1] = True
        elif "off" in line:
            lights[tl_x:br_x+1, tl_y:br_y+1] = False
            
def process_variable_brightness_instructions(data, lights):
    for line in data:
        match = INSTR_PATTERN.search(line)
        assert match, "All instruction lines are expeted to match"
        tl_x, tl_y, br_x, br_y = map(int, match.groups())

        if "toggle" in line:
            lights[tl_x:br_x+1, tl_y:br_y+1] += 2
        elif "on" in line:
            lights[tl_x:br_x+1, tl_y:br_y+1] += 1
        elif "off" in line:
            lights[tl_x:br_x+1, tl_y:br_y+1] -= 1

        lights[lights < 0] = 0

if __name__ == "__main__":
    t1 = time.perf_counter()
    main()
    t2 = time.perf_counter()
    print(f"Execution time: {t2 - t1:0.4f} seconds")

So using NumPy, the code is shorter and easier to read.

Let’s run it…

Part 1, lights on: 543903
Part 2, brightness: 14687245
Execution time: 0.2113 seconds

It’s fast! About 17 times faster than doing it with nested lists. Sweet.

NumPy FTW!!

‹›