Advent of Code 2021 - Day 3

Day 3: Binary Diagnostic

Useful Links

‹›

Concepts and Packages Demonstrated

zip splat base conversion (int)Counter

Binarytransposingtuplesjoindefault args

Problem Intro

So the warm-ups are over, and Day 3 presents the first non-trivial challenges.

Today’s problem requires us to decode a list of fixed-length binary numbers in order to obtain diagnostic information about the sub. The input data looks like this…

Setup

The only new class we’re going to use for this solution is the awesome Counter class, from the collections module.

import logging
import os
import time
from collections import Counter

SCRIPT_DIR = os.path.dirname(__file__) 
INPUT_FILE = "input/input.txt"
# INPUT_FILE = "input/sample_input.txt"

logging.basicConfig(format="%(asctime)s.%(msecs)03d:%(levelname)s:%(name)s:\t%(message)s", 
                    datefmt='%H:%M:%S')
logger = logging.getLogger(__name__)
logger.setLevel(level=logging.DEBUG)

Part 1

Part 1 is pretty simple. We’re told we need to determine the gamma rate and the epsilon rate, where:

Each bit in the gamma rate can be determined by finding the most common bit in the corresponding position of all the numbers in the input.
Each bit in the epsilon rate can be determined by finding the least common bit in the corresponding position of all the numbers in the input.

In short… If we take all the input numbers as rows, then we need to find the most common and least common bit in each column. But how to count the columns? This where the zip() function is really handy. It zips two or more iterables together, into a single series of tuples.

E.g. we can zip abc and 123 together:

res = list(zip("abc", "123"))
print(res)

And the output is a single list that looks like this:

[('a', '1'), ('b', '2'), ('c', '3')]

So, if we were to zip these three binary numbers:

res = list(zip("001", "101", "110"))

We would end up with:

[('0', '1', '1'), ('0', '0', '1'), ('1', '1', '0')]

We’ve transposed our rows of binary numbers into columns of numbers, which is exactly what we want!

Thus, our code starts like this:

    input_file = os.path.join(SCRIPT_DIR, INPUT_FILE)
    with open(input_file, mode="rt") as f:
        data = f.read().splitlines()
    
    # Part 1
    transposed = list(zip(*data))   # transpose to get list of columns

The *data is worth a mention. Here * is known as the splat operator, and it unpacks (splats) any given series, and uses each series element as an argument to a function. In combination with zip(), this allows us to take each element in our list of binary numbers, and pass each one as a separate argument to zip(). Because each element is a str, each element is also a series of characters.

All that remains is to count each column, and take the most / least common bit from each column. We do this by creating a counter for each column. Then we can use the most_common() method to return an ordered list of (item, count) pairs, with the highest frequency item at the front of the list, and the least frequent item at the back.

The code below uses [0] and [-1] to get the first and last tuples from the ordered list, respectively. Then it indexes again with [0], since we only want the item from the tuple, i.e. the binary digit itself. (We don’t need the count.) This is all wrapped in a list comprehension, such that we return a bit for each column in our transposed data.

Finally, we use "".join() to assemble a new single str from all the returned bits. I.e. our new binary number.

most_common_bits = "".join([Counter(current_col).most_common()[0][0] for current_col in transposed])
logger.debug("Gamma rate: %s", most_common_bits)
least_common_bits = "".join([Counter(current_col).most_common()[-1][0] for current_col in transposed])
logger.debug("Epsilon rate: %s", least_common_bits)

logger.info("Part 1: Product = %d\n", int(most_common_bits, 2) * int(least_common_bits, 2))

Finally, we multiply these two rates to obtain the power consumption rate, as per the instructions. We need to use int(some_number, 2) in order to convert the binary numbers to decimal numbers, before we do the multiplication.

Simples!

Part 2

This part is slightly less simple. We’re told we need the life support rating, given by the product of the oxygen generator rating, and the CO2 scrubber rating. Both of these are found by filtering out numbers (our rows) that don’t meet the criteria, until only one number remains. The criteria are:

For oxygen generator rating, determine the most common value in each position, and only keep numbers that have that bit in that position. If the counts are equal, keep 1.
For CO2, determine the most common value in each position, and only keep numbers that have that bit in that position. If the counts are equal, keep 0.

To solve this, I’ve opted to iterate through each column position, and maintain a count for each position. We could have used Counter(), but I’ve opted to do it manually. I’ve created a function to do this:

def filter_diag_vals(diag_values: list[str], least_common=False):
    """ Filter the supplied data, by iteratively keeping only elements that match the rules.
    We determine the most/least common bit in each position, from left to right.
    We keep only diag values that contain that bit.
    If equal counts of 1s and 0s, keep 1 if we're after most common, or 0 if we're after least.

    Args:
        diag_values ([str]): list of binary diagnostic values
        least_common (bool, optional): Whether we want least common. Defaults to False.

    Returns:
        [str]: The single remaining diagnostic value.
    """
    diag_value_len = len(diag_values[0])
    
    # Process each digit
    for diag_posn in range(diag_value_len):
        count_of_1 = count_of_0 = 0
        
        # loop through each value to find most common / least common digit
        for diag_val in diag_values:
            if diag_val[diag_posn] == "1":
                count_of_1 += 1
            else:
                count_of_0 += 1
        
        keep = "1"  # default for most common, i.e. if equal count of 1s and 0s
        if count_of_0 > count_of_1:
            keep = "0"
            
        if least_common:    # invert if we want least common
            keep = "1" if keep == "0" else "0"
        
        # eliminate the diag values that don't match the criteria
        diag_values = [diag_val for diag_val in diag_values if diag_val[diag_posn] == keep]
        if len(diag_values) == 1:
            break
            
    return diag_values[0] 

Two things worth mentioning here. First: the list comprehension that actually does the filtering at the end. This comprehension checks if the specified position has the bit we want to keep, and only returns the numbers that match. Finally, when we’re only left with one number, we exit the loop and return the number.

Secondly: note how I’m setting the function argument least_common to a default of False. This makes the argument optional, so the function only ‘inverts’ if we set this parameter to True when we call the function.

# Part 2
filter_with_most_common = int(filter_diag_vals(data), 2)
logger.debug("Oxygen sensor rating = %s", filter_with_most_common)

filter_with_least_common = int(filter_diag_vals(data, least_common=True), 2)
logger.debug("CO2 scrubber rating = %s", filter_with_least_common)

logger.info("Part 2: Product of O2 and CO2 ratings = %d\n", filter_with_most_common * filter_with_least_common)

The output looks like:

16:51.857:DEBUG:__main__:    Gamma rate: 101001001011
16:51.857:DEBUG:__main__:    Epsilon rate: 010110110100
16:51.857:INFO:__main__:     Part 1: Product = 3847100

16:51.858:DEBUG:__main__:    Oxygen sensor rating = 2735
16:51.858:DEBUG:__main__:    CO2 scrubber rating = 1501
16:51.858:INFO:__main__:     Part 2: Product of O2 and CO2 ratings = 4105235

16:51.858:INFO:__main__:     Execution time: 0.0091 seconds

(Note how we can tell which lines were created with logger.info(), and which were created with logger.debug().)

All done.

‹›