Learning Python with Advent of Code Walkthroughs

Dazbo's Advent of Code solutions, written in Python

binary diagnostics

Advent of Code 2021 - Day 3

Day 3: Binary Diagnostic

Useful Links

Concepts and Packages Demonstrated

zipsplatbase conversion (int)Counter

Binarytransposingtuplesjoindefault args

Problem Intro

So the warm-ups are over, and Day 3 presents the first non-trivial challenges.

Today’s problem requires us to decode a list of fixed-length binary numbers in order to obtain diagnostic information about the sub. The input data looks like this…

00100
11110
10110
10111
10101

Setup

The only new class we’re going to use for this solution is the awesome Counter class, from the collections module.

import logging
import os
import time
from collections import Counter

SCRIPT_DIR = os.path.dirname(__file__) 
INPUT_FILE = "input/input.txt"
# INPUT_FILE = "input/sample_input.txt"

logging.basicConfig(format="%(asctime)s.%(msecs)03d:%(levelname)s:%(name)s:\t%(message)s", 
                    datefmt='%H:%M:%S')
logger = logging.getLogger(__name__)
logger.setLevel(level=logging.DEBUG)

Part 1

Part 1 is pretty simple. We’re told we need to determine the gamma rate and the epsilon rate, where:

In short… If we take all the input numbers as rows, then we need to find the most common and least common bit in each column. But how to count the columns? This where the zip() function is really handy. It zips two or more iterables together, into a single series of tuples.

E.g. we can zip abc and 123 together:

res = list(zip("abc", "123"))
print(res)

And the output is a single list that looks like this:

[('a', '1'), ('b', '2'), ('c', '3')]

So, if we were to zip these three binary numbers:

res = list(zip("001", "101", "110"))

We would end up with:

[('0', '1', '1'), ('0', '0', '1'), ('1', '1', '0')]

We’ve transposed our rows of binary numbers into columns of numbers, which is exactly what we want!

Thus, our code starts like this:

    input_file = os.path.join(SCRIPT_DIR, INPUT_FILE)
    with open(input_file, mode="rt") as f:
        data = f.read().splitlines()
    
    # Part 1
    transposed = list(zip(*data))   # transpose to get list of columns

The *data is worth a mention. Here * is known as the splat operator, and it unpacks (splats) any given series, and uses each series element as an argument to a function. In combination with zip(), this allows us to take each element in our list of binary numbers, and pass each one as a separate argument to zip(). Because each element is a str, each element is also a series of characters.

All that remains is to count each column, and take the most / least common bit from each column. We do this by creating a counter for each column. Then we can use the most_common() method to return an ordered list of (item, count) pairs, with the highest frequency item at the front of the list, and the least frequent item at the back.

The code below uses [0] and [-1] to get the first and last tuples from the ordered list, respectively. Then it indexes again with [0], since we only want the item from the tuple, i.e. the binary digit itself. (We don’t need the count.) This is all wrapped in a list comprehension, such that we return a bit for each column in our transposed data.

Finally, we use "".join() to assemble a new single str from all the returned bits. I.e. our new binary number.

most_common_bits = "".join([Counter(current_col).most_common()[0][0] for current_col in transposed])
logger.debug("Gamma rate: %s", most_common_bits)
least_common_bits = "".join([Counter(current_col).most_common()[-1][0] for current_col in transposed])
logger.debug("Epsilon rate: %s", least_common_bits)

logger.info("Part 1: Product = %d\n", int(most_common_bits, 2) * int(least_common_bits, 2))

Finally, we multiply these two rates to obtain the power consumption rate, as per the instructions. We need to use int(some_number, 2) in order to convert the binary numbers to decimal numbers, before we do the multiplication.

Simples!

Part 2

This part is slightly less simple. We’re told we need the life support rating, given by the product of the oxygen generator rating, and the CO2 scrubber rating. Both of these are found by filtering out numbers (our rows) that don’t meet the criteria, until only one number remains. The criteria are:

To solve this, I’ve opted to iterate through each column position, and maintain a count for each position. We could have used Counter(), but I’ve opted to do it manually. I’ve created a function to do this:

def filter_diag_vals(diag_values: list[str], least_common=False):
    """ Filter the supplied data, by iteratively keeping only elements that match the rules.
    We determine the most/least common bit in each position, from left to right.
    We keep only diag values that contain that bit.
    If equal counts of 1s and 0s, keep 1 if we're after most common, or 0 if we're after least.

    Args:
        diag_values ([str]): list of binary diagnostic values
        least_common (bool, optional): Whether we want least common. Defaults to False.

    Returns:
        [str]: The single remaining diagnostic value.
    """
    diag_value_len = len(diag_values[0])
    
    # Process each digit
    for diag_posn in range(diag_value_len):
        count_of_1 = count_of_0 = 0
        
        # loop through each value to find most common / least common digit
        for diag_val in diag_values:
            if diag_val[diag_posn] == "1":
                count_of_1 += 1
            else:
                count_of_0 += 1
        
        keep = "1"  # default for most common, i.e. if equal count of 1s and 0s
        if count_of_0 > count_of_1:
            keep = "0"
            
        if least_common:    # invert if we want least common
            keep = "1" if keep == "0" else "0"
        
        # eliminate the diag values that don't match the criteria
        diag_values = [diag_val for diag_val in diag_values if diag_val[diag_posn] == keep]
        if len(diag_values) == 1:
            break
            
    return diag_values[0] 

Two things worth mentioning here. First: the list comprehension that actually does the filtering at the end. This comprehension checks if the specified position has the bit we want to keep, and only returns the numbers that match. Finally, when we’re only left with one number, we exit the loop and return the number.

Secondly: note how I’m setting the function argument least_common to a default of False. This makes the argument optional, so the function only ‘inverts’ if we set this parameter to True when we call the function.

# Part 2
filter_with_most_common = int(filter_diag_vals(data), 2)
logger.debug("Oxygen sensor rating = %s", filter_with_most_common)

filter_with_least_common = int(filter_diag_vals(data, least_common=True), 2)
logger.debug("CO2 scrubber rating = %s", filter_with_least_common)

logger.info("Part 2: Product of O2 and CO2 ratings = %d\n", filter_with_most_common * filter_with_least_common)

The output looks like:

09:16:51.857:DEBUG:__main__:    Gamma rate: 101001001011
09:16:51.857:DEBUG:__main__:    Epsilon rate: 010110110100
09:16:51.857:INFO:__main__:     Part 1: Product = 3847100

09:16:51.858:DEBUG:__main__:    Oxygen sensor rating = 2735
09:16:51.858:DEBUG:__main__:    CO2 scrubber rating = 1501
09:16:51.858:INFO:__main__:     Part 2: Product of O2 and CO2 ratings = 4105235

09:16:51.858:INFO:__main__:     Execution time: 0.0091 seconds

(Note how we can tell which lines were created with logger.info(), and which were created with logger.debug().)

All done.