Dazbo's Advent of Code solutions, written in Python # Advent of Code 2021 - Day 10

## Problem Intro

I enjoyed this one. There are a couple of ways of going about the problem, and I ended up writing three different solutions.

We’re told that our sub’s navigation data is corrupted. The navigation is our input data, and it looks like this:

``````[({(<(())[]>[[{[]{<()<>>
[(()[<>])]({[<{<<[]>>(
{([(<{}[<>[]}>{[]{[(<()>
(((({<>}<{<{<>}{[]{[]{}
[[<[([]))<([[{}[[()]]]
...
``````
• Each line contains one or more navigation chunks.
• Chunks open and close with matching brackets.
• Valid brackets are any of (), <>, {}, and [].
• Each chunk contains zero or more inner chunks.

We’re told some lines are incomplete, whilst some are corrupted.

• Corrupt lines will have a closing bracket in the wrong place. I.e. because a different closing bracket is expected. For example, `"{()()()>"` must be corrupt, because we have a `>` without a corresponding `<`.
• Incomplete lines do not have any closing brackets in the wrong place. But there are chunks still open, and thus missing appropriate closing brackets. For example, `"{<<[]>>("` must be incomplete, because the last `(` has not been closed, nor has the `{`.

## Solution #1

My first solution uses the very cool Parsimonious. This is a library that allows the parsing of grammars, i.e. a set of terms that follow a defined set of rules. Think of it like regex on steroids, since it has the ability to recognise specific language terms, but also recurse into those terms.

### Part 1

We’re ignoring incomplete lines. We’re told to find the first invalid chracter in each corrupted line, and determine its score. Each type of invalid character maps to one of four unique scores. Then we’re asked to add up these scores, to give the total syntax error score.

### Setup

Let’s import what we need:

``````from collections import defaultdict
import logging
import os
import time
from parsimonious import Grammar, ParseError
from parsimonious.exceptions import IncompleteParseError

SCRIPT_DIR = os.path.dirname(__file__)
INPUT_FILE = "input/input.txt"
# INPUT_FILE = "input/sample_input.txt"

logging.basicConfig(format="%(asctime)s.%(msecs)03d:%(levelname)s:%(name)s:\t%(message)s",
datefmt='%Y-%m-%d %H:%M:%S')
logger = logging.getLogger(__name__)
logger.setLevel(level=logging.INFO)
``````

### The solution

Let’s define our `Parsimonious Grammar`, i.e. the set of rules that determine whether our input is valid.

``````# Each row (expr) must contain one or more chunks (any)
# Each chunk (any) can be of type 'normal', 'angle', 'square', or 'curly'
# Each type is composed of a bracket pair, and 0 or more inner chunks (any)
grammar = Grammar(r"""
expr = any+
any = (normal / angle / square / curly)
normal = "(" any* ")"
angle = "<" any* ">"
square = "[" any* "]"
curly = "{" any* "}"
""")
``````

Explaining the rules:

• Start by defining `expr`, which is what we’ll validate each input line against. This requires 1 or more instances of `any`. The `+` operator means “one or more”, just like in regex.
• We define `any` as one item of `normal`, `angle`, `square`, or `curly`. The `/` means “or”.
• We define each of `normal`, `angle`, `square`, and `curly` to be a specific opening bracket, followed by zero or more `any`, followed by the matching closing bracket. As with regex, `*` means 0, 1 or more. Note that this rule allows our grammar to be recursive, i.e. because our ‘top’ any can contain terms which can themselves contain ‘any’.

Now we create a `list` of opening brackets, and a `list` of corresponding closing brackets. Then we zip these two lists together, to create two dicts:

• One which maps each opener to a matching closer.
• Another which maps each closer to a matching opener.

And finally, one more dictionary, which maps each closing bracket to the arbitrary invalid char scores we were told to use.

``````OPENERS = ["(", "[", "{", "<"]
CLOSERS = [")", "]", "}", ">"]
OPEN_TO_CLOSE = dict(zip(OPENERS, CLOSERS))  # {'(': ')', ...}
CLOSE_TO_OPEN = dict(zip(CLOSERS, OPENERS))  # {')': '(', ...}

INVALID_CHAR_SCORES = dict(zip(CLOSERS, (3, 57, 1197, 25137))) # {')': 3, ...}
``````

Now we run it:

``````input_file = os.path.join(SCRIPT_DIR, INPUT_FILE)
with open(input_file, mode="rt") as f:

# Part 1 - Looking for corrupted lines only
incomplete_lines = []
invalid_chars = []
for line_num, line in enumerate(data):
try:
grammar.parse(line)
except IncompleteParseError:    # valid, but incomplete
incomplete_lines.append(line)
except ParseError as err:
if err.pos == len(line):    # valid, but incomplete
incomplete_lines.append(line)
else:                       # corrupted
logger.debug("%d: %s", line_num, line)
logger.debug("Found %s", line[err.pos])
invalid_chars.append(line[err.pos])

logger.info("Part 1: There are %d corrupted lines", len(invalid_chars))
score = sum([INVALID_CHAR_SCORES[char] for char in invalid_chars])
logger.info("Syntax error score=%d\n", score)
``````

First, we read all the data and split into separate lines. We parse each line, using our grammar. We’re only interested in lines which fail to validate, and these are identified as follows:

• Incomplete lines - with some rules passed - throw `IncompleteParseError`. We’ll keep these for later!
• Incomplete lines - with no rules passed - throw `ParseError`. We can identify these because by the fact that we’ve got to the end of the line before the `ParseError` was thrown. We’ll also keep these for later.
• Corrupt lines also throw a `ParseError`, but won’t have reached the end of the line. This is all we need for Part 1. We store the character that was read when the ParseError was thrown, since this is the character we need. Finally, we map each of these characters to its corresponding score, and then add up the scores.

Easy!

### Part 2

Unsurprisingly, we’re now asked to work with the incomplete lines. Specifically, we’re told we need to complete the incomplete lines, by adding the necessary sequence of closing brackets. Work out the completion score.

We’ll do most of the hard work here:

``````def get_completion_for_line(line: str) -> str:
""" Determine which closing brackets need to be added to complete this incomplete line. """

to_complete = ""
close_counters = defaultdict(int)
for char in line[::-1]:     # process chars from the end
if char in CLOSERS:
close_counters[char] += 1
else:  # opener
matching_closer = OPEN_TO_CLOSE[char]
if close_counters[matching_closer] > 0:    # opener for existing closer
close_counters[matching_closer] -= 1
else:    # opener, but missing closer
to_complete += matching_closer

``````

The objective is to take each incomplete line, and then work out which brackets haven’t been closed.

This is what the function does:

• Create a `defaultdict(int)`. We’ve seen the `defaultdict` before in this AoC. This is a dict where each value is initialised to 0, meaning we can add to the value of a key, even if the key hasn’t been used before.
• Then we take our incomplete line, and reverse it, using the `[::-1]` syntax. (This means step from the end to the beginning, but in steps of -1.)
• For each char in the reversed line:
• Determine if the char is a closing bracket.
• If so, store a count of how many of this type of closing brackets we’ve come across. (The key is the bracket type.)
• If not, then we have an opening bracket. If the opening bracket matches a closing bracket in the dict:
• We’ve matched it, and we decrement the count for that closing bracket.
• If not, then we’ve found an opening bracket that needs a closing bracket. So, add the appropriate closing bracket to our `to_complete` variable.

Now we need to determine the score that results from the set of closing brackets we’ve identified.

``````def score_for_completion(completion_str: str) -> int:
""" Arbitrary rules for calculating a score for a str of completion chars """
score = 0

for char in completion_str:
score *= 5
score += COMPLETION_CHAR_SCORES[char]

return score
``````

Finally, let’s run all this, to solve Part 2.

``````# Part 2
logger.info("Part 2: There are %d remaining incomplete lines", len(incomplete_lines))
completion_scores = []
for line in incomplete_lines:
to_complete = get_completion_for_line(line)
completion_scores.append(score := score_for_completion(to_complete))
logger.debug("To complete: %s with score %d", to_complete, score)

completion_scores.sort()
logger.info("Completion score=%d", completion_scores[len(completion_scores)//2])
``````

We sort the scores, and then identify the median, as required.

The output looks like this:

``````2022-01-15 20:12:37.680:INFO:__main__:  Part 1: There are 49 corrupted lines
2022-01-15 20:12:37.681:INFO:__main__:  Syntax error score=339477

2022-01-15 20:12:37.682:INFO:__main__:  Part 2: There are 49 remaining incomplete lines
2022-01-15 20:12:37.686:INFO:__main__:  Completion score=3049320156
2022-01-15 20:12:37.687:INFO:__main__:  Execution time: 0.0664 seconds
``````

## Solution #2

Parsimonious is awesome, and we can give it any rules we want it to follow. But it does come with a bit of processing overhead. If we want to be super-quick, we can write our own parser. So that’s what I’ve done here. It also gives us a bit of practice at writing our own user-defined exceptions.

First, Part 1:

### Setup

These are the imports we need:

``````import logging
import os
import time
from collections import deque
from dataclasses import dataclass
``````

And we’ll still use these constants:

``````OPENERS = ["(", "[", "{", "<"]
CLOSERS = [")", "]", "}", ">"]
OPEN_TO_CLOSE = dict(zip(OPENERS, CLOSERS))  # {'(': ')', ...}

COMPLETION_CHAR_SCORES = dict(zip(CLOSERS, (1, 2, 3, 4)))
INVALID_CHAR_SCORES = dict(zip(CLOSERS, (3, 57, 1197, 25137)))
``````

### Solution

And we’ll create a user-defined `ParseException`, by subclassing `Exception`. Note that we also make it a `dataclass`, so we don’t need to create an `__init__()` method to initialise any variables.

``````@dataclass
class ParseException(Exception):
""" Data parsed and found to be invalid """
expected: str
actual: str
``````

Now, for each line, we parse using this function:

``````def parse(line: str):
""" Parse the navigation instructions, line-by-line.
If we read any right bracket that does not match an existing left bracket, raise ParseException.
"""
stack = deque()

for char in line:
if char in OPENERS:
stack.append(OPEN_TO_CLOSE[char])
continue    # Move on to the next char

assert char in CLOSERS, "Must be right bracket"
popped = stack.pop()    # Pop the required right bracket
if char == popped:
continue    # This was the right bracket we needed

raise ParseException(expected=popped, actual=char)  # Wrong right bracket - invalid line
``````

It works by:

• Maintaining a last-in, first-out stack (deque), where the last item is the corresponding right bracket needed for any left bracket just read.
• If we read a right bracket that isn’t at the top of the stack, then it means this closing bracket is not required and is invalid. So we raise a `ParseException`, containing the invalid chaacter we just read.

Solving Part 1 is now just this:

``````# Part 1 - Looking for corrupted lines only
invalid_chars = []
for line in data:
try:
parse(line)
except ParseException as pe:
invalid_chars.append(pe.actual)

logger.info("Part 1: There are %d corrupted lines", len(invalid_chars))
score = sum([INVALID_CHAR_SCORES[char] for char in invalid_chars])
logger.info("Syntax error score=%d\n", score)
``````

For Part 2, we just make some small tweaks.

First, we add a new user-defined exception:

``````@dataclass
class ParseIncompleteException(Exception):
""" Data parsed and found to be incomplete """
remaining: str
``````

Then we need to modify our `parse()` method:

``````def parse(line: str):
""" Parse the navigation instructions, line-by-line.
If we read any right bracket that does not match an existing left bracket, raise ParseException.
If we read all the data but still have closing brackets left on the stack, raise ParseIncompleteException.
"""
stack = deque()

for char in line:
if char in OPENERS:
stack.append(OPEN_TO_CLOSE[char])
continue    # Move on to the next char

assert char in CLOSERS, "Must be right bracket"
popped = stack.pop()    # Pop the required right bracket
if char == popped:
continue    # This was the right bracket we needed

raise ParseException(expected=popped, actual=char)  # Wrong right bracket - invalid line

if stack:   # There were more left brackets than right, so this line is incomplete
raise ParseIncompleteException(remaining="".join(reversed(stack)))
``````

The only change here is the check at the end. If there are any closing brackets still on the stack, then we haven’t yet found these required brackets. So, throw a `ParseIncompleteException`, and pass in the closing brackets remaining on the stack, in reverse order. These are the brackets that are needed to complete our line!

And now let’s update our main method accordingly:

``````invalid_chars = []
completion_scores = []   # We've added this
for line in data:
try:
parse(line)
except ParseException as pe:
invalid_chars.append(pe.actual)
except ParseIncompleteException as pie: # We've added this new exception type
completion_scores.append(score_for_completion(pie.remaining))

logger.info("Part 1: There are %d corrupted lines", len(invalid_chars))
score = sum([INVALID_CHAR_SCORES[char] for char in invalid_chars])
logger.info("Syntax error score=%d\n", score)

# Part 2
completion_scores.sort()
logger.info("Part 2: Completion score=%d", completion_scores[len(completion_scores)//2])
``````

And this is about 300x faster:

``````2022-01-15 20:35:23.126:INFO:__main__:  Part 1: There are 49 corrupted lines
2022-01-15 20:35:23.127:INFO:__main__:  Syntax error score=339477

2022-01-15 20:35:23.127:INFO:__main__:  Part 2: Completion score=3049320156
2022-01-15 20:35:23.128:INFO:__main__:  Execution time: 0.0021 seconds
``````

## Solution #3

This approach relies on stripping out complete bracket pairs from the input lines, until all that remains is either invalid characters, or brackets that need to be completed.

Slight tweaks to our constants:

``````OPENERS = ["(", "[", "{", "<"]
CLOSERS = [")", "]", "}", ">"]
PAIRS = ["".join(item) for item in zip(OPENERS, CLOSERS)] # ['()', '[]', ...]
OPEN_TO_CLOSE = dict(zip(OPENERS, CLOSERS))  # {'(': ')', ...}

INVALID_CHAR_SCORES = dict(zip(CLOSERS, (3, 57, 1197, 25137)))
COMPLETION_CHAR_SCORES = dict(zip(CLOSERS, (1, 2, 3, 4)))
``````

To trim out valid brackets:

``````def trim_brackets(line: str) -> str:
""" Iteratively strip out contiguous pairs of brackets from this line, until none remain """
stripped = True
while stripped:         # Continue until no more stripping
stripped = False
for bracket_pair in PAIRS: # E.g. "()"
if bracket_pair in line:
line = line.replace(bracket_pair, "")   # Strip out all occurences of this pair
stripped = True

return line
``````

This is what it does with an input line:

``````'[({(<(())[]>[[{[]{<()<>>'  # remove ()
'[({(<()[]>[[{[]{<<>>'  # remove []
'[({(<()>[[{{<<>>'  # remove <>
'[({(<()>[[{{<>'  # remove ()
'[({(<>[[{{<>'  # remove <>
'[({([[{{'  # no pairs left to remove - valid but incomplete
``````

Our `get_completion_for_line()` function is now really simple. We take the incomplete (but valid) lines, and simply add the corresponding closer for each opener, starting at the end and working back to the front.

``````def get_completion_for_line(line: str) -> str:
""" Determine which closing brackets need to be added to complete this incomplete line. """

to_complete = ""
for opener in line[::-1]:
to_complete += OPEN_TO_CLOSE[opener]

``````

So we solve both parts, thus:

``````# Part 1 - Looking for corrupted lines only
incomplete_lines = []
invalid_chars = []
trimmed_lines = [trim_brackets(line) for line in data]
for line in trimmed_lines:
# corrupt lines will have closing brackets, whilst incomplete lines will not
if any(char in CLOSERS for char in line):   # corrupt if any closer in the line
for char in line:
if char in CLOSERS:
invalid_chars.append(char)  # find first closer
break
else:
incomplete_lines.append(line)

logger.info("Part 1: There are %d corrupted lines", len(invalid_chars))
score = sum([INVALID_CHAR_SCORES[char] for char in invalid_chars])
logger.info("Syntax error score=%d\n", score)

# Part 2
logger.info("Part 2: There are %d remaining incomplete lines", len(incomplete_lines))
completion_scores = []
for line in incomplete_lines:
to_complete = get_completion_for_line(line)
completion_scores.append(score := score_for_completion(to_complete))
logger.debug("To complete: %s with score %d", to_complete, score)

completion_scores.sort()
logger.info("Completion score=%d", completion_scores[len(completion_scores)//2])
``````

Output:

``````21:04:20.963:INFO:__main__:     Part 1: There are 49 corrupted lines
21:04:20.965:INFO:__main__:     Syntax error score=339477

21:04:20.965:INFO:__main__:     Part 2: There are 49 remaining incomplete lines
21:04:20.968:INFO:__main__:     Completion score=3049320156
21:04:20.968:INFO:__main__:     Execution time: 0.0023 seconds
``````

Also pretty quick.