Dazbo's Advent of Code solutions, written in Python
Day 9: Explosives in Cyberspace
This puzzle is all about data compression. We’re given a compressed file format that uses markers to indicate repeated sequences of characters. A marker is of the form (AxB)
, where A
is the number of characters to repeat, and B
is the number of times to repeat them.
For example:
ADVENT
contains no markers and decompresses to itself.A(1x5)BC
decompresses to ABBBBBC
.(3x3)XYZ
decompresses to XYZXYZXYZ
.A(2x2)BCD(2x2)EFG
decompresses to ABCBCDEFEFG
.(6x1)(1x3)A
decompresses to (1x3)A
.X(8x2)(3x3)ABCY
decompresses to X(3x3)ABC(3x3)ABCY
.What is the decompressed length of the file (your puzzle input)?
For Part 1, the rules are simple: markers within a repeated section are not expanded. This makes the problem relatively straightforward. We can iterate through the string, and when we encounter a marker, we expand the subsequent section and append it to our result.
My initial approach was to build the entire decompressed string and then find its length. This works perfectly for Part 1.
import logging
import os
import time
import re
SCRIPT_DIR = os.path.dirname(__file__)
INPUT_FILE = "input/input.txt"
expand_pattern = re.compile(r"\((\d+)x(\d+)\)")
def main():
logging.basicConfig(level=logging.DEBUG, format="%(asctime)s:%(levelname)s:\t%(message)s")
input_file = os.path.join(SCRIPT_DIR, INPUT_FILE)
with open(input_file, mode="rt") as f:
src_str = f.read()
# Part 1
char_posn = 0
expanded_str = ""
while char_posn < len(src_str):
match = expand_pattern.search(src_str, char_posn)
if match:
match_start = match.start()
match_end = match.end()
extent, repeat = [int(x) for x in match.groups()]
expanded_str += src_str[char_posn:match_start]
for _ in range(repeat):
expanded_str += src_str[match_end:match_end+extent]
char_posn = match_end+extent
else:
expanded_str += src_str[char_posn:]
char_posn = len(src_str)
logging.info(f"Part 1: Expanded str length = {len(expanded_str)}")
if __name__ == "__main__":
t1 = time.perf_counter()
main()
t2 = time.perf_counter()
print(f"Execution time: {t2 - t1:0.4f} seconds")
This approach is simple to understand but has a major drawback: it can consume a lot of memory and time if the decompressed string is very large. For Part 1, it’s fine, but for Part 2, this will be a problem.
What is the decompressed length of the file using this improved format?
In Part 2, the rules change: markers within repeated sections are also expanded. This means we need a recursive approach. Building the full string is no longer feasible, as the length can become astronomically large.
Instead of building the string, we can write a recursive function that calculates the length of the decompressed string.
This solution is much more efficient and elegant.
import logging
import os
import time
import re
SCRIPT_DIR = os.path.dirname(__file__)
INPUT_FILE = "input/input.txt"
expand_pattern = re.compile(r"\((\d+)x(\d+)\)")
def main():
logging.basicConfig(level=logging.DEBUG, format="%(asctime)s:%(levelname)s:\t%(message)s")
input_file = os.path.join(SCRIPT_DIR, INPUT_FILE)
with open(input_file, mode="rt") as f:
src_str = f.read()
# Part 1 - Using the recursive function for consistency
result = decompressed_len(src_str=src_str)
logging.info(f"Part 1: Expanded str length = {result}")
# Part 2 - Recurses into each segment
result = recursive_len(src_str=src_str)
logging.info(f"Part 2: Expanded str length = {result}")
def recursive_len(src_str: str) -> int:
return decompressed_len(src_str=src_str, len_func=recursive_len)
def decompressed_len(src_str: str, len_func=len) -> int:
if not src_str:
return 0
match = expand_pattern.match(src_str)
if match:
extent, repeat = map(int, match.groups())
start = match.end()
end = match.end() + extent
return (repeat * len_func(src_str[start:end])
+ decompressed_len(src_str[end:], len_func))
return 1 + decompressed_len(src_str[1:], len_func)
if __name__ == "__main__":
t1 = time.perf_counter()
main()
t2 = time.perf_counter()
print(f"Execution time: {t2 - t1:0.4f} seconds")
The beauty of this solution lies in the decompressed_len
function. It takes an optional len_func
argument, which defaults to the standard len()
function.
decompressed_len
without a len_func
. When a marker is found, len_func
(which is len()
) is called on the section to be repeated. This returns the length of the unexpanded section, which is then multiplied by the repeat count. The function then recursively calls itself on the rest of the string.decompressed_len
with recursive_len
as the len_func
. recursive_len
is a helper function that simply calls decompressed_len
again, but this time on the substring that needs to be repeated. This creates a beautiful recursive pattern where the function dives deeper and deeper into nested markers, calculating the length at each level and bubbling it back up.This recursive approach is incredibly powerful and efficient, allowing us to calculate the massive length in Part 2 almost instantly.
Part 1: Expanded str length = 102239
Part 2: Expanded str length = 10774309173
Execution time: 0.0025 seconds