Dazbo's Advent of Code solutions, written in Python
Day 14: Extended Polymerization
A problem like this one can break your brain. Conceptually, it’s quite simple. But implementing can be quite trickly.
I’ve created two solutions to this problem:
We’re told we need to produce materials to reinforce the sub, and we can produce these materials from available nearby elements, using our own polymerization equipment.
Our input looks something like this:
NNCB
CH -> B
HH -> N
CB -> H
NH -> C
HB -> C
HC -> B
HN -> C
NN -> C
The first row is the template, and the remaining rows are the rules. Any given rule XY -> A
means: replace any adjacent elements XY
with XAY
, by inserting A
between them.
If we apply a couple of iterations to this starting template, the polymerisation proceeds like this:
E.g. Template:N N C B Step 1:N C N B C H B Step 2:N B C C N B B B C B H C B
It’s immediately apparent that the chain length grows exponentially. With each cycle, a chain of length n
grows to a length of 2n-1
. And we can generalise by saying that:
i
s
(i-1) * 2
i
+ 1
.Thus, with an initial template of 4 elements:
Step | Chain Length |
---|---|
1 | 7 |
2 | 13 |
3 | 25 |
4 | 49 |
We’re asked to determine the numeric value of subtracting the least common element from the most common element, after performing 10 cycles.
Since each adjacent pair generates a new pair with each iteration, a recursive solution seems the most obvious choice.
First, let’s get a feel for how big our chain will be after 10 iterations? Using the formula above, we know that our final chain will have length of 3073, with an initial chain length of 4. And my actual data has a starting chain length of 20. So, after 10 iterations, we would have a chain of 20481. That’s not very big at all, so building out the whole str
is perfectly doable.
A recursive function is a function that calls itself. Typically, a recursive function will:
To perform the polymerisation, we can write our recursive polymerisation function like this:
def recursive_replace(steps: int, to_expand: str, rules: dict) -> str:
""" Replace the given string with a new replacement string, according to the rules.
Args:
steps (int): How many iterations. Decremented with each recursion. Recursion ends when step=1.
input (str): Input str. The str to be replaced at this recursion depth
rules (dict): Map of XY: Z, such that XY becomes XZY """
res = to_expand # E.g. NNCB first iteration, NCN on second iteration, NBC on third...
chars_inserted = 0 # This grows as we insert horizontally, to allow indexing
for i in range(len(to_expand)-1): # sliding window of 2 chars; stop at len-1
pair = to_expand[i:i+2] # E.g. CB
if pair in rules: # if this pair has a valid replacement str
replacement = pair[0] + rules[pair] + pair[1] # E.g. CH -> CBH
insertion_point = i + chars_inserted
if steps > 1: # Deeper recursions to go
# Now recurse into the pair we've just expanded
replacement = recursive_replace(steps-1, replacement, rules)
res = res[:insertion_point] + replacement + res[insertion_point+2:]
# Because replacement is recursive, XY could be replaced by a long str
chars_inserted += len(replacement)-len(pair)
return res
How does it work?
str
, one pair of elements at a time. But remember that pairs can overlap, so this is a sliding window, two chars wide, that always moves one character at a time.We know we need to generate 10 cycles worth of expansion. So, we execute this as follows:
input_file = os.path.join(SCRIPT_DIR, INPUT_FILE)
with open(input_file, mode="rt") as f:
data = f.read()
template, rules = process_data(data)
# Part 1
steps = 10
res = recursive_replace(steps, template, rules)
char_counts = Counter(res).most_common() # [(char1, count1), (char2, count2)...]
logger.info("Part 1 with recursive replace: Most common count - least common count = %d",
char_counts[0][1] - char_counts[-1][1])
def process_data(data: str) -> tuple[str, dict[str, str]]:
""" Process one row of template, then an empty line, then rows of rules.
Returns:
tuple[str, dict[str, str]]: (template, rules-dict)
where rules-dict looks like {'CH':'B', 'HH':'N', ...}
"""
template, _, rules_lines = data.partition('\n\n')
rules = {}
for line in rules_lines.splitlines():
if len(line) > 0:
pair, element = line.split(" -> ")
rules[pair] = element
return template, rules
This works as follows:
tuple
.Counter
from the expanded chain, which is a dict
of {element: count}
in descending count order.Easy! It runs pretty darn quickly, too.
Yeah, I had a feeling this was coming. We’re told 10 cycles isn’t enough. We need to perform 40 cycles.
To check feasibility of our current solution, let’s see how big the final string would be…
20 * 2
40
+1 = 21990232555521
Yeah, those are big strings. If you try and do it the same way, you’re going to be waiting around for quite a while. So, we need a better solution.
Much like the lanternfish of day 6, we really don’t need to represent the actual polymerising chain. We only need to keep track of how many of each pair we have. That’s because a given pair will always polymerise into the same 3-element product. And that 3-element product is always made up of a left-hand pair, and an overlapping right-hand pair. I.e. given a rule XY -> A
:
Initial pair:X Y Resulting product:X A Y Which is two pairs:X A +A Y
So, we can write a new function that looks like this:
def count_by_pairs(template, rules, steps) -> Counter:
""" Expand the pairs according to the supplied rules, for the given number of steps.
Simply counts how many of each type of pair we have after each step.
Args:
input (str): The initial template, e.g. 'NNCB'
steps (int): How many iterations.
rules (dict): Dict of {'XY': 'Z', ...}, such that XY becomes XZY """
pairs_counter = Counter() # Initialise the counts
# Find each overlapping pair in the template
for i in range(len(template)-1): # E.g. NNCB
pair = template[i:i+2] # E.g. NN, NC, CB
pairs_counter[pair] += 1 # increment count for pair, e.g. {'NN': 1, 'NC': 1, ...}
# With each step, increment counts of all daughter pairs by the count of source pairs
for _ in range(steps):
counts_this_step = Counter()
for pair in pairs_counter: # E.g. NN, NC..
# Rule NN -> C converts NN -> NC + CN
# Increment count of new left and right pairs by the count of source pair
# E.g. count of NC and CN both increment by count of NN
counts_this_step[pair[0] + rules[pair]] += pairs_counter[pair] # NB
counts_this_step[rules[pair] + pair[1]] += pairs_counter[pair] # BC
pairs_counter = counts_this_step
# We've got counts of pairs, but we need counts of the individual elements
element_counts = Counter() # i.e. a counter for each char
for pair in pairs_counter: # e.g. NB
# Apart from the char at the end,
# each letter will be both the last char of a pair, and the first letter of another pair.
# We don't want to double-count, so we only need to count first chars
element_counts[pair[0]] += pairs_counter[pair] # N from NC, C from CN, etc
# The only exception is the last char from the original str,
# since it always remains after insertions
element_counts[template[-1]] += 1
return element_counts
This is how it works
Let’s take a look at our example polymerisation again:
E.g. Template:N N C B Step 1:N C N B C H B Step 2:N B C C N B B B C B H C B
All we have is pairs and counts. We don’t know what order the pairs appear in, in our final chain. However:
B
in the example above.)We run it like this, which is almost identical to what we did before:
# Part 2
steps = 40
element_counts = count_by_pairs(template, rules, steps).most_common()
logger.info("Part 2: Most common count - least common count = %d", element_counts[0][1] - element_counts[-1][1])
With my real data, the result is this:
16:50:15.073:INFO:__main__: Part 1 with recursive replace: Most common count - least common count = 5656
16:50:15.076:INFO:__main__: Part 1 using pair counting: Most common count - least common count = 5656
16:50:15.081:INFO:__main__: Part 2: Most common count - least common count = 12271437788530
16:50:15.082:INFO:__main__: Execution time: 0.0233 seconds
All done. Onwards!