Dazbo's Advent of Code solutions, written in Python
Day 7: Internet Protocol Version 7
regexOverlapping MatchesString Manipulation
Today’s puzzle involves validating IPv7 addresses based on specific patterns. An IPv7 address consists of a sequence of characters, some of which are enclosed in square brackets. The parts outside the brackets are called “supernet sequences”, and the parts inside the brackets are called “hypernet sequences”.
The core of the problem revolves around identifying a specific pattern called an “ABBA” sequence. An ABBA is a four-character sequence where the first two characters are different, and the sequence is followed by its reverse. For example, abba
, bddb
, or xyyx
are ABBAs, but aaaa
or abca
are not.
Here’s an example of what the input data looks like:
abba[mnop]qrst
abcd[bddb]xyyx
aaaa[qwer]tyui
ioxxoj[asdfgh]zxcvbn
How many IPs support TLS?
An IP address supports Transport-Layer Snooping (TLS) if it meets two conditions:
My approach to solving Part 1 involves:
Here’s how I extract the supernet and hypernet sequences using the regex
module:
# Only match text outside the square bracekts - supernet sequences
non_brackets_pattern = regex.compile(r"(\w+)(?:\[\w*\])*")
# Only match the text within the square brackets - hypernet sequences
brackets_pattern = regex.compile(r"\[(\w+)\]+")
# ... inside main() loop
supernets = non_brackets_pattern.findall(line)
hypernets = brackets_pattern.findall(line)
The non_brackets_pattern
captures sequences of word characters that are not enclosed in brackets, while brackets_pattern
captures the word characters within brackets.
Next, I implemented a helper function contains_abba
to check for the ABBA pattern:
def contains_abba(potential: str, abba_size: int) -> bool:
""" Check whether this string contains an ABBA sequence.
E.g. abba is valid. deed is valid. xyyx is valid.
aaaa is not. aabb is not. abcd is not.
Args:
potential (str): Any str
abba_size (int): The number of chars in a valid ABBA sequence
Returns:
bool: Whether valid ABBA.
"""
for i in range(len(potential) - abba_size + 1):
block = potential[i:i+4]
if block[0] != block[1] and block[0:2] == (block[3:1:-1]):
return True
return False
This function iterates through all possible 4-character blocks in the given string. It checks if the first two characters are different and if the first pair is the reverse of the second pair.
The main logic for Part 1 then becomes:
valid_ips = []
for line in data:
supernets = non_brackets_pattern.findall(line)
hypernets = brackets_pattern.findall(line)
# first check we don't have an ABBA in a hypernet seq, i.e. within [...]
try:
for hypernet in hypernets:
# if we find a match, this line is NOT a valid IP. Move on to next line.
if contains_abba(hypernet, 4):
raise StopIteration() # Custom exception to break out of nested loops
except StopIteration:
continue # Move to the next IP line
try:
for supernet in supernets:
# if we find a match, this line is a valid IP. Add and move on.
if contains_abba(supernet, 4):
valid_ips.append(line)
raise StopIteration() # Custom exception to break out of nested loops
except StopIteration:
continue # Move to the next IP line
logging.info(f"Part 1: Found {len(valid_ips)} valid TLS IPs.")
I use a StopIteration
exception to efficiently break out of nested loops once a condition (either invalid hypernet or valid supernet) is met for an IP address.
How many IPs support SSL?
An IP address supports Super-Secret Listening (SSL) if it has an “Area-Broadcast Accessor” (ABA) anywhere in its supernet sequences and a corresponding “Byte Allocation Block” (BAB) anywhere in its hypernet sequences.
An ABA is a three-character sequence like xyx
where x
and y
are different. A corresponding BAB would be yxy
. For example, if a supernet has aba
, a hypernet must have bab
.
A crucial aspect of Part 2 is finding overlapping matches for ABA sequences. The standard re
module in Python does not support this directly, so I used the more powerful regex
module (installed via pip install regex
).
Here’s the pattern for finding ABA sequences:
aba_pattern = regex.compile(r"((.).\2)")
This regex captures a character, then any character, then the first captured character again (e.g., (a).(\1)
). The overlapped=True
flag in findall
is essential for finding all possible ABAs, even if they overlap (e.g., ababa
contains two ABAs: aba
and bab
).
The logic for Part 2 is as follows:
valid_ips = []
for line in data:
supernets = non_brackets_pattern.findall(line)
hypernets = brackets_pattern.findall(line)
try:
babs = []
for supernet in supernets:
# look for ABAs in supernets
aba_matches = aba_pattern.findall(supernet, overlapped=True)
for aba_match in aba_matches:
match_str = str(aba_match[0])
# ensure middle char is different to avoid 'aaa'
if match_str[0] != match_str[1]:
bab = match_str[1] + match_str[0] + match_str[1] # Construct BAB
babs.append(bab)
# Then check if any of those BABs are present in any of the hypernets
for bab in babs:
for hypernet in hypernets:
if bab in hypernet:
valid_ips.append(line)
raise StopIteration() # Custom exception to break out of nested loops
except StopIteration:
continue # Move to the next IP line
logging.info(f"Part 2: Found {len(valid_ips)} valid SSL IPs.")
For each supernet, I find all ABA matches. For each valid ABA (where the first and second characters are different), I construct its corresponding BAB. Then, I check if any of these generated BABs exist within any of the hypernet sequences for that IP address. If a match is found, the IP supports SSL.
This solution effectively leverages the regex
library’s advanced features to handle the overlapping match requirement, making the parsing and validation straightforward.