Learning Python with Advent of Code Walkthroughs

Dazbo's Advent of Code solutions, written in Python

The Python Journey - Priority Queues and Heaps

Heapq

Useful Links

Graphs and Graph TheoryheapqShortest Path Algorithms

Page Contents

Overview

Whilst the Python list is great for storing an arbitrary number of elements, and for being able to retrieve elements quickly by position, the list does not perform well when working with huge volumes of ordered data. For example, imagine we had a list with millions of objects in a random order, and we need to continually retrieve the object that is largest. In this situation, Python needs to continually scan the entire list to find the one that is largest. This is not an efficient approach!

And that’s where a priority queue comes in. A priority queue is an abstract dta structure designed to allow fast retrieval of the largest or smallest value, regardless of the number of items being stored.

When To Use a Priority Queue?

Typical use cases for a priority queue are:

Abstract Data Structure?

What does abstract data structure even mean? It means that however we implement this data structure, it needs to adhere to a few rules. A implementation for a priority queue should include these operations:

Heaps

Heaps are commonly used to implement priority queues in a manner where performance is guaranteed in relation to the size of the overall dta structure. This is possible because heaps implement a priority queue in the form of a binary tree.

Binary Tree

Recall from graphs that a tree is an acyclic graph - i.e. one in which no loops are created - where vertices are connected by exactly one path.

A heap has additional characteristics, called heap property:

The performance of pushing/popping a heap is guaranteed to be proportional to the base-2 log of the size of the queue.

Heapq

In Python, we have a ready-made heap available in the form of the heapq.

It is a complete binary tree implemented _on top of a list. For this reason, it is easy to convert an existing list or iterable to a heapq.

Recall that the heap property requires that a parent has a smaller value (higher priority) than both of its children. This is implemented in the heapq as shown below.

Heapq

The arrows indicate the children of any given parent, and shows where the children are always located, in the heapq implementation.

The heapq pops based on the smallest value having the highest priority. So, if we (say) store state objects in the queue, you could implement the __lt__() method of the state object such that a < b if a is closer to the goal.

Heapq Methods

Heapq Elements and Priority

Recall that the heapq always pops the item with the smallest value. Thus, highest priority = smallest value. Bear this in mind when adding objects to a heapq.

An alternative way to add things to a queue with our desired priority is to add them as the second element of a tuple. This is because if we add tuples to a heapq, the first item in the tuple is used for the priority.

E.g.

heapq.heappush(frontier, (priority, item))

We’ll look at some real heapq usage when we move on to look at Shortest Path Algorithms.

Examples