Dazbo's Advent of Code solutions, written in Python
Integrated Development Environments (IDEs) have long been the go-to choice for developers seeking a dedicated space to write, test, and debug their code. However, as data science and machine learning began gaining traction, there was an increasing need for a more interactive and data-centric environment. Enter Jupyter Notebooks: a web-based application that has redefined the way we interact with code and data, the way we document our code, and the way we share it.
Unlike traditional IDEs, Jupyter Notebooks allow for code, data, and multimedia to coexist in a shared space. With its cell-based structure, users can write and execute code in small chunks, making it easier to test ideas and see results in real-time. This attribute alone sets it apart from the linear approach of traditional coding environments, making Jupyter Notebooks a beloved tool among data scientists and analysts.
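The cell-based structure is visible in the file format itself: a `.ipynb` file is just a JSON document containing a list of markdown and code cells. Here's a minimal sketch of that structure (the cell contents are made up for illustration):

```python
import json

# A minimal sketch of the .ipynb format: a notebook is a JSON document
# containing a list of cells, each tagged as "markdown" or "code".
notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {"kernelspec": {"name": "python3", "display_name": "Python 3"}},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# My Notebook\n", "An introduction, written in markdown."],
        },
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,
            "outputs": [],
            "source": ["print('Hello from a code cell!')"],
        },
    ],
}

# Serialise it; the resulting file can be opened directly in Jupyter.
with open("minimal.ipynb", "w") as f:
    json.dump(notebook, f, indent=2)
```

Because each cell is an independent unit, Jupyter can execute, re-run, and render them individually, which is exactly what makes the incremental workflow possible.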
Here’s an example of a Notebook, with chapter structure, and with an introduction written in markdown:
And here’s an example, where we’re dynamically generating an image with code, and showing it after the cell:
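This is the kind of cell that produces such an image. A minimal sketch, assuming matplotlib is available (it's bundled with the scipy Docker stacks covered below); in a notebook the figure renders directly below the cell, but here we render to an in-memory PNG so the sketch is self-contained:

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this also runs outside a notebook
import matplotlib.pyplot as plt

# Build a simple plot; in a notebook, the last expression (or plt.show())
# displays the generated image beneath the cell.
fig, ax = plt.subplots()
ax.plot([1, 2, 3, 4], [1, 4, 9, 16], marker="o")
ax.set_title("A dynamically generated image")

# Render the figure to PNG bytes in memory.
buf = io.BytesIO()
fig.savefig(buf, format="png")
print(f"Generated a {len(buf.getvalue())} byte PNG")
```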
Whilst Jupyter Notebooks are great, they lack some of the features that developers have come to expect from an IDE like Visual Studio. This gap in capability has led to the next generation of the Jupyter notebook environment: JupyterLab. It builds on the Jupyter Notebook environment, but adds a bunch of extra capabilities, like:
For example, here’s one of my AoC Jupyter Notebooks:
Here’s the same notebook, opened in the Jupyter Lab environment:
There are a few ways to run a Jupyter Notebook. I’ll go through a few of them here.
The quickest and easiest way is to install the notebook package with pip:
py -m pip install notebook
Then you can launch the notebook like this:
jupyter-notebook
For a more sophisticated and complete experience, you can instead use Anaconda or Miniconda.
Anaconda is a fully-fledged data science environment. When you install Anaconda, you get:
- A Python distribution
- The conda package and environment manager
- Jupyter Notebook and JupyterLab
- A large collection of pre-installed data science packages
- Anaconda Navigator, a desktop GUI for managing it all
Anaconda is pretty big, at over 3GB. Alternatively, you can install Miniconda, which is a cut-down minimal version of Anaconda.
Anaconda is the de facto standard for data science. It is highly customisable and configurable.
Running in Docker is my favourite approach. You can download a pre-configured container image, such as the Jupyter Notebook Data Science Stack.
Advantages:
There are a bunch of so-called Jupyter stacks available as Docker images, and they’re all documented here.
For example:
| Stack | Includes (for example) | Approx Size |
|---|---|---|
| jupyter/base-notebook | Conda, mamba, notebook, jupyterlab | 1.0GB |
| jupyter/minimal-notebook | As with `base-notebook`, plus some command-line tools and utilities (like `curl`, `git`, `nano`) | 1.6GB |
| jupyter/scipy-notebook | As with `minimal-notebook`, plus a bunch of data science packages and tools (like `bokeh`, `matplotlib`, `pandas`, `scikit-image`, `scikit-learn`, `scipy`, and `seaborn`) | 4.1GB |
| jupyter/tensorflow-notebook | As with `scipy-notebook`, plus `tensorflow` | |
| jupyter/pyspark-notebook | As with `scipy-notebook`, plus libraries for working with Hadoop and Apache Spark | |
| jupyter/datascience-notebook | Combines everything from `scipy-notebook`, `r-notebook` and `julia-notebook` | 4.2GB |
Of course, to run a container, you do need to have Docker installed.
My favourite way to pull the image and run a container is using a docker compose file. For example, here is my docker-compose-scipy-lab.yml:
version: '3.9'
services:
  jupyter:
    environment:
      JUPYTER_ENABLE_LAB: yes
      CHOWN_HOME: yes # Next three env vars are needed to fix permission issues on WSL
      CHOWN_HOME_OPTS: '-R'
      JUPYTER_ALLOW_INSECURE_WRITES: true
    image: jupyter/scipy-notebook
    container_name: scipy-lab
    volumes:
      - .:/home/jovyan
    ports:
      - 8888:8888
To run the above file:
docker compose -f .\docker-compose-scipy-lab.yml up
And it looks like this:
You don’t even need to run Jupyter Notebooks locally! You can make use of a pre-configured cloud service. They are often free, unless you reach a point where you need more power, capacity or features.
A couple of options include:
- Google Colab
- Anaconda Notebooks in the Cloud
There are others, like Azure Notebooks, and Google Vertex AI Workbench. But these are paid-for offerings, so I’m not going to get into them here.
Note: you can always edit your notebooks locally, and then use a cloud-based Jupyter service for sharing your work with others, in a runnable format. For example, here’s how you might share notebooks with Google Colab:
https://colab.research.google.com/drive/some_unique_id
https://colab.research.google.com/github/profile/repo/blob/master/path/to/some_notebook.ipynb
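The second pattern above can be built mechanically from a GitHub location. Here's a quick sketch (the `profile`, `repo` and `path` values are placeholders, matching the pattern above):

```python
def colab_url(profile: str, repo: str, path: str, branch: str = "master") -> str:
    """Build a Google Colab link for a notebook hosted in a GitHub repo."""
    return (f"https://colab.research.google.com/github/"
            f"{profile}/{repo}/blob/{branch}/{path}")

# Placeholder values, for illustration:
print(colab_url("profile", "repo", "path/to/some_notebook.ipynb"))
# → https://colab.research.google.com/github/profile/repo/blob/master/path/to/some_notebook.ipynb
```

Anyone who opens such a link gets a live, runnable copy of the notebook, without needing a local Jupyter installation.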