This is Part 2 of a three-part series. In Part 1: The Strategic Value of Thinking in Notebooks, we discussed why and when to use Jupyter. Here, we dive into the technical implementation. Part 3: Real-World Code Examples covers practical use cases.


The Modern Jupyter Stack

For a software engineer, the “standard” way of installing Jupyter (global pip install) is often the wrong way. It leads to dependency hell and “it works on my machine” syndrome.

Here is how to set it up like a pro.


1. Installation & Environment Management

uv is a fast Python package and project manager. It creates an isolated environment per project and records dependencies in pyproject.toml, which makes Jupyter setups easy to reproduce.

# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh

# Create a new project and add jupyter
uv init my-notebooks
cd my-notebooks
uv add jupyterlab ipywidgets pandas matplotlib

The Traditional Virtualenv Way

If you prefer standard tools:

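# Create and activate an isolated environment in the project folder
python -m venv .venv
source .venv/bin/activate   # On Windows: .venv\Scripts\activate
# Install JupyterLab into the environment, not globally
pip install jupyterlab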
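
Whichever route you take, it helps to confirm that a notebook is really running inside the project environment rather than a global Python. A minimal sketch, assuming a venv-style environment (which is what both uv and python -m venv create); the helper name in_virtualenv is illustrative, not part of any library:

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a virtual environment, sys.prefix points at the environment,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    print("Inside a virtual environment:", in_virtualenv())
```

Running the two print lines in the first cell of a notebook shows which interpreter the kernel actually uses.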

2. Choosing Your Interface

JupyterLab (The Browser Experience)

JupyterLab is Project Jupyter’s next-generation web-based user interface. It offers tabbed documents, a file browser, and a built-in terminal.

  • Run it: jupyter lab from the activated environment, or uv run jupyter lab inside a uv project
  • Best for: Deep data exploration and when you want a dedicated workspace.

VS Code (The Engineer’s Choice)

Most software engineers should use the VS Code Jupyter Extension.

  • Why: You get your familiar keybindings, themes, and Copilot integration directly inside the notebook.
  • Setup: Install the “Jupyter” extension from the Marketplace. Open any .ipynb file, and VS Code will prompt you to select a kernel (point it to your .venv).

3. Managing Kernels

A kernel is the process that executes your notebook’s code. You can register different kernels for different projects (e.g., one for Python 3.10, one for R, one for a specific project with heavy dependencies).

To make your virtual environment available as a kernel:

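# Run inside the activated virtual environment
pip install ipykernel
# Register the environment as a named kernel for the current user
python -m ipykernel install --user --name=my-project-kernel --display-name "Python (My Project)"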
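
Under the hood, a kernel registration is a small directory containing a kernel.json file that tells Jupyter which command starts the kernel. The sketch below builds the core fields of such a file to show roughly what the install command records. It is an illustration, not ipykernel's own code: the function names are hypothetical, the real file may carry extra metadata, and a temporary directory stands in for the real kernels directory.

```python
import json
import sys
import tempfile
from pathlib import Path


def build_kernel_spec(display_name: str, python_executable: str = sys.executable) -> dict:
    """Build the core fields of a kernel.json for an ipykernel-based kernel."""
    return {
        # Command Jupyter runs to start the kernel. Jupyter replaces
        # {connection_file} with the path to the connection details.
        "argv": [python_executable, "-m", "ipykernel_launcher", "-f", "{connection_file}"],
        "display_name": display_name,
        "language": "python",
    }


def write_kernel_spec(kernels_dir: Path, name: str, display_name: str) -> Path:
    """Write <kernels_dir>/<name>/kernel.json and return its path."""
    spec_dir = kernels_dir / name
    spec_dir.mkdir(parents=True, exist_ok=True)
    spec_path = spec_dir / "kernel.json"
    spec_path.write_text(json.dumps(build_kernel_spec(display_name), indent=2))
    return spec_path


if __name__ == "__main__":
    # A temporary directory stands in for the real kernels directory.
    with tempfile.TemporaryDirectory() as tmp:
        path = write_kernel_spec(Path(tmp), "my-project-kernel", "Python (My Project)")
        print(path.read_text())
```

The key point is the first entry of argv: it is the Python interpreter of the environment you registered, which is why the kernel sees that environment's packages.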

4. Version Control: The “Notebook Problem”

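Standard .ipynb files are JSON blobs containing code, metadata, and outputs (like base64-encoded images or rendered dataframes). Re-running a cell changes the stored outputs and execution counts even when the code is untouched, which makes Git diffs noisy and hard to review.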

Solution: Jupytext

Jupytext allows you to pair your notebooks with plain .py files.

  • You edit the .ipynb in the UI.
  • Once the notebook is paired, Jupytext keeps a .py version in sync every time you save.
  • You commit the .py file to Git.
  • Result: Clean, readable code reviews.
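
To see why the paired .py file reviews so much better, here is a simplified sketch of the idea behind a "percent" style script: code cells become blocks introduced by # %%, markdown cells become commented blocks, and outputs are left out. This is an illustration using only the standard library, not Jupytext's implementation; the function name notebook_to_percent is hypothetical, and real Jupytext files also carry a metadata header.

```python
def notebook_to_percent(notebook: dict) -> str:
    """Render a notebook dict as percent-style script text (simplified)."""
    chunks = []
    for cell in notebook.get("cells", []):
        source = cell.get("source", "")
        # The notebook format stores source as one string or a list of lines.
        text = "".join(source) if isinstance(source, list) else source
        if cell.get("cell_type") == "code":
            # Outputs are deliberately ignored: only the code is kept.
            chunks.append("# %%\n" + text)
        elif cell.get("cell_type") == "markdown":
            commented = "\n".join(("# " + line).rstrip() for line in text.splitlines())
            chunks.append("# %% [markdown]\n" + commented)
    return "\n\n".join(chunks) + "\n"


if __name__ == "__main__":
    notebook = {
        "cells": [
            {"cell_type": "markdown", "source": ["# Exploration\n", "Quick sanity check."]},
            {
                "cell_type": "code",
                "source": "print(1 + 1)",
                "outputs": [{"output_type": "stream", "name": "stdout", "text": "2\n"}],
            },
        ]
    }
    print(notebook_to_percent(notebook))
```

The resulting text contains only headings, prose, and code, so a Git diff shows exactly what a reviewer cares about.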

Solution: nbstripout

Use nbstripout as a Git filter to automatically strip cell outputs from notebooks before they are committed.

pip install nbstripout
# Run inside the Git repository to register the filter for it
nbstripout --install
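
Conceptually, stripping a notebook means clearing each code cell's outputs and execution count while leaving the source untouched. The sketch below shows that idea on a notebook loaded as a plain dict. It is a simplified illustration, not nbstripout's actual code, and the function name strip_outputs is hypothetical.

```python
import copy
import json


def strip_outputs(notebook: dict) -> dict:
    """Return a copy of the notebook with code-cell outputs cleared."""
    cleaned = copy.deepcopy(notebook)  # leave the caller's notebook untouched
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


if __name__ == "__main__":
    notebook = {
        "cells": [
            {
                "cell_type": "code",
                "execution_count": 7,
                "source": "print('hello')",
                "outputs": [{"output_type": "stream", "name": "stdout", "text": "hello\n"}],
            }
        ],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 5,
    }
    print(json.dumps(strip_outputs(notebook), indent=2))
```

Because the stripped notebook no longer changes when a cell is merely re-run, commits only record edits to the code and markdown.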

5. Storage & Remote Execution

  • Local: Keep your notebooks in a dedicated notebooks/ folder in your repo.
  • Cloud (Google Colab / Kaggle): Great for quick tests or when you need a free GPU.
  • Self-Hosted (JupyterHub): If your team needs a shared environment with access to internal databases.

6. Project Structure & Hierarchy

As your research grows, a single folder full of untitled1.ipynb files becomes a nightmare. A professional Jupyter project should follow a predictable hierarchy.

The “Research-First” Structure

my-project/
├── data/               # Never commit raw data to Git
│   ├── raw/            # Immutable original data
│   └── processed/      # Cleaned data ready for analysis
├── notebooks/          # The "Thinking" space
│   ├── 01-exploration.ipynb
│   ├── 02-data-cleaning.ipynb
│   └── 03-modeling.ipynb
├── src/                # The "Execution" space
│   ├── __init__.py
│   └── utils.py        # Move stable code here from notebooks
├── models/             # Saved weights or serialized objects
├── pyproject.toml      # Dependency management (uv/pip)
└── README.md
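
If you set up this layout often, a few lines of Python can create it for you. The following is a minimal sketch under the assumption that the tree above is the target; the names PROJECT_DIRS and scaffold_project are illustrative, and pyproject.toml is left to your package manager (for example uv init).

```python
import tempfile
from pathlib import Path

# Directories from the tree above, relative to the project root.
PROJECT_DIRS = ["data/raw", "data/processed", "notebooks", "src", "models"]


def scaffold_project(root: Path) -> None:
    """Create the research-first folder layout under `root` (safe to re-run)."""
    for relative in PROJECT_DIRS:
        (root / relative).mkdir(parents=True, exist_ok=True)
    # Make src importable as a package and leave a placeholder README.
    (root / "src" / "__init__.py").touch()
    (root / "README.md").touch()


if __name__ == "__main__":
    # Demonstrate in a temporary directory so nothing real is modified.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "my-project"
        scaffold_project(root)
        for path in sorted(root.rglob("*")):
            print(path.relative_to(root))
```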

Best Practices

  • Number your notebooks: Prefixing filenames with 01-, 02- ensures they appear in the order of the workflow.
  • The “Notebook-to-Script” Pipeline: Once a function in a notebook becomes stable and reused across multiple notebooks, move it to src/utils.py. This keeps notebooks clean and makes the code testable.
  • Data Isolation: Always keep data/raw read-only. Any transformations should be saved into data/processed.
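
One practical wrinkle with these practices: notebooks live in notebooks/, so importing from src/ and locating data/ both depend on finding the project root first. A minimal sketch, assuming pyproject.toml marks the root as in the tree above; the helper names find_project_root and data_paths are hypothetical:

```python
import sys
import tempfile
from pathlib import Path


def find_project_root(start: Path, marker: str = "pyproject.toml") -> Path:
    """Walk upward from `start` until a directory containing `marker` is found."""
    start = start.resolve()
    for candidate in [start, *start.parents]:
        if (candidate / marker).exists():
            return candidate
    raise FileNotFoundError(f"No {marker} found at or above {start}")


def data_paths(root: Path) -> tuple[Path, Path]:
    """Return (raw, processed): read from the first, write only to the second."""
    return root / "data" / "raw", root / "data" / "processed"


if __name__ == "__main__":
    # Demonstrate with a throwaway project so the example runs anywhere.
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp).resolve()
        (project / "pyproject.toml").touch()
        (project / "notebooks").mkdir()
        root = find_project_root(project / "notebooks")
        raw_dir, processed_dir = data_paths(root)
        print("Raw data:", raw_dir.relative_to(root))
        print("Processed data:", processed_dir.relative_to(root))
```

In a real notebook you would call find_project_root(Path.cwd()), then sys.path.insert(0, str(root)) so that imports such as from src.utils import ... resolve regardless of where the notebook sits. Installing the project into the environment is a cleaner alternative once the code in src/ stabilizes.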

Conclusion

Setting up Jupyter correctly is the difference between a messy experiment and a professional research tool. By using modern package managers like uv, integrating with VS Code, and handling version control with Jupytext, you turn Jupyter into a first-class citizen of your development workflow.

Remember: Jupyter isn’t where you write your app; it’s where you understand the problems your app is trying to solve.

Further Reading & Resources