This is Part 2 of a three-part series. In Part 1: The Strategic Value of Thinking in Notebooks, we discussed why and when to use Jupyter. Here, we dive into the technical implementation. Part 3: Real-World Code Examples covers practical use cases.


The Modern Jupyter Stack

For a software engineer, the “standard” way of installing Jupyter (a global pip install) is often the wrong way. Every project then shares one set of packages, which leads to version conflicts (dependency hell) and “it works on my machine” syndrome.

Here is a professional setup guide.


1. Installation & Environment Management

uv is a fast Python package and project manager. It creates the virtual environment and records your dependencies in pyproject.toml for you, which makes managing Jupyter environments trivial.

# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh

# Create a new project and add jupyter
uv init my-notebooks
cd my-notebooks
uv add jupyterlab ipywidgets pandas matplotlib

The Traditional Virtualenv Way

If you prefer standard tools:

python -m venv .venv
source .venv/bin/activate
pip install jupyterlab

2. Choosing Your Interface

JupyterLab (The Browser Experience)

JupyterLab is Project Jupyter’s next-generation web-based user interface. It provides tabbed documents, a file browser, and terminal access.

  • Run it: jupyter lab (in a uv-managed project, uv run jupyter lab)
  • Best for: Deep data exploration and when you want a dedicated workspace.

VS Code (The Engineer’s Choice)

Most software engineers should use the VS Code Jupyter Extension.

  • Why: You get your familiar keybindings, themes, and Copilot integration directly inside the notebook.
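  • Setup: Install the “Jupyter” extension from the Marketplace. Open any .ipynb file, and VS Code will prompt you to select a kernel (point it to your .venv). The selected environment needs the ipykernel package; VS Code offers to install it if it is missing.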

3. Managing Kernels

A kernel is the process that runs your code. You can have different kernels for different projects (e.g., one for Python 3.10, one for R, one for a specific project with heavy dependencies).

To make your virtual environment available as a kernel, run the following with that environment activated:

pip install ipykernel
python -m ipykernel install --user --name=my-project-kernel --display-name "Python (My Project)"
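When several kernels are registered, it is easy to pick the wrong one. A quick check, sketched below with only the standard library, is to run a cell that reports which interpreter the kernel is actually using. The function name describe_interpreter is just an illustrative choice.

```python
import sys


def describe_interpreter() -> dict:
    """Return basic facts about the interpreter running this code."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        # Inside a venv, sys.prefix differs from sys.base_prefix.
        "in_virtualenv": sys.prefix != sys.base_prefix,
        "version": f"{sys.version_info.major}.{sys.version_info.minor}",
    }


info = describe_interpreter()
for key, value in info.items():
    print(f"{key}: {value}")
```

If the executable path does not point inside your project’s .venv, the notebook is attached to a different kernel than the one you intended.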

4. Version Control: The “Notebook Problem”

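Standard .ipynb files are JSON documents containing code, metadata, and outputs (such as base64-encoded images or rendered dataframes). Outputs and execution counts change every time a cell is re-run, so Git diffs become unreadable even when the code itself has not changed.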
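To make the problem concrete, here is a rough sketch of what a one-cell notebook looks like on disk. The notebook dict is hand-written to resemble the nbformat 4 layout, and the fake_png string is a stand-in for a real base64-encoded plot, so treat the exact numbers as illustrative only.

```python
import json

# Stand-in for a base64-encoded plot image (real ones are often far larger).
fake_png = "iVBORw0KGgo" + "A" * 2000

# A hand-written, minimal notebook with one executed code cell.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {
            "cell_type": "code",
            "id": "cell-1",
            "execution_count": 7,
            "metadata": {},
            "source": ["df.plot()"],
            "outputs": [
                {
                    "output_type": "display_data",
                    "metadata": {},
                    "data": {"image/png": fake_png},
                }
            ],
        }
    ],
}

# Compare the size of the serialized file with the size of the actual code.
total_bytes = len(json.dumps(notebook))
source_bytes = sum(len("".join(cell["source"])) for cell in notebook["cells"])
print(f"file size: {total_bytes} bytes, actual code: {source_bytes} bytes")
```

Nine bytes of code end up buried in a couple of kilobytes of JSON, and almost all of that changes whenever the cell is re-run.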

Solution: Jupytext

Jupytext lets you pair a notebook with a plain .py file.

  • You edit the .ipynb in the UI.
  • Once the notebook is paired, Jupytext updates the .py version on every save.
  • You commit the .py file to Git.
  • Result: Clean, readable code reviews.
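For a sense of what ends up in Git, the paired file is ordinary Python with comment markers between cells. The sketch below imitates Jupytext’s “percent” format; the cell contents are made up for illustration, and a real paired file usually also starts with a commented metadata header that Jupytext generates. Pairing is typically set up with something like jupytext --set-formats ipynb,py:percent notebook.ipynb.

```python
# %% [markdown]
# # Exploration
# Load some values and take a first look.

# %%
values = [3, 1, 4, 1, 5, 9, 2, 6]

# %%
mean = sum(values) / len(values)
print(f"mean = {mean}")
```

Because the file is valid Python, it diffs line by line in a code review and can be run directly as a script.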

Solution: nbstripout

Use nbstripout as a Git filter to automatically strip cell outputs before committing. Run the install command inside the repository you want to filter.

pip install nbstripout
nbstripout --install
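Conceptually, the filter clears each code cell’s outputs and execution count so that only the source reaches Git. The following is a simplified sketch of that idea, not nbstripout’s actual implementation; strip_outputs and make_run are hypothetical helpers written for this example.

```python
import copy


def strip_outputs(notebook: dict) -> dict:
    """Return a copy of the notebook with outputs and execution counts cleared."""
    cleaned = copy.deepcopy(notebook)
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


def make_run(execution_count: int, printed: str) -> dict:
    """Build a hypothetical one-cell notebook as it looks after one run."""
    return {
        "nbformat": 4,
        "nbformat_minor": 5,
        "metadata": {},
        "cells": [
            {
                "cell_type": "code",
                "id": "cell-1",
                "execution_count": execution_count,
                "metadata": {},
                "source": ["print(total)"],
                "outputs": [
                    {"output_type": "stream", "name": "stdout", "text": [printed]}
                ],
            }
        ],
    }


# Same code, run at two different times with different results.
first_run = make_run(1, "42\n")
second_run = make_run(12, "43\n")

same_before = first_run == second_run
same_after = strip_outputs(first_run) == strip_outputs(second_run)
print(f"identical before stripping: {same_before}")
print(f"identical after stripping:  {same_after}")
```

The two runs differ on disk even though the code is the same; after stripping they are identical, so Git sees no change.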

5. Storage & Remote Execution

  • Local: Keep your notebooks in a dedicated notebooks/ folder in your repo.
  • Cloud (Google Colab / Kaggle): Great for quick tests or when you need a free GPU.
  • Self-Hosted (JupyterHub): A good fit if your team needs a shared environment with access to internal databases.

6. Project Structure & Hierarchy

As your research grows, a single folder full of Untitled1.ipynb files becomes a nightmare. A professional Jupyter project should follow a predictable hierarchy.

The “Research-First” Structure

my-project/
├── data/               # Never commit raw data to Git
│   ├── raw/            # Immutable original data
│   └── processed/      # Cleaned data ready for analysis
├── notebooks/          # The "Thinking" space
│   ├── 01-exploration.ipynb
│   ├── 02-data-cleaning.ipynb
│   └── 03-modeling.ipynb
├── src/                # The "Execution" space
│   ├── __init__.py
│   └── utils.py        # Move stable code here from notebooks
├── models/             # Saved weights or serialized objects
├── pyproject.toml      # Dependency management (uv/pip)
└── README.md
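If you set up this layout often, a small script can create it for you. The sketch below is one possible way to do it with pathlib. It leaves pyproject.toml to your package manager (for example uv init), and the .gitignore entry for data/ is an assumption added here to back up the “never commit raw data” rule, not part of the tree above.

```python
import tempfile
from pathlib import Path

DIRECTORIES = ["data/raw", "data/processed", "notebooks", "src", "models"]
FILES = ["src/__init__.py", "src/utils.py", "README.md"]


def scaffold(root: Path) -> None:
    """Create the research-first folder layout under `root`."""
    for directory in DIRECTORIES:
        (root / directory).mkdir(parents=True, exist_ok=True)
    for file in FILES:
        (root / file).touch(exist_ok=True)
    # Assumption: keep the data folder out of Git via .gitignore.
    gitignore = root / ".gitignore"
    if not gitignore.exists():
        gitignore.write_text("data/\n")


# Demo in a throwaway directory so nothing is written to your real project.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp) / "my-project"
    scaffold(project)
    created = sorted(p.relative_to(project).as_posix() for p in project.rglob("*"))
    for entry in created:
        print(entry)
```

In practice you would call scaffold(Path("my-project")) once and then add the dependencies with your package manager.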

Best Practices

  • Number your notebooks: Prefixing filenames with 01-, 02- ensures they appear in the order of the workflow.
  • The “Notebook-to-Script” Pipeline: Once a function in a notebook becomes stable and reused across multiple notebooks, move it to src/utils.py. This keeps notebooks clean and makes the code testable.
  • Data Isolation: Always keep data/raw read-only. Any transformations should be saved into data/processed.
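One practical wrinkle with the notebook-to-script pipeline: a notebook inside notebooks/ typically runs with that folder as its working directory, so importing from src/ may fail until the project root is on the import path. A minimal sketch of one workaround follows, assuming pyproject.toml marks the project root; find_project_root is a hypothetical helper, and installing the project as an editable package is a common alternative.

```python
import tempfile
from pathlib import Path


def find_project_root(start: Path, marker: str = "pyproject.toml") -> Path:
    """Walk upward from `start` until a directory containing `marker` is found."""
    start = start.resolve()
    for candidate in [start, *start.parents]:
        if (candidate / marker).exists():
            return candidate
    raise FileNotFoundError(f"No {marker} found at or above {start}")


# Demo in a throwaway directory that mimics the layout above.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp) / "my-project"
    (project / "notebooks").mkdir(parents=True)
    (project / "pyproject.toml").touch()

    found = find_project_root(project / "notebooks")
    is_root = found == project.resolve()
    print(f"found project root: {is_root}")

# In a real notebook, the first cell might then look like this:
#   import sys
#   sys.path.insert(0, str(find_project_root(Path.cwd())))
#   from src.utils import some_helper  # hypothetical function name
```

Keeping this lookup in the first cell means the same notebook works whether it is opened from JupyterLab or VS Code.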

Conclusion

Setting up Jupyter correctly is the difference between a messy experiment and a professional research tool. By using modern package managers like uv, integrating with VS Code, and handling version control with Jupytext, you turn Jupyter into a first-class citizen of your development workflow.

Remember: Jupyter isn’t where you write your app; it’s where you understand the problems your app is trying to solve.

Further Reading & Resources