Skip to main content

Setting up Fish to make virtualenv easier

Since I started my new job, I’ve been doing a lot more work in Python. As I was starting with a completely clean slate, I wanted to try setting up Python the “right” way – or if not “right”, at least better way than my previous pile of hacks and kludges.

(I don’t remember much of my old approach, but I know it was messy. At one point I used Homebrew and virtual environments, but I got burnt by Homebrew unexpectedly breaking Python, so I scrapped it and started installing everything in my global Python installation. Don’t do that.)

In August I read Glyph’s post Get Your Mac Python From Python.org and it all seemed like sensible advice, so I decided to use that as my starting point. I downloaded Python on my new work laptop from Python.org, and I started using virtual environments for everything.

This worked well enough, but there were some rough edges in my new workflow. I’ve been tweaking my Fish shell config to make it a bit smoother.

Setting the PIP_REQUIRE_VIRTUALENV variable

One recommendation in Glyph’s post is that you always use virtual environments, and they suggest a way to enforce that:

Once you have installed Python from python.org, never pip install anything globally into that Python, even using the --user flag. Always, always use a virtual environment of some kind. In fact, I recommend configuring it so that it is not even possible to do so, by putting this in your ~/.pip/pip.conf:

[global]
require-virtualenv = true

I like the idea of always using virtualenvs, but I’m not a fan of putting config files in my home directory. I struggle to keep them up-to-date, and after a while I lose track of what’s what – is this config still in use, or is it cruft from a tool I no longer use? Plus, each config file becomes one more thing to remember when I set up a new computer.

Fortunately, this config file isn’t the only way to ensure you always use a virtual environment. You can also set PIP_REQUIRE_VIRTUALENV, so I have the following lines in my fish shell config:

# This prevents me from installing packages with pip without being
# in a virtualenv first.
#
# This allows me to keep my system Python clean, and install all my
# packages inside virtualenvs.
#
# See https://docs.python-guide.org/dev/pip-virtualenv/#requiring-an-active-virtual-environment-for-pip
# See https://blog.glyph.im/2023/08/get-your-mac-python-from-python-dot-org.html#and-always-use-virtual-environments
#
set -g -x PIP_REQUIRE_VIRTUALENV true

Because I keep my shell config in Git, it’s easier to see when I added this variable, and when I get a new computer I’ll get this right behaviour “for free”.

A function to create and auto-enable new virtual environments

The process of creating new virtual environments is ostensibly simple – just two commands.

$ python3 -m venv .venv
$ source .venv/bin/activate.fish

In practice I only ever remembered to run the first – I’d create my new virtual environment, go to pip install something, and then it would complain I hadn’t enabled a virtual environment. I’d mutter and grumble, activate the virtualenv, and try again.

If I’m creating a virtual environment, I want to use it immediately, so I wrapped this process in a Fish function called venv:

function venv --description "Create and activate a new virtual environment"
    echo "Creating virtual environment in "(pwd)"/.venv"
    python3 -m venv .venv --upgrade-deps
    source .venv/bin/activate.fish

    # Append .venv to the Git exclude file, but only if it's not
    # already there.
    if test -e .git
        set line_to_append ".venv"
        set target_file ".git/info/exclude"

        if not grep --quiet --fixed-strings --line-regexp "$line_to_append" "$target_file" 2>/dev/null
            echo "$line_to_append" >> "$target_file"
        end
    end

    # Tell Time Machine that it doesn't need to both backing up the
    # virtualenv directory. (macOS-only)
    # See https://ss64.com/mac/tmutil.html
    tmutil addexclusion .venv
end

I typically run this in the root of a project directory, usually a Git repo.

When I run it, it creates a new virtual environment with an up-to-date version of pip (thanks to --upgrade-deps), then it activates it immediately. This means my next command can be a pip install, and it’ll run inside the new virtualenv.

It also adds the .venv directory to .git/info/exclude, which is a local-only gitignore file. This means that Git will ignore my virtual environment, and not try to save it. The grep command is checking that I haven’t already gitignore-d .venv, so I don’t add repeated ignore rules.

It also tells Time Machine not to bother backing up the virtual environment directory. I’d never restore a virtualenv from a backup; I’d just create a new one fresh, so backing it up is a waste of space and CPU cycles.

I often combine this with another function I have for creating temporary directories:

function tmpdir --description "Create and switch into a temporary directory"
    cd (mktemp -d)
end

like so:

$ tmpdir; venv

And with two short commands, I’m in an empty directory with a fresh virtual environment. This is great for quick prototyping, experiments, and one-off projects.

Auto-activating virtual environments when I switch directories

Once I’ve created my virtual environments, I need to remember to activate them.

I could do this manually, or I could have the computer look for virtualenvs and (de)activate them automatically for me. There are various plugins for doing this (I used virtualfish a few years ago), but this time round I realised my needs were simple enough that I could just write my own function.

My venv function ensures a standard approach to virtualenv naming: I always call them .venv, and I put them in the root of my project directories, which are always Git repos. This means I can find if there’s a virtualenv I want to auto-activate by looking to see if I’m in a Git repo, then looking for a folder called .venv.

This is the function:

function auto_activate_venv --on-variable PWD --description "Auto activate/deactivate virtualenv when I change directories"

    # Get the top-level directory of the current Git repo (if any)
    set REPO_ROOT (git rev-parse --show-toplevel 2>/dev/null)

    # Case #1: cd'd from a Git repo to a non-Git folder
    #
    # There's no virtualenv to activate, and we want to deactivate any
    # virtualenv which is already active.
    if test -z "$REPO_ROOT"; and test -n "$VIRTUAL_ENV"
        deactivate
    end

    # Case #2: cd'd folders within the same Git repo
    #
    # The virtualenv for this Git repo is already activated, so there's
    # nothing more to do.
    if [ "$VIRTUAL_ENV" = "$REPO_ROOT/.venv" ]
        return
    end

    # Case #3: cd'd from a non-Git folder into a Git repo
    #
    # If there's a virtualenv in the root of this repo, we should
    # activate it now.
    if [ -d "$REPO_ROOT/.venv" ]
        source "$REPO_ROOT/.venv/bin/activate.fish" &>/dev/null
    end
end

This function runs as an event handler in Fish – it runs whenever the PWD variable changes. That variable is the current working directory, so in practice this runs whenever I change directories.

I find the top-level directory of the current Git repo by running git rev-parse --show-toplevel, which is a super handy command I use in lots of scripts. If I’m not in a Git repo, it returns an empty string. Then I compare that to the path of the currently-enabled virtualenv in VIRTUAL_ENV, and decide whether I need to activate or deactivate a virtualenv.

If you want the complete code, my Fish shell config is in a public repo, although the virtualenv stuff is a bit spread out.

This was the first project where I used ChatGPT to help write the code. I was initially quite sceptical of LLMs, but watching what Simon Willison has been doing persuaded me to try it. This felt like a safe project to try it – it’s a minimal project with clearly defined “is the code working” criteria, and limited impact if I do something daft.

Overall I was quite impressed. All the code seemed to work, and it was helpful for the bits of shell syntax I only half-remember – things like test -z and combining multiple conditions in a boolean. I didn’t use any of its output directly, but it was a good starting point that I could adapt into my actual code. I’m sure this won’t be my last project where ChatGPT lends a helping hand.