Missing the ground

git worktree workspace workflow

git worktree, despite being almost a decade old, remains an obscure command occasionally referenced in the "top 10 things you didn't know about Git"-style listicles. It isn't even mentioned once in the git book! For how useful it is, that is a travesty.

Worktrees allow you to check out the same repository at multiple points in history at the same time. A git clone with no flags checks out a single main worktree and houses the git directory, .git, inside it. Running git checkout <ref> replaces the contents of the main worktree with the tree that <ref> points to; git worktree add <path> <ref>, meanwhile, creates a separate worktree at <path> and leaves the main one alone. Having this option is indispensable when dealing with repositories you're maintaining for a long time.
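To see this in action without touching a real project, here is a minimal sketch that builds a throwaway repository in a temporary directory and gives it a second worktree. The directory names, the demo identity, and the issue-1234 branch are placeholders chosen for illustration, not anything git requires.

```shell
set -e
demo=$(mktemp -d)                 # scratch directory (assumption: any empty directory works)
cd "$demo"

# Throwaway identity so the commit below works on an unconfigured machine.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q repo
git -C repo symbolic-ref HEAD refs/heads/main      # call the initial branch "main" on any git version
git -C repo commit -q --allow-empty -m "initial commit"

# Create a linked worktree next to the main one, on a new branch starting at main.
git -C repo worktree add -b issue-1234 "$demo/issue-1234" main

# Both checkouts now exist at the same time, each on its own branch.
main_branch=$(git -C repo rev-parse --abbrev-ref HEAD)
linked_branch=$(git -C issue-1234 rev-parse --abbrev-ref HEAD)
worktree_count=$(git -C repo worktree list | wc -l | tr -d ' ')
echo "$main_branch $linked_branch $worktree_count"   # prints: main issue-1234 2
```

git worktree list is the quickest way to find out which worktrees a repository currently has and where they live.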

The problem

A wild urgent issue has appeared: your distributions are broken on Windows! The fix touches several repositories: the server and client both need to be updated, as well as the distribution toolset. Worst of all, this branch will stick around for a while as you assemble your packages, publish them to the dev registry, and test they are assembled and deployed correctly. The tests run in CI, so you want to switch back to the cool feature you had originally been working on while you wait.

In a more traditional git workflow, you'd have two options.

Stashing

You can stash [1] (or temporarily commit) the feature you're working on, and switch back and forth between branches cool-feature and issue-1234.

You now have to continuously stash and pop (or commit and amend) partially completed changes, cursing loudly on occasions you've made changes while having the wrong branch checked out. Depending on your IDE, it might decide to rescan the entire working directory, completely locking you out of editing in the meantime. Not great.

Cloning

Alternatively, you can clone each of the repositories into new working directories. Maybe you even organize them into a separate directory, too.

That's a much nicer approach, but now your pre-push hook isn't running, and you're pushing changes that make the lint job in CI sad, costing you time. Or you need to fetch from the upstream to rebase your new changes, but, dagnabbit, the remote is not set up in the new clone. Need some changes from another clone? Be prepared for a completely unnecessary network round trip, or add the local path as another remote. Each one is a minor annoyance at worst, but these paper cuts still manage to hurt every time.

The workflow

Good news: you can seamlessly share the .git configuration between multiple copies of the same repository. Here's how.

Clone all the repositories you're interested in to a separate directory. I'm calling mine main-workspace:

~/workspace
`-- main-workspace
    |-- client
    |-- distribution
    `-- server

This is going to house the main worktrees of each repository you're working on.

Creating a new workspace for a task is a breeze:

mkdir issue-1234-workspace
for repo in server client distribution; do
	# create branch issue-1234 from main and check it out in a new linked worktree
	git -C "main-workspace/$repo" worktree add -b issue-1234 "$PWD/issue-1234-workspace/$repo" main
done

You now have a nice cordoned off area where you can work on issue 1234 without affecting your existing work.

~/workspace
|-- issue-1234-workspace
|   |-- client
|   |-- distribution
|   `-- server
`-- main-workspace
    |-- client
    |-- distribution
    `-- server

These new worktrees you've just created are called "linked". They don't have their own git directory: the main git directory is still managing them, even though they can be in a completely arbitrary place in the filesystem.
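If you're curious what "linked" looks like on disk, the sketch below pokes at a throwaway repository. The names main-repo and linked are made up for the demo; the point is only that a linked worktree carries a small .git file instead of a .git directory, and that both worktrees resolve to the same shared git directory.

```shell
set -e
demo=$(mktemp -d)                 # scratch directory for the demo
cd "$demo"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q main-repo
git -C main-repo symbolic-ref HEAD refs/heads/main
git -C main-repo commit -q --allow-empty -m "initial commit"
git -C main-repo worktree add -b issue-1234 "$demo/linked" main

# In the main worktree .git is a directory; in a linked one it is a one-line text file.
[ -d main-repo/.git ] && echo "main:   .git is a directory"
[ -f linked/.git ]    && echo "linked: .git is a file"
cat linked/.git       # gitdir: <...>/main-repo/.git/worktrees/linked

# Both worktrees resolve to the same shared git directory.
main_common=$(cd main-repo && cd "$(git rev-parse --git-common-dir)" && pwd -P)
linked_common=$(cd linked && cd "$(git rev-parse --git-common-dir)" && pwd -P)
[ "$main_common" = "$linked_common" ] && echo "shared git directory: $main_common"
```

That pointer file is why a linked worktree can sit anywhere on the filesystem: git follows it back to the main git directory.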

I recommend creating a full workspace even when only one project is relevant to the feature. Not only does it make the workflow more uniform, but if you realize you do need to change another repo, you can simply add another worktree to the same workspace.

When you're done, delete the branch and the workspaces:

for repo in issue-1234-workspace/*; do
	git -C "$repo" checkout --detach      # can't delete a branch that is checked out!
	git -C "$repo" branch -D issue-1234
	git -C "$repo" worktree remove .      # remember that `-C` changes the working directory
done
rm -r issue-1234-workspace

That's all there is to it!

Sharing is caring

As if keeping the state of your working directory intact was not enough incentive, there's more: the .git directory is shared between all worktrees. Not only does that save disk space now that the objects aren't duplicated across all the working copies, you get benefits clone just can't provide.

Any local configuration is automatically shared between your worktrees. That includes hooks, configuration variables, and the exclude file [2]. All remote configuration is also shared. You only need to git remote add geoff git@github.com:geoff/repo.git once per collaborator per repo, and it stays available forever. References are shared as well: fetch from upstream in one worktree, and you can rebase in all of them, secure in the knowledge that the remote ref is up to date across the board.
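Here is a small sketch of that sharing, again in a throwaway repository. The remote URL, the rerere.enabled setting, and the scratch branch are arbitrary stand-ins picked for the demo; nothing is ever fetched from the remote.

```shell
set -e
demo=$(mktemp -d)                 # scratch directory for the demo
cd "$demo"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q repo
git -C repo symbolic-ref HEAD refs/heads/main
git -C repo commit -q --allow-empty -m "initial commit"
git -C repo worktree add -b issue-1234 "$demo/issue-1234" main

# Make three changes from inside the linked worktree...
git -C issue-1234 remote add geoff https://example.com/geoff/repo.git   # placeholder URL, never contacted
git -C issue-1234 config rerere.enabled true                            # any local setting would do
git -C issue-1234 branch scratch                                        # a new local ref

# ...and read all three back from the main worktree.
remotes=$(git -C repo remote)
setting=$(git -C repo config --get rerere.enabled)
branch=$(git -C repo for-each-ref --format='%(refname:short)' refs/heads/scratch)
echo "$remotes $setting $branch"   # prints: geoff true scratch
```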

Sometimes not sharing is also caring, if of a different kind. Anything gitignored, like build artifacts, obviously won't be shared. A long-running benchmark? Put it in a separate worktree. Profiling two different versions of the repo? Put them in different worktrees. Gone are the days of repeated checkouts and lengthy rebuilds.

Something spicier

If you're happy with the setup so far, you can stop reading here. I've told you all you need to know to be effective with workspaces.

In reality I use a more unhinged version of the above workflow: instead of having a main workspace, I've mirrored the repos I need into a central directory. A mirror is a special kind of bare repository, created using git clone --mirror <url>. A bare repository has no main worktree and is mainly meant to be cloned from.

The rest of the workflow remains pretty much unchanged.

~/workspace
|-- git
|   |-- client.git
|   |-- distribution.git
|   `-- server.git
|-- feature-workspace
|   |-- client
|   `-- server
`-- issue-1234-workspace
    |-- client
    |-- distribution
    `-- server
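A layout like the one above can be sketched as follows. So that it runs offline, a local throwaway repository (upstream-src, a name invented here) stands in for the URL you would normally mirror; only the server repository is shown.

```shell
set -e
demo=$(mktemp -d)                 # plays the role of ~/workspace
cd "$demo"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Stand-in for the hosted repository (assumption: replace with your real URL).
git init -q upstream-src
git -C upstream-src symbolic-ref HEAD refs/heads/main
git -C upstream-src commit -q --allow-empty -m "initial commit"

mkdir git issue-1234-workspace
git clone -q --mirror "$demo/upstream-src" git/server.git

# Worktrees are added from the bare mirror exactly as they were from a main worktree.
git -C git/server.git worktree add -b issue-1234 "$demo/issue-1234-workspace/server" main

is_bare=$(git -C git/server.git rev-parse --is-bare-repository)
branch=$(git -C issue-1234-workspace/server rev-parse --abbrev-ref HEAD)
echo "$is_bare $branch"           # prints: true issue-1234
```

Every worktree created this way is a linked one, so any workspace directory can be removed without losing the repository itself.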

Not having a main worktree that takes priority is a little tidier: all workspaces are equal and can be deleted safely at any point. That means there's no "main" feature you're working on, and no main workspace every repository is checked out into, even though you only work on a handful at a time.

The reason I'm calling this workflow slightly unhinged is that since bare repositories in general, and mirrors in particular, aren't designed with worktrees in mind, they come with a few idiosyncrasies.

Forceful pushes

If you set your repositories up as mirrors, every git push is a force-push: with mirror = true set on the remote, a plain git push behaves like git push --mirror, which force-updates every ref on the remote and deletes remote refs that no longer exist locally. This can be a good thing! It means your local clones are authoritative, and the GitHub (other hosters are available) forks are only used to PR your changes to the upstream repository. If you collaborate on your fork with others, however, this can be disastrous, as you'll clobber any changes pushed to your fork by anyone but yourself. It doesn't come up too often in my experience, but it has come up before.

This also means this approach doesn't translate well to multiple devices. I only use this workflow on my work machine, and if I need to make any changes from another machine I just clone normally. These clones are short-lived, and my workflow is all about dealing with medium- to long-term work.

Forget about git pull

If you try to pull in this setup, git gets confused:

There is no tracking information for the current branch.
Please specify which branch you want to merge with.
See git-pull(1) for details.

    git pull <remote> <branch>

If you wish to set tracking information for this branch you can do so with:

    git branch --set-upstream-to=<remote>/<branch> some-branch

git fetch origin fails with an error:

fatal: refusing to fetch into branch 'refs/heads/some-branch' checked out at '/home/user/workspace/some-workspace/repo'

This is because the refspec for origin in a mirrored repository is not what you'd normally expect.

After a default clone, the origin is set up as a normal remote, and branches fetched from it have separate refs:

[remote "origin"]
	url = git@github.com:me/git.git
	fetch = +refs/heads/*:refs/remotes/origin/*

Let me break that down.

The format of a refspec is <src>:<dest> (ignore the +, which just lets git update the ref even when the update isn't a fast-forward). In this case, any reference on the remote matching refs/heads/<branch> will be represented by a local reference refs/remotes/origin/<branch>. You may be more familiar with the origin/<branch> name, which is just shorthand for refs/remotes/origin/<branch>.

A mirrored repository instead assigns the exact same refs to the branches:

[remote "origin"]
	url = git@github.com:me/git.git
	fetch = +refs/*:refs/*
	mirror = true

Every reference—every branch head, every tag, every remote branch—is mirrored exactly between your origin and the mirror.
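You can check both configurations yourself. The sketch below clones the same throwaway repository twice, once normally and once with --mirror, and prints what git wrote for origin. The src, normal, and mirror.git names are invented for the demo, and a local path stands in for a real URL.

```shell
set -e
demo=$(mktemp -d)                 # scratch directory for the demo
cd "$demo"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q src
git -C src symbolic-ref HEAD refs/heads/main
git -C src commit -q --allow-empty -m "initial commit"

git clone -q "$demo/src" normal
git clone -q --mirror "$demo/src" mirror.git

normal_refspec=$(git -C normal config --get remote.origin.fetch)
mirror_refspec=$(git -C mirror.git config --get remote.origin.fetch)
mirror_flag=$(git -C mirror.git config --get remote.origin.mirror)
echo "$normal_refspec"            # prints: +refs/heads/*:refs/remotes/origin/*
echo "$mirror_refspec"            # prints: +refs/*:refs/*
echo "$mirror_flag"               # prints: true

# origin/main really is shorthand for the full name under refs/remotes/.
full_name=$(git -C normal rev-parse --symbolic-full-name origin/main)
echo "$full_name"                 # prints: refs/remotes/origin/main
```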

Now watch what happens after git remote add upstream git@github.com:git/git.git:

[remote "origin"]
	url = git@github.com:me/git.git
	fetch = +refs/*:refs/*
	mirror = true
[remote "upstream"]
	url = git@github.com:git/git.git
	fetch = +refs/heads/*:refs/remotes/upstream/*

Notice the issue? If you attempt to track e.g. upstream/main (which, remember, is shorthand for refs/remotes/upstream/main), git can't tell if it needs to check origin or upstream, since both fetch destination refspecs match!

fatal: not tracking: ambiguous information for ref 'refs/remotes/upstream/main'
hint: There are multiple remotes whose fetch refspecs map to the remote
hint: tracking ref 'refs/remotes/upstream/main':
hint:   origin
hint:   upstream
hint:
hint: This is typically a configuration error.
hint:
hint: To support setting up tracking branches, ensure that
hint: different remotes' fetch refspecs map into different
hint: tracking namespaces.

This is perhaps the biggest drawback of this approach. Fortunately, since pull is by definition a fetch followed by either a merge or a rebase, the workaround is really simple:

$ git fetch upstream && git merge upstream/main
# or
$ git fetch upstream && git rebase upstream/main
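If typing both halves gets old, the pair can be wrapped in a tiny shell function. sync_with is a hypothetical helper of my own naming, not a git built-in, and the demo around it uses one local throwaway repository as a stand-in for both origin and upstream so that it runs offline.

```shell
set -e
demo=$(mktemp -d)                 # scratch directory for the demo
cd "$demo"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: fetch REMOTE, then rebase (default) or merge onto REMOTE/BRANCH.
sync_with() {
    remote=$1; branch=$2; mode=${3:-rebase}
    git fetch -q "$remote" && git "$mode" "$remote/$branch"
}

# Stand-in for the hosted repository.
git init -q upstream-src
git -C upstream-src symbolic-ref HEAD refs/heads/main
git -C upstream-src commit -q --allow-empty -m "first"

git clone -q --mirror "$demo/upstream-src" repo.git        # origin: the mirror remote
git -C repo.git remote add upstream "$demo/upstream-src"   # upstream: gets the usual refspec
git -C repo.git worktree add -b some-branch "$demo/work" main

git -C upstream-src commit -q --allow-empty -m "second"    # upstream moves ahead

(cd work && sync_with upstream main)                       # or: sync_with upstream main merge

work_head=$(git -C work rev-parse HEAD)
upstream_head=$(git -C upstream-src rev-parse HEAD)
[ "$work_head" = "$upstream_head" ] && echo "some-branch is up to date with upstream/main"
```

Since the remote and branch are always named explicitly, the helper never depends on tracking information, which is exactly the part that is ambiguous in a mirror.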

Further Reading

https://git-scm.com/docs/git-worktree (also available through git help worktree or man git-worktree)

https://git-scm.com/docs/git-clone (also available through git help clone or man git-clone)

https://git-scm.com/book/en/v2/Git-Internals-The-Refspec

Footnotes

  1. Do me a favor. Run git stash list, stare at the output for a bit, then run git stash clear. You'll feel better.

  2. In case you didn't know, the exclude file, normally located at .git/info/exclude, acts exactly like .gitignore, but isn't committed to the repository. Treat it as your own secret .gitignore so you don't have to commit files specific to your machine or workflow: your preferred tool configs, editor files, random shell scripts, or OS-specific metadata like Thumbs.db or .DS_Store. You can also specify a global exclude file in the core.excludesFile configuration option.