Thursday, August 1, 2013

Rebasing makes collaboration harder

Thanks to certain version control systems' making these operations too attractive, history rewriting, e.g. rebase and squashed merge, of published revisions is currently quite popular in free software projects.  What does the git-rebase manpage, an otherwise advocate of the practice, have to say about that?

Rebasing (or any other form of rewriting) a branch that others have based work on is a bad idea: anyone downstream of it is forced to manually fix their history.

The manpage goes on to describe, essentially, cascading rebase.  I will not discuss further here why it is a bad idea.

So, let us suppose you wish to follow git-rebase's advice, and you wish to alter history you have made available to others, perhaps in a branch in a public repository.  The qualifying question becomes: "has anyone based work on this history I am rewriting?"

There are four ways in which you might answer this question.

  1. Someone has based work on your commits; rewriting history is a bad idea.
  2. Someone may have or might yet base work on your commits; rewriting history is a bad idea.
  3. It's unlikely that someone has based work on your commits so you can dismiss the possibility; the manpage's advice does not apply.
  4. It is not possible that someone has or will yet based work on your commits; the manpage's advice does not apply.

If you have truly met the requirement above and made the revisions available to others, you can only choose #4 if you have some kind of logging of revision fetches, and check this logging beforehand; this almost never applies, so it is not interesting here.  Note: it is not enough to check other public repositories; someone might be writing commits locally to be pushed later as you consider this question.  Perhaps someone is shy about sharing experiments until they're a little further along.

Now that we must accept it is possible someone has based changes on yours, even if you have dismissed it as unlikely, let's look at this from the perspective of another developer who wishes to build further revisions on yours.  The relevant question here is "should I base changes on my fellow developer's work?"  For which these are reasonable answers.

  1. You know someone has built changes on your history and will therefore not rewrite history, wanting to follow the manpage's advice.  It is safe for me to build on it.
  2. You assume someone might build changes on your history, and will not rewrite it for the same reason as with #1.  It is safe for me to build on it.
  3. You've dismissed the possibility of someone like me building on your history, and might rebase or squash, so it is not safe for me to build on it.

I have defined these answers to align with the earlier set, and wish to specifically address #3.  By answering #3 to the prior question, you have reinforced the very circumstances you might think you are only predicting.  In other words, by assuming no one will wish to collaborate on your change, you have created the circumstances by which no one can safely collaborate on your change.  It is a self-fulfilling prophecy that reinforces the tendency to keep collaboration unsafe on your next feature branch.

In this situation, it becomes very hard to break this cycle where each feature branch is "owned" by one person.  I believe this is strongly contrary to the spirits of distributed version control, free software, and public development methodology.

In circumstances with no history rewriting, the very interesting possibility of ad hoc cross-synchronizing via merges between two or more developers on a single feature branch arises.  You work on your parts, others work on other parts, you merge from each other when ready.  Given the above, it is not surprising to me that so many developers have not experienced this very satisfying way of working together, even as our modern tools with sophisticated merge systems enable it.