Prompted by a conversation last week with Brian which touched on Subversion and Git, I decided to have another go at grokking distributed version control. I confess that I'm probably hopelessly brain-damaged on this score; I can't help it: I started out with version control systems in the days of SCCS, graduated to RCS, was forced to deal with the abomination that was PVCS, migrated to CVS, and have largely been reasonably OK (though not ecstatically happy) with Subversion for the past several years and a half. So I can't really be blamed for my difficulties getting to grips with distributed version control, can I? I learned all I know about the subject back in the Dark Ages.
But, hey! I'm a distributed worker kind of guy. I'm sure I can figure this out, even at my advanced age.
Rather than tackle the Swiss Army Chainsaw that is Git, I thought I'd give Mercurial a second go. I lucked into Spolsky's HgInit tutorial which seems a lot more approachable than other tutorials I've seen to date, and a lot shorter than The Mercurial Book. Almost immediately I ran into a passage that stopped me short with the thought, "If this is how people are using Subversion, no wonder they want to move onto something better!"
Joel on Subversion
Now, here’s how Subversion works:No, that's not me he's talking about; that's some other Mike.
* When you check new code in, everybody else gets it.
Since all new code that you write has bugs, you have a choice.
* You can check in buggy code and drive everyone else crazy, or
* You can avoid checking it in until it’s fully debugged.
Subversion always gives you this horrible dilemma. Either the repository is full of bugs because it includes new code that was just written, or new code that was just written is not in the repository.
As Subversion users, we are so used to this dilemma that it’s hard to imagine it not existing.
Subversion team members often go days or weeks without checking anything in. In Subversion teams, newbies are terrified of checking any code in, for fear of breaking the build, or pissing off Mike, the senior developer, or whatever.
Wrong. All wrong!
As luck would have it I was discussing repository-management strategies just last week with a client's (new) development team, and suggesting that they use a much more aggressive strategy than they've ever seen before: Multiple checkins per day by every developer. Maybe go so far as to tie the "File-Save" key to "checkin". Anytime a developer does not make a checkin for 2 days in a row there's almost certainly a problem!
How do we achieve this without the tears and craziness described by Spolsky? Simple! Have every developer working in their own private branch. Or even flipping between a variety of private branches as they switch between tasks. (Yes, I know its not the most productive way to work, but sometimes we have to respond to demands from the outside world, so we do have to take the hit of task-switching.)
I suggested a structure where each developer simply gets a private piece of the repository to work in. Anything that's broken in there is your own problem, but doesn't affect anybody else on the team. When you're satisfied that your branch won't break the world you're ready to merge back to the main development line and integrate your work with your colleagues'. And yes, then you might have some merge conflicts, but I don't really see how any version control system can avoid this; you fix the conflicts and 'Lo! the build is intact. This does imply, though, that you want to merge quite frequently. At least every day or two. Or every time your private branch builds and tests clean. Or maybe just builds clean. All depends on your team - team size, maturity, process-maturity, personal temperaments,... One must study this very hard.
I suppose that the hangups about branching and merging come from the days of CVS, where branching was really, really expensive, and merging really, really difficult. Admittedly, too, earlier versions of Subversion were also not too hot on the merge side of things. (Though I guess it is still work-in-progress and we may yet see some improvements there.)
In recent times I have been using Subversion branches very aggressively. Frequently I'll find myself flipping between as many as 6 or 8 branches on related modules, merging them, abandoning them,... and this is on a one-man project! It means that I have to use branch-names that are pretty long and descriptive, otherwise I would soon lose myself in the forest of twisty little names.
But really, I don't see the dilemma Joel talks about in the quote above. I'll readily agree that Subversion's merging still needs some work: It can be quite counterintuitive and error prone until you get the habits right. But this Big Hairy Deal about breaking the build? Doesn't exist if you just use Subversion right!
Go forth and branch!
Maybe I'm making a mountain out of a molehill when it comes to Hg... Maybe I'll fall in love with it yet, if it makes this style of working easier for me. There's hope for the old fart, yet.