Tuesday, December 19, 2006

Subversion Move/Rename Feature

I have read a lot of mailing list threads over the last couple of years in which projects debate whether to move from CVS to Subversion. Usually this happens when their hosting provider adds Subversion support, as the Eclipse Foundation recently did: #131096. I see two primary features in Subversion that are mentioned when people want to switch:
  1. Support for atomic transactions.
  2. Support for tracking moved/renamed files and folders.
Anyone who has ever been burned by the lack of atomic transactions in CVS knows the value of having that feature, and I'd argue the presence of that feature alone justifies the transition to Subversion. However, I'd also say that even more people mention wanting improved support for move/rename as the big reason for switching.

Here is the problem. The support for move/rename in Subversion is often not what users expect it to be.

The support for move/rename in Subversion is really about maintaining history. If I rename a file, commit it and then examine its history, I can see the history of the file all the way back to when it was created, across any move boundaries. Likewise, if I want to get a copy of what a file looked like last year, I can do so without needing to know what the file was named last year. If these are the sort of features you are after, then Subversion provides them very well.
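One way to picture this history behavior is the toy model below. It is only a sketch, not how Subversion stores anything: it assumes each renamed file records the path and revision it was copied from, and the `LOG` data and the `history` and `name_at` helpers are hypothetical names made up for illustration.

```python
# Toy model only: hypothetical records, not Subversion's real storage format.
# Each path lists the revisions that changed it and, if it was created by a
# rename, the (path, revision) it was copied from.
LOG = {
    "foo/bar": {"revs": [1, 2], "copied_from": None},
    "foo/baz": {"revs": [3, 4], "copied_from": ("foo/bar", 2)},
}


def history(path, log):
    """Return [(revision, path)] newest first, following copy-from links."""
    result = []
    max_rev = None  # ignore source revisions newer than the copy point
    while path is not None:
        entry = log[path]
        for rev in sorted(entry["revs"], reverse=True):
            if max_rev is None or rev <= max_rev:
                result.append((rev, path))
        path, max_rev = entry["copied_from"] or (None, None)
    return result


def name_at(path, rev, log):
    """Return the name the file had at the given revision, or None."""
    for r, p in history(path, log):
        if r <= rev:
            return p
    return None


print(history("foo/baz", LOG))
print(name_at("foo/baz", 1, LOG))
```

Asking for the history of foo/baz walks back through the revisions made under the old name foo/bar, and looking up an old revision finds the file without the caller knowing what it was called at the time.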

So what are the features that Subversion does not provide? For starters, you might want to take a look at issue #898 in the Subversion issue tracker.

Subversion implements a move/rename as a copy followed by a delete. The fact that the new file is a copy is the reason you get support for the previous history. The negative side effect of this can be seen in this simple scenario:
  1. Suppose you have a versioned folder named foo with a file named bar.
  2. Users A and B check out foo to make some changes.
  3. User A renames bar to baz and commits the change.
  4. User B makes some changes to the code in bar and attempts to commit. The commit fails because bar is out of date. So far this is all normal and expected. User B then runs svn update. What happens?
  5. svn update will add new file baz to the working copy of user B, and bar will remain in the working copy as an unversioned file (if it did not contain modifications it would have been deleted). I think that most people would expect that the local copy of bar would be renamed to baz and now contain user B's changes merged with whatever changes were in the repository as a normal update would do.
  6. User B has to recognize this is what happened and transfer the changes to baz before committing. Of course, they might instead run svn add on file bar to put it back and then commit it, perhaps because they did not really recognize or understand what happened.
This is just a trivial example that shows a potentially big problem. Imagine the above scenario played out as a large refactoring -- which is entirely possible since that was one of the motivations in moving to Subversion to begin with.
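The scenario above can be sketched as a small toy model. This is not Subversion's implementation. It only mimics the behavior described in the steps, assuming a rename arrives as an add plus a delete, and the function names `rename_as_copy_delete` and `update` are hypothetical.

```python
# Toy model only: it mimics the behavior described in the post and is
# NOT how Subversion is implemented.

def rename_as_copy_delete(old, new, content):
    """Record a rename as two separate changes: add the new path, delete the old."""
    return [("add", new, content), ("delete", old, None)]


def update(working_copy, locally_modified, changes):
    """Apply incoming changes to a working copy.

    working_copy: dict mapping versioned path -> content
    locally_modified: set of paths with uncommitted edits
    Returns (versioned, unversioned) dicts after the update.
    """
    versioned = dict(working_copy)
    unversioned = {}
    for action, path, content in changes:
        if action == "add":
            versioned[path] = content
        elif action == "delete":
            old_content = versioned.pop(path)
            # A locally modified file is left behind as an unversioned
            # file; an unmodified one simply disappears.
            if path in locally_modified:
                unversioned[path] = old_content
    return versioned, unversioned


# User A renamed bar to baz; user B has local edits in bar.
incoming = rename_as_copy_delete("bar", "baz", "original")
wc_b = {"bar": "original + B's edits"}
versioned, unversioned = update(wc_b, {"bar"}, incoming)
print("versioned:  ", versioned)
print("unversioned:", unversioned)
```

Because the add and the delete are unrelated changes in this model, nothing connects user B's edits in bar to the new file baz. That missing link is the gap the scenario illustrates.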

This problem also manifests in other commands like merge. Imagine you have a branch with some customizations you are working on. A refactoring happens on trunk which involves the renaming of several files and folders. You now have lost the ability to use merge to bring the changes in the branch back to trunk, or vice versa. Or, more accurately, you have lost a lot of the automation that Subversion can normally perform for you during this operation.

If you read the referenced issue from the Subversion issue tracker you will see that solving this problem is a big challenge. I would not take the lack of this feature as a reason not to use Subversion; obviously I'd recommend that you DO use Subversion. However, it is important to understand how the tool works and what it does and does not do before making the switch. My big fear is that a project switches so that it can do a bunch of refactoring work, then runs into this limitation in Subversion and becomes angry. I think the advantage Subversion gives you is that you often can do those refactorings and take advantage of the history support that Subversion provides. However, if you are going to do this, you need to communicate with the rest of the team and coordinate the work so that it does not disrupt everyone.

10 comments:

Anonymous said...

Thanks for the good information. Your description of the problem with rename and the exact scenario where it can occur has been helpful in my evaluation of Subversion. It turns out I have a two-person team with good separation and communication, so the problem is not likely to affect me. I do encourage the svn developers to take on the challenge of implementing a proper rename. I believe it is a worthy challenge.

Mik said...

(just found this post now)

Thanks for the very informative post Mark. This issue confused me in the past and I never looked into it, so it's great to see a summary of what needs to be communicated to others during big refactorings and what to expect from the tool.

Sabeeh747 said...

Mark, thanks for your posting. Has the issue you described regarding the move/rename feature in Subversion been at all addressed (I see you posted this over a year ago) in recent releases? I am currently on a project in the middle of a decision point as to which source control we should use - MS Visual Source Safe, CVS, or Subversion. Please provide any advice you can, especially as far as any update on the move/rename issue you raise in your posting. Thanks!

Mark Phippard said...

SVN 1.5 will have the same behavior as previous releases. I should point out that neither VSS nor CVS handles this either. They are both essentially dead products as well.

The next release after 1.5 is going to tackle this problem. There is a branch in the Subversion repository that has made good progress on the issue.

Mark

sabeeh747 said...

Hi Mark,

Thank you for your response. Since my project will be using the Visual Studio .NET development environment, do you recommend using a Subversion tool such as VisualSVN (http://www.visualsvn.com/) that integrates with Visual Studio, or sticking with a non-integrated tool such as TortoiseSVN (http://tortoisesvn.net/downloads)? If you have any other recommendations for good Subversion interface tools, please share them. Thanks again!

Sabeeh

Mark Phippard said...

There is an open source Visual Studio integration, AnkhSVN, that is being relaunched today:

http://ankhsvn.open.collab.net/

There is a pretty nice community of developers that have come together around the project, so it is worth looking at and providing feedback to it.

VisualSVN seems to be a good tool, and the license fee is fairly low. As you point out, TortoiseSVN is also a valid option. VisualSVN actually uses TortoiseSVN for a lot of the UI and functionality.

Sabeeh747 said...

Mark, I have been taking a look at AnkhSVN through the link you provided. I know that with VisualSVN, TortoiseSVN has to be installed before installing VisualSVN. I am trying to figure out if it's a similar case with AnkhSVN: does something have to be installed before running the installation for AnkhSVN?

Thanks for your help!

Sabeeh

Mark Phippard said...

No. AnkhSVN is a complete SVN client along the lines of say Subclipse. It includes the Subversion libraries it needs and talks to them directly.

Sabeeh747 said...

Mark, thank you for your prompt response. I just need a little more clarification on certain points. From what I understand (if I understand correctly), the Subversion repository can either be a flat-file system as suggested on http://ankhsvn.open.collab.net/servlets/ProjectProcess?pageID=3794 or a database. Downloading AnkhSVN alone basically means I designate a specific directory as the repository. VisualSVN offers the VisualSVN Server as a free download at http://www.visualsvn.com/server/ and says that any Subversion client can connect to it. So I'm assuming I can use the AnkhSVN Subversion client and have it connect to VisualSVN Server. The 'Custom Setup' screenshot at http://www.visualsvn.com/server/ shows that you can change the repository location. So my questions are: 1) Do you know for a fact that a Subversion client such as AnkhSVN could connect to VisualSVN Server? 2) How are we to know whether the repository is a flat-file system or a database, or for the purpose of usage, is this not really important or relevant? 3) Does a team-oriented development environment necessitate the installation of VisualSVN Server?

Sabeeh

Mark Phippard said...

VisualSVN Server is simply a GUI to assist with setting up a Subversion server and help manage its repositories etc. It is not technically a server itself. It is an admin GUI on top of the normal SVN server. Any SVN client can talk to it.

All Subversion repositories are essentially databases. When you create a repository, you can pick the kind. The default, fsfs, uses the filesystem and creates files in folders etc. But it is still a database. You cannot work with the files directly or anything. The other format is BDB, which uses Berkeley DB to store information.

Any SVN server can serve either type of repository (provided it was compiled with BDB support included). Any SVN client can talk to any SVN server and the client does not know, or need to know, the repository format.

Mailing lists would be a better place to ask questions than the comments on my blog. I would recommend the forums on openCollabNet. http://www.open.collab.net/