If you don't care about retaining all the history of one of the repositories, you can just create a new directory under one project's repository, then import the other.
If you care about retaining the history of both, then you can use 'svnadmin dump' to dump one repository, and 'svnadmin load' to load it into the other repository. The revision numbers will be off, but you'll still have the history.
The above quote is from the Subversion FAQ at Tigris.org. After reading it, one might get the impression that merging two SVN repositories is a trivial process. That largely depends on your situation, of course, but it is usually not as streamlined as the quoted text implies. In this post I will try to detail how I managed to merge two repositories while keeping the history of both.
Due to certain circumstances we had to temporarily stop using the public Ra-Ajax SVN repository at Google Code and continue the development of Ra-Ajax using a private repository at our server. Now that these circumstances are gone, we wanted to merge all revisions from the repository at our server back to the Google Code repository so that we don’t lose any history of changes and can resume development using the Google Code repository.
The only tool I needed to do this was VisualSVN Server. It is a Subversion server for Windows that includes all the Subversion binaries we need. It also has an MMC snap-in that lets you manage your SVN repositories visually.
After installing VisualSVN Server, it is convenient to add its bin folder to the PATH environment variable, since we will be using several of its utilities from the command line.
Since we wanted to retain all history of both repositories, following the advice of the Subversion FAQ was the initial plan. However, using ‘svnadmin dump’ and ‘svnadmin load’ means that you must have access to the servers that host both repositories, because the two commands expect a local path to the repository and not a URL. That is possible with the repository we want to dump, which is the private repository at our server, but as far as I know it can’t be done with the repository at Google Code, which is the one we want to load the dump into.
To overcome this problem we can mirror the Google Code repository locally and work on this mirror instead. Then we can reset the Google Code repository to revision zero and sync it to this mirror, which would finally contain the result of merging the two repositories. If this seems a little vague to you now, continue reading and more details will come.
Dumping the Private Repository
Since I have RDP access to our server, I can use ‘svnadmin dump’ from a command line to achieve this, like so:
svnadmin dump /path/to/privaterepo > PrivateRepo.dump
This dumps the entire repository to the file PrivateRepo.dump. But here came another problem: our repository at the server hosts not only the one project we need to dump but several other projects as well. The repository structure can be outlined like this:
Since we are only interested in the project called ‘Ra’ we can use another subversion utility called svndumpfilter to filter out the unneeded projects from the dump file like this:
svndumpfilter include --drop-empty-revs --renumber-revs Ra < PrivateRepo.dump > Ra.dump
This will filter out the unneeded projects from PrivateRepo.dump and save the result to the Ra.dump file which should only include the project ‘Ra’. The optional arguments --drop-empty-revs and --renumber-revs are necessary here to remove any empty revisions resulting from filtering and to appropriately renumber the revisions that are left.
If this works for you then good. However, filtering does not always work as expected. svndumpfilter fails when an included path was copied from a path that is being filtered out, because the copy source no longer exists in the filtered dump. That happened in my case, with an error similar to this:
svndumpfilter: Invalid copy source path '/ProjectX'
The solution I used was to mirror (synchronize) project ‘Ra’ to its own dedicated local repository on my machine, and then dump that mirror instead.
Synchronizing Two Repositories
This can be done in the following steps:
1. Using VisualSVN Server Manager, create a new user. Let’s assume username: user1 and password: user1_pass
2. Right-click Repositories and create a new repository. We will call it ‘RaMirror’. It is important here to keep ‘Create default structure (trunk, branches, tags)’ unchecked since we want this repository to remain at revision zero in order to be able to sync it.
3. Right-click the repository ‘RaMirror’ and click on Properties. Here you should make sure that the user we have created in step one has read and write access to this repository.
4. Before we can sync the two repositories ‘Ra’ and ‘RaMirror’ we need to edit the hook scripts for ‘RaMirror’. If you installed VisualSVN Server accepting all defaults, the folder that contains the repository files for ‘RaMirror’ will usually be ‘C:\Repositories\RaMirror’. Under the hooks folder, you will find the default hook scripts.
These are Unix shell scripts. You can of course modify them to work on Windows if you want, but in our situation we don’t really need to do that. We can just rename all hook files to use the ‘cmd’ or ‘bat’ file extension instead of the default ‘tmpl’ extension so that they are executable on Windows.
The post scripts (post-commit, post-lock, post-unlock and post-revprop-change) can pretty much be emptied, because they are mostly used to send notification emails after the corresponding action takes place.
The other scripts (start-commit, pre-commit, pre-lock, pre-unlock and pre-revprop-change) run checks to make sure that the provided user credentials are allowed to perform commits, change revision properties, lock or unlock files, and so on.
Since we created the RaMirror repository locally and have full privileges and read/write access to it, we can replace the contents of these scripts so that they simply exit with a hard-coded success code; in a batch file, a single ‘exit 0’ line is enough.
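As a minimal sketch of this step, the Python snippet below writes a one-line batch stub for each hook. The function name ‘write_stub_hooks’ is made up for illustration and is not part of Subversion or VisualSVN Server; the hook names are the standard Subversion ones. Strictly speaking, svnsync only needs pre-revprop-change to succeed on the destination repository, so stubbing the rest is optional.

```python
import os

# Standard Subversion hook names. svnsync itself only requires
# pre-revprop-change to exit successfully on the destination repository.
HOOK_NAMES = [
    "start-commit", "pre-commit", "post-commit",
    "pre-revprop-change", "post-revprop-change",
    "pre-lock", "post-lock", "pre-unlock", "post-unlock",
]


def write_stub_hooks(hooks_dir, names=HOOK_NAMES):
    """Write <name>.bat files that always report success (exit code 0).

    Hypothetical helper: hooks_dir is whatever 'hooks' folder belongs to
    the local mirror repository.
    """
    written = []
    for name in names:
        path = os.path.join(hooks_dir, name + ".bat")
        # newline="" keeps the explicit CRLF line endings untouched.
        with open(path, "w", newline="") as f:
            f.write("@echo off\r\nexit 0\r\n")
        written.append(path)
    return written
```

Assuming the default layout mentioned in step 4, calling write_stub_hooks(r"C:\Repositories\RaMirror\hooks") would create the nine stubs. Only do this on a private, local mirror, since it disables every check the hooks would otherwise perform.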
5. We are now ready to sync the repository ‘RaMirror’ with the project that we want to dump, which is project ‘Ra’. To achieve this we will use another subversion utility called svnsync in two steps:
Firstly, we initialize the syncing process, like so:
svnsync init --source-username src_user1 --source-password src_user1_pass --sync-username user1 --sync-password user1_pass file:///Repositories/RaMirror svn://ra-ajax.org/Ra
Here we are using svnsync with the init subcommand. We are providing credentials using --source-username, --source-password for the source repository that we want to mirror, and --sync-username, --sync-password for the destination repository which is ‘RaMirror’. Then we provide the URL of the destination repository and the URL of the source repository respectively.
Secondly, we start the actual synchronization process using svnsync with the sync subcommand:
svnsync sync --source-username src_user1 --source-password src_user1_pass --sync-username user1 --sync-password user1_pass file:///Repositories/RaMirror
Here we only need to provide the URL of the destination repository. After this finishes successfully you should have a mirror of the source subversion repository with all its history and revisions. And now we can dump this mirror to a dump file:
svnadmin dump /Repositories/RaMirror > Ra.dump
This will dump all revisions to Ra.dump. However, the first revision just adds the same files and directories that already exist in the repository we want to load this dump file into, so we need to start at the first revision that contains actual changes. Assuming that this is revision 2 and that the last revision in the repository is 100, the command we should actually use is:
svnadmin dump --incremental -r 2:100 /Repositories/RaMirror > Ra.dump
Note that we also pass the --incremental option so that the first dumped revision, 2 in our case, describes only the changes made in that revision and not everything that existed in the repository as of that revision.
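If you would rather script this step, here is a rough Python sketch that builds the same command and writes the dump to a file. The two function names are my own and purely illustrative. The last revision number is not hard-coded; it can be read with ‘svnlook youngest’ on the mirror.

```python
import subprocess


def incremental_dump_cmd(repo_path, first_rev, last_rev):
    """Build the argument list for an incremental 'svnadmin dump' of a range."""
    if first_rev > last_rev:
        raise ValueError("first_rev must not exceed last_rev")
    return ["svnadmin", "dump", "--incremental",
            "-r", "%d:%d" % (first_rev, last_rev), repo_path]


def dump_to_file(repo_path, first_rev, last_rev, out_path):
    """Run the dump and write its binary output straight to out_path."""
    cmd = incremental_dump_cmd(repo_path, first_rev, last_rev)
    with open(out_path, "wb") as out:
        # check=True raises CalledProcessError if svnadmin exits non-zero.
        subprocess.run(cmd, stdout=out, check=True)
```

With the numbers assumed above, dump_to_file("/Repositories/RaMirror", 2, 100, "Ra.dump") runs the same command as the one shown earlier. It requires svnadmin to be on the PATH.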
I also similarly mirrored the repository at Google Code to a local SVN repository and named it ‘RaGMirror’.
Since the name of the repository we mirrored is ‘Ra’, the files and folders that reside in the root of the repository appear in Ra.dump with their paths prefixed by a ‘Ra’ folder. We need them to be created at the root when we load the dump file, not under a subfolder, so we need to do some editing. You can read more about this here.
I used Notepad++ to search for all instances of ‘Node-path: Ra/’ and replaced them all with ‘Node-path: ’. We also need to find the section that creates the ‘Ra’ subfolder itself and remove it. It is typically the node record whose headers read ‘Node-path: Ra’, ‘Node-kind: dir’ and ‘Node-action: add’.
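The same search and replace can be sketched in Python, which avoids opening a large dump file in an editor. This is only an illustration under a few assumptions: the function names are mine, the prefix is ‘Ra/’ as above, and I also rewrite ‘Node-copyfrom-path’ headers, which matters only if the dumped history contains copies. Like the editor approach, it would also touch any file content line that happens to start with one of these headers, and it does not remove the record that creates the ‘Ra’ folder, which still has to be deleted by hand.

```python
# Headers in a Subversion dump file that carry repository paths.
HEADERS = (b"Node-path: ", b"Node-copyfrom-path: ")


def strip_prefix(lines, prefix=b"Ra/"):
    """Yield dump lines with `prefix` removed from the path headers."""
    for line in lines:
        for header in HEADERS:
            if line.startswith(header + prefix):
                # Keep the header, drop only the leading folder name.
                line = header + line[len(header) + len(prefix):]
                break
        yield line


def rewrite_dump(src_path, dst_path, prefix=b"Ra/"):
    """Copy a dump file while rewriting path headers.

    Files are opened in binary mode so file contents are never re-encoded.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        dst.writelines(strip_prefix(src, prefix))
```

For example, rewrite_dump("Ra.dump", "Ra.edited.dump") leaves the original dump untouched, so a failed load can be retried from a known state.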
Then we can save the dump file and start to load it into the trunk of our local ‘RaGMirror’ repository:
svnadmin load --parent-dir trunk /Repositories/RaGMirror < Ra.dump
After the loading process is finished successfully, the ‘RaGMirror’ repository would contain all revisions and full history of the two repositories that we wanted to merge.
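This is where the revision numbers end up “off”, as the FAQ puts it: each loaded revision is committed as a new revision on top of whatever the target repository already contains. The small sketch below shows the arithmetic. The function name and the starting revision 500 are hypothetical, and it assumes that every revision in the dumped range is present in the dump file and that nobody commits to the target during the load.

```python
def loaded_revision(old_rev, first_dumped, dest_youngest_before):
    """Map a revision number from the dumped repository to its new number.

    dest_youngest_before is the youngest revision of the target repository
    just before 'svnadmin load' was run.
    """
    if old_rev < first_dumped:
        raise ValueError("revision was not part of the dump")
    return dest_youngest_before + (old_rev - first_dumped + 1)


# Dumped range 2:100 loaded into a target that was at a (hypothetical) r500.
print(loaded_revision(2, 2, 500))    # 501
print(loaded_revision(100, 2, 500))  # 599
```

Comparing the expected last number with the output of ‘svnlook youngest’ on the target after the load is a quick sanity check that nothing was skipped.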
The final step is to sync the local ‘RaGMirror’ repository back to the public Google Code repository. This is akin to what we did in step five of the previous section, with different credentials and with the source and destination URLs changed accordingly. However, before this can be done, the repository at Google Code must be reset to revision 0.
Be careful, as many things can go wrong. Be sure to have backups of every repository you are about to change, and use the information provided here at your own risk. The image below shows the two revisions where both repositories merged seamlessly.