Tuesday 16 February 2016

SVN to Git - Only migrate some branches

I have recently been moving some core repositories from SVN to Git. These repositories typically use the standard layout of trunk, branches, and tags. For a repository with the standard layout there are many tutorials to help; my particular favourite, which I have followed for most of our repositories, is this one from Atlassian.

However, when I went to move our final and largest repository, I realised that it included over 40 legacy branches and 90 legacy tags. Many of these hadn't been needed for a number of years. Instead of moving them and cluttering up the new Git repository, I decided to move only the branches and tags that are currently active. Unfortunately, this requires a different recipe from the one in most online tutorials. The steps to move only selected branches and tags are outlined below:

Create an authors file as normal

An authors.txt file maps each username used for SVN commits to a name and email address in the Git repository. It can be generated with the svn-migration-scripts.jar from Atlassian, or with the alternative script below, which is run inside an SVN checkout.

svn log -q | awk -F '|' '/^r/ {sub("^ ", "", $2); sub(" $", "", $2); print $2" = "$2" <"$2">"}' | sort -u > authors.txt
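To see what this one-liner produces, the sketch below runs the same awk program over a few sample lines in the format printed by `svn log -q`. The helper names `authors_from_log` and `sample_log`, and the usernames in the sample, are invented for illustration.

```shell
#!/bin/sh
# authors_from_log (hypothetical helper): reads `svn log -q` output on stdin
# and prints one "user = user <user>" line per distinct committer.
authors_from_log() {
    awk -F '|' '/^r/ {sub("^ ", "", $2); sub(" $", "", $2); print $2" = "$2" <"$2">"}' | sort -u
}

# sample_log (hypothetical helper): emits invented input in `svn log -q` format.
sample_log() {
    cat <<'EOF'
------------------------------------------------------------------------
r3 | jbloggs | 2016-02-01 10:15:00 +0000 (Mon, 01 Feb 2016)
------------------------------------------------------------------------
r2 | asmith | 2016-01-20 09:00:00 +0000 (Wed, 20 Jan 2016)
------------------------------------------------------------------------
r1 | jbloggs | 2016-01-05 14:30:00 +0000 (Tue, 05 Jan 2016)
------------------------------------------------------------------------
EOF
}

sample_log | authors_from_log
```

Only the lines starting with `r` are used, the second `|`-separated field is trimmed, and `sort -u` removes duplicates, so the sample prints one line for asmith followed by one line for jbloggs.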

Once this file is created, you need to edit it so that the section between the angled brackets contains the email address of each user. For example, change

jbloggs = jbloggs <jbloggs>

into

jbloggs = jbloggs <jbloggs@your-company.com>
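If every user shares the same email domain, the edit can be scripted rather than done by hand. The following is a minimal sketch under that assumption; `add_email_domain` is a hypothetical helper name, and users whose address does not follow the `username@domain` pattern would still need correcting manually.

```shell
#!/bin/sh
# add_email_domain (hypothetical helper): reads authors lines on stdin and
# appends "@<domain>" to every bracketed name that does not contain "@" yet.
# Assumes the domain contains no "/" or "&" characters.
add_email_domain() {
    domain=$1
    sed -E "s/<([^<>@]+)>\$/<\\1@${domain}>/"
}

# Example usage (writes to a new file so the original is kept):
#   add_email_domain your-company.com < authors.txt > authors-with-email.txt
```

Lines that already contain a full address are left untouched, so the function can be re-run safely.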

Clone trunk branch of the repository

Once you have an authors.txt file you can now clone your repository using git-svn. For this method, your first clone should only include the trunk branch.

git svn clone -T trunk --authors-file=authors.txt svn-repo  

Under your .git/config you should now see a new svn section such as:

[svn-remote "svn"]
    url = svn://your/path
    fetch = trunk:refs/remotes/trunk
[svn]
    authorsfile = /path/to/authors.txt

Add branches and tags

After waiting for the initial clone to complete, you can add your branches and tags. To do this you should edit the .git/config file and add them to the svn-remote section.

For branches add them as follows:

    fetch = branches/branch1:refs/remotes/branches/branch1
    fetch = branches/branch2:refs/remotes/branches/branch2
    fetch = branches/branch3:refs/remotes/branches/branch3
    branches = branches/{branch1,branch2,branch3}:refs/remotes/branches/*

For tags, add them as follows:

 tags = tags/{tag1,tag2,tag3}:refs/remotes/tags/*

This should result in a final section such as:

[svn-remote "svn"]
    url = svn://your/path
    fetch = trunk:refs/remotes/trunk
    fetch = branches/branch1:refs/remotes/branches/branch1
    fetch = branches/branch2:refs/remotes/branches/branch2
    fetch = branches/branch3:refs/remotes/branches/branch3
    branches = branches/{branch1,branch2,branch3}:refs/remotes/branches/*
    tags = tags/{tag1,tag2,tag3}:refs/remotes/tags/*

Note: the `fetch =` lines follow the recommendation from here. In some cases they may not be needed.
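With a long list of branches or tags, typing the brace list by hand is error-prone. As a rough sketch, the mapping can be generated from a list of names and written with `git config --add` instead of editing the file directly. `svn_ref_spec` is a hypothetical helper name, and the branch and tag names are the placeholders used above.

```shell
#!/bin/sh
# svn_ref_spec (hypothetical helper): prints a git-svn brace-list mapping,
# e.g. "branches/{a,b}:refs/remotes/branches/*", for the names given.
svn_ref_spec() {
    kind=$1                            # "branches" or "tags"
    shift
    list=$(IFS=,; printf '%s' "$*")    # join the remaining arguments with commas
    printf '%s/{%s}:refs/remotes/%s/*\n' "$kind" "$list" "$kind"
}

# Example usage, run inside the cloned repository:
#   git config --add svn-remote.svn.branches "$(svn_ref_spec branches branch1 branch2 branch3)"
#   git config --add svn-remote.svn.tags "$(svn_ref_spec tags tag1 tag2 tag3)"
```

`git config --add` appends a new value rather than replacing an existing one, so the trunk `fetch` line written by the initial clone is left in place.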

Fetch the tags and branches

git svn fetch

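Once the fetch completes, all of the specified branches and tags should be available. At this stage both are stored as remote refs, so the tags are listed by git branch -r, and git tag only lists them after the clean-up step further down. They can be viewed by running: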

git branch -r
git tag 
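With many refs it is easy to miss one that failed to come through, for example because of a typo in the config. The sketch below compares a list of expected names against the `git branch -r` output; `missing_refs` is a hypothetical helper name, and the ref names are the placeholders used in this post.

```shell
#!/bin/sh
# missing_refs (hypothetical helper): reads `git branch -r` output on stdin
# and prints every expected name (given as arguments) that is not listed.
missing_refs() {
    fetched=$(sed 's/^[[:space:]]*//')    # drop the indentation git adds
    for want in "$@"; do
        printf '%s\n' "$fetched" | grep -Fqx -- "$want" || printf '%s\n' "$want"
    done
}

# Example usage, run inside the cloned repository:
#   git branch -r | missing_refs trunk branches/branch1 branches/branch2 tags/tag1
```

No output means every expected ref was fetched.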

Clean the branches

At this stage in the Atlassian stdlayout tutorial you would run the clean-git command to link the SVN branches to Git branches. However, the jar file provided by Atlassian does not support the branch syntax that we used. A patch is available, but I cannot confirm whether it works because I had issues recompiling the program. As a result, I pieced together the steps required to finish the repository move manually, from the source of the program and various blogs.

To move the tags you can use the following script:

    #!/bin/sh
    # Based on https://github.com/haarg/convert-git-dbic

    set -u
    set -e

    git for-each-ref --format='%(refname)' 'refs/remotes/tags/*' | while read -r r; do
        tag=${r#refs/remotes/tags/}
        sha1=$(git rev-parse "$r")

        commiterName="$(git show -s --pretty='format:%an' "$r")"
        commiterEmail="$(git show -s --pretty='format:%ae' "$r")"
        commitDate="$(git show -s --pretty='format:%ad' "$r")"

        # Print the commit subject and body separated by a newline
        git show -s --pretty='format:%s%n%n%b' "$r" | \
        env GIT_COMMITTER_EMAIL="$commiterEmail" GIT_COMMITTER_DATE="$commitDate" GIT_COMMITTER_NAME="$commiterName" \
        git tag -a -m "Tag: ${tag} sha1: ${sha1} using '${commiterName}', '${commiterEmail}' on '${commitDate}'" "$tag" "$sha1"

        # Remove the refs/remotes/tags/* ref now that a real Git tag exists
        git update-ref -d "$r"
    done

To move the branches you need this script:

    #!/bin/bash

    # Create a local branch for each remaining remote ref (remotes/branches/*)
    for branch in $(git branch -a | grep remotes | grep -v HEAD | grep -v master | grep -v trunk); do
       # Strip the remotes/branches/ prefix to get the local branch name
       xbc=${branch#*remotes/branches/}
       echo "$xbc    --    $branch"
       git branch -f "$xbc" "$branch"

       # required since git v1.8.4
       git config "branch.$xbc.merge" "$branch"
    done
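Both scripts derive the new name by stripping a prefix with shell parameter expansion. The short sketch below isolates that step; the function names are hypothetical and the ref names are the placeholders used earlier.

```shell
#!/bin/sh
# Hypothetical helpers that mirror the ${var#pattern} expansions in the scripts.
local_branch_name() { printf '%s\n' "${1#*remotes/branches/}"; }
local_tag_name()    { printf '%s\n' "${1#refs/remotes/tags/}"; }

local_branch_name remotes/branches/branch1    # prints "branch1"
local_tag_name refs/remotes/tags/tag1         # prints "tag1"
local_branch_name remotes/tags/tag1           # prefix absent: prints "remotes/tags/tag1"
```

The last call shows that a tag ref passes through the branch expansion unchanged. That suggests running the tag script first, so that the refs under remotes/tags are already gone when the branch script lists the remaining remote refs.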

Share your code

Your local Git repository is now following the upstream SVN repository, and you can share it with colleagues by pushing to a central server (e.g. GitHub, Bitbucket). You can use standard Git commands to do this:

git remote add origin <server>
git push -u origin --all
git push --tags


If you are able to close your SVN repository, you can submit all future commits to the new Git repository. However, if you have to keep the SVN repository open for commits for a period of time, you will need to sync commits from SVN to Git.
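One possible shape for that sync is sketched below. It is an assumption rather than the procedure I used: it updates only the branch that is currently checked out, expects a clean working tree, and assumes the remote is called origin as above. The helper names `run` and `sync_from_svn` and the `DRY_RUN` variable are invented for the example.

```shell
#!/bin/sh
# run (hypothetical helper): print the command when DRY_RUN=1, else execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# sync_from_svn (hypothetical helper): bring new SVN revisions into Git.
sync_from_svn() {
    run git svn fetch             # fetch new revisions for all configured refs
    run git svn rebase            # move the current branch onto its SVN upstream
    run git push origin --all     # publish the updated local branches
}

# Example usage:
#   DRY_RUN=1 sync_from_svn    # show the commands without running them
#   sync_from_svn              # run them inside the cloned repository
```

Other local branches would need the same treatment, by checking each one out and running `git svn rebase`, before they reflect new SVN commits.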
