Combining multiple git repositories into subdirectories of a single repository

In a few projects I worked on, the legacy code base was split into multiple git repositories even though they were interdependent. What I mean is that to build the microservice deployable, you need to build the libraries contained in these repositories in a specific sequence. In addition, these libraries are not used anywhere except in the microservice. The worst culprit I encountered was a microservice that was split into six different git repositories: five Java libraries and one Java WAR file. It makes a developer’s life much easier if these six repositories are structured as a single Maven multi-module project in a single git repository.

Obviously, you can just copy all the code to one directory and commit that. However, git allows you to copy the content from one repository into another, preserving the commit history at the same time. No one wants to lose all those change histories, right?

As an example, say the code base for project x is split into three repositories called project-x-client, project-x-core and project-x-ws. You want to combine them all in project-x-ws, with the content of each original repository in its own subdirectory. (Each subdirectory will later be converted into a Maven module.) After the migration, you want the code from project-x-client to be in a subdirectory called client under project-x-ws, the code from project-x-core in core, and the original project-x-ws code in webservice.

First, you need to move all the files in project-x-core into a subdirectory called core. By doing this, when you copy all the files from this repository, the files will all be neatly located inside the directory core, instead of being under the root directory. We will use the master branch for the move.

cd /home/me/git-stuff/project-x-core
git checkout master
mkdir core
git mv src core
git mv README.md core
...
git commit -am "preparing project-x-core for migration"

You don’t need to push the change to remote. The copy can be done entirely using the local repository. To copy the project-x-core repository into project-x-ws:

cd /home/me/git-stuff/project-x-ws
git remote add r /home/me/git-stuff/project-x-core
git fetch r
git merge r/master --allow-unrelated-histories
git remote rm r

You have now pulled all the files and their commit histories into the existing repository for project-x-ws. Repeat this for project-x-client, and move the files that were already in project-x-ws into webservice with git mv in the same way. When you finish, push project-x-ws to remote to share this change with your team.
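To try the whole procedure without touching real code, the steps above can be rehearsed on throwaway repositories. This is only a sketch: the scratch directory, the demo identity and the single README.md per repository are assumptions made for illustration, and the merge flag needs a reasonably recent git.

```shell
#!/bin/sh
set -e

# Scratch area and demo identity (both hypothetical, for illustration only).
work=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Create a toy repository with one commit on master.
new_repo() {
    git init -q "$work/$1"
    git -C "$work/$1" symbolic-ref HEAD refs/heads/master
    echo "$1" > "$work/$1/README.md"
    git -C "$work/$1" add README.md
    git -C "$work/$1" commit -q -m "initial commit of $1"
}
new_repo project-x-core
new_repo project-x-ws

# Step 1: move the source repository's files into their future subdirectory.
cd "$work/project-x-core"
mkdir core
git mv README.md core
git commit -q -m "preparing project-x-core for migration"

# Do the same for the files that already live in the destination.
cd "$work/project-x-ws"
mkdir webservice
git mv README.md webservice
git commit -q -m "preparing project-x-ws for migration"

# Step 2: fetch the local source repository and merge its history.
git remote add r "$work/project-x-core"
git fetch -q r
git merge -q --no-edit --allow-unrelated-histories r/master
git remote rm r

# Both histories are now reachable from the destination's master.
merged_commits=$(git rev-list --count HEAD)
echo "commits reachable from master: $merged_commits"
git log --oneline --follow -- core/README.md
```

The final git log --follow call is one way to check that the history of a moved file is still visible after the merge.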

Stop accidental git commits of local dev changes to config files

During development, I often make changes to a few configuration files for local testing. Most of the time, I add each file individually to the staging area so these local config changes aren’t committed. Yesterday, I made a mistake and committed the local config. I’m not sure how it happened, but I must have clicked “commit all tracked files” accidentally. The test server was then built with my local config. Oops.

To stop this from happening again, I did some googling and found this handy git command:

git update-index --assume-unchanged <file>

This tells git to ignore local changes to the specified file until the flag is cleared, all without changing .gitignore, which is a tracked file in the project.
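Because the flag is invisible in normal use, it helps to know how to list flagged files and how to clear the flag again. The following is a small sketch in a throwaway repository; the scratch directory, the demo identity and the file name app.properties are made up for illustration.

```shell
#!/bin/sh
set -e

# Scratch repository and demo identity (hypothetical).
work=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git init -q "$work/demo"
cd "$work/demo"
echo "port=8080" > app.properties
git add app.properties
git commit -q -m "add shared config"

# Flag the file, then make a local-only change.
git update-index --assume-unchanged app.properties
echo "port=9090 # local only" > app.properties

# While the flag is set, git status does not report the change.
status_while_flagged=$(git status --porcelain)

# Flagged files are listed with a lowercase letter by ls-files -v.
flagged=$(git ls-files -v | grep '^h' || true)
echo "flagged: $flagged"

# Clear the flag when the local change should be visible again.
git update-index --no-assume-unchanged app.properties
status_after_clearing=$(git status --porcelain)
echo "after clearing: $status_after_clearing"
```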

Caching Password for Git HTTP/HTTPS connections

I got sick of entering my username and password every time I do a git operation. Luckily, git provides handy options for caching passwords. The safer option is probably to just cache the credentials in memory:

git config --global credential.helper cache

This keeps the password in memory for 15 minutes, which is the default timeout. To permanently save the credentials on disk (in plain text format), use:

git config --global credential.helper store

PS. I chose the latter, insecure, lazy option.
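A possible middle ground is to keep the in-memory cache but lengthen its timeout. The sketch below uses the cache helper's --timeout option with one hour as an arbitrary example value. It points HOME at a scratch directory so that the demonstration does not touch a real ~/.gitconfig; on a real machine you would run only the git config line.

```shell
#!/bin/sh
set -e

# Use a throwaway HOME so the real global config is left alone (demo only).
scratch_home=$(mktemp -d)
export HOME="$scratch_home"
unset XDG_CONFIG_HOME

# Cache credentials in memory for 3600 seconds instead of the default 900.
git config --global credential.helper 'cache --timeout=3600'

# Read the setting back to confirm what was stored.
helper=$(git config --global --get credential.helper)
echo "credential.helper = $helper"
```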

The Strange Default Behaviour of Git Push

It felt odd that it took a year of using git before I ran into this strange push logic. (I’m blaming it on gerrit, where pushes are always done to the review staging area using refs/for/master, instead of directly to origin/master).

My work has recently moved from svn to git. I worked on my features by creating a local branch that tracked changes in remote master:

git checkout -b feature origin/master

However when I tried to push using

git push origin/master

I got a warning along the lines of “push.default is unset”. Git helpfully suggested that I look at ‘git help config’. From the built-in help pages and some googling, I found that a push mode called simple was introduced in 1.7.11 and became the default in Git 2.0. In this mode, git pushes the current branch to its upstream branch, but only if the upstream branch’s name is the same as the local one. Because I always name my local branch after the feature, git can’t push it to the remote server using the default behaviour.

To allow a different local branch name, I need to set the push.default config variable to upstream, which simply pushes the current branch to its upstream branch.

git config --global push.default upstream
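The effect of the setting can be checked against a local bare repository that stands in for the remote server. This is a sketch under assumptions: the scratch paths and the demo identity are invented, and the setting is applied to the scratch clone only (without --global) so that nothing outside the demo changes.

```shell
#!/bin/sh
set -e

# Scratch area and demo identity (hypothetical).
work=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# A bare repository plays the role of the remote server.
git init -q --bare "$work/origin.git"
git init -q "$work/clone"
cd "$work/clone"
git symbolic-ref HEAD refs/heads/master
git remote add origin "$work/origin.git"
echo one > file.txt
git add file.txt
git commit -q -m "first"
git push -q origin master
git fetch -q origin

# Feature branch with a different name, tracking origin/master.
git checkout -q -b feature origin/master
git config push.default upstream

echo two >> file.txt
git commit -q -am "second"

# A plain push now sends feature to its upstream branch, origin/master.
git push -q

local_head=$(git rev-parse HEAD)
remote_head=$(git --git-dir="$work/origin.git" rev-parse master)
echo "local:  $local_head"
echo "remote: $remote_head"
```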

Getting new subversion branches after initial svn-git cloning

Earlier this week, I needed to work on a feature branch in the company’s subversion repository (the one I cloned with git-svn a month ago).

Imagine my surprise when I couldn’t see the feature branch with git branch -r. The command lists the remote-tracking branches, which in a git-svn clone are the imported subversion branches and tags. The branch was created after my initial cloning, and was not pulled down by subsequent runs of git svn rebase.

It turned out that to get subversion branches created after cloning, you need to do a fetch instead.

git svn fetch
git branch -r

Reading the git-svn man page more carefully, I found that rebase only fetches revisions from the SVN parent of the current HEAD. In comparison, fetch fetches unfetched revisions from the tracked Subversion remote.

Using git with a large subversion repository

I recently found myself working on a very large subversion (svn) repository. The repository (possibly) contains the company’s entire code base, spanning nearly 90,000 revisions. Even within the module I’m working on, there are over 300 branches.

As a git convert, my first thought was to try git-svn instead of moving back to svn. I love the ability to create multiple local feature branches. (I also prefer the way git merging works, but git-svn is restricted to the svn merge model.)

For projects with many branches, it is recommended you clone one directory only. Otherwise, the resulting git repository will be many times larger than the original trunk.

git svn clone http://svn.mycompany.com/projectx/iteration123

Partial Cloning

However, I wanted to clone the trunk and a selection of branches. Git-svn provides the options --trunk, --tags and --branches for svn repositories with non-standard layouts. It will also skip checking out paths that match the regular expression given to the option --ignore-paths.

After some experimentation, I found it easier to create an empty git repository, manually edit .git/config, and then fetch, instead of using the command line options to clone.

git svn init http://svn.mycompany.com/projectx

Edit .git/config

[svn-remote "svn"]
url = http://svn.mycompany.com/projectx
fetch = trunk:refs/remotes/trunk
branches = branches/{iteration123}:refs/remotes/branches/*

The branches entry must contain a wildcard, or git-svn will complain with an error message along the lines of ‘glob needed’. I used the curly braces to fetch the branch iteration123 only. Multiple branches can be specified comma separated, like {iteration123,iteration124}.
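If you would rather not edit .git/config by hand, the same keys can be written with git config. This is a sketch, not what I did at the time: it uses a plain git init in a scratch directory in place of git svn init so that it runs without a subversion server, and iteration124 is only an example of a second branch.

```shell
#!/bin/sh
set -e

# Scratch repository standing in for the one created by "git svn init".
work=$(mktemp -d)
git init -q "$work/projectx"
cd "$work/projectx"

# Same entries as the hand-edited [svn-remote "svn"] section.
git config svn-remote.svn.url http://svn.mycompany.com/projectx
git config svn-remote.svn.fetch trunk:refs/remotes/trunk
# Quote the value so the shell does not expand the braces or the wildcard.
git config svn-remote.svn.branches 'branches/{iteration123,iteration124}:refs/remotes/branches/*'

# Read the branches entry back to confirm it was stored literally.
branches_spec=$(git config --get svn-remote.svn.branches)
echo "branches = $branches_spec"
```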

I also restricted the fetch to recent revisions by using the -r option:

git svn fetch -r 80000:HEAD

Workflow

After the initial cloning hurdle, I found git-svn very straightforward to use.

At the start of each day, I merged changes from svn by using

git svn rebase

I committed changes to my local git repository with

git commit (-a)

Git-svn pushes each local commit as a separate commit to the remote svn. However, I committed very often and I really didn’t fancy polluting the svn log with lots of ‘reverting’, ‘trying x approach’. Therefore, I squashed all my commits for a scrum task into one commit before pushing the changes back to svn trunk.

git rebase -i branches/iteration123
git svn dcommit
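The interactive rebase opens an editor, so it is awkward to show in a script. As an illustration of the squashing idea only, the sketch below swaps in a different technique, git reset --soft followed by a single commit, which produces one commit containing all the changes made since a base commit. The scratch repository, the demo identity and the commit messages are invented, and the base commit merely stands in for the last revision that came from svn.

```shell
#!/bin/sh
set -e

# Scratch repository and demo identity (hypothetical).
work=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git init -q "$work/demo"
cd "$work/demo"

# The base commit stands in for the last revision fetched from svn.
echo "base" > notes.txt
git add notes.txt
git commit -q -m "base"
base=$(git rev-parse HEAD)

# Several small work-in-progress commits for one task.
for step in "trying x approach" "reverting" "final fix"; do
    echo "$step" >> notes.txt
    git commit -q -am "$step"
done

# Move the branch back to the base but keep all changes staged,
# then record them as a single commit.
git reset -q --soft "$base"
git commit -q -m "Complete scrum task"

commits_since_base=$(git rev-list --count "$base"..HEAD)
echo "commits since base: $commits_since_base"
```

In the real workflow, git svn dcommit would then send that single commit to subversion.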