Friday, July 10, 2009

Git and CVS

In my last post I talked a little about my experience starting to learn to use Git. Now I'm going to talk about what I'm doing with it currently in an environment where Git can't be used as the end-to-end version control system (at least not immediately).

The Situation

At work we're using CVS for a legacy project that I'm working on. It's been around a while and the practices with it are pretty set. Branches are used for release versions of the product and that's about it. I tend to have a feature or two I'm working on and I also occasionally do something experimental. The trouble is, getting changes related to separate things mixed together can make it confusing to know what needs to be committed and what doesn't when it comes time to commit something (committing early and often would help with this, but it's not quite how things are done). Especially problematic would be if changes related to different things affected the same file.

To deal with this, I'd sometimes check out a separate copy of the project into another directory and work on something specific there. I'm programming in Java using Eclipse, and in addition to the issue of keeping track of multiple complete copies of the project, I'd have to switch workspaces or open up another copy of Eclipse to work with those copies. This wasn't very efficient.

The Solution

When I started looking into Git and experimenting with it, I realized that I could probably use it locally with a totally different VCS acting as the central repository. Git stores all its data in a single folder (.git) in the root of the working directory, which means that its files aren't mixed into your entire directory structure the way they are with CVS, which puts a CVS subdirectory in every directory.

Here's the process I used to get this project set up for work with Git:
  • Check out a clean copy of the project into a new directory to use.
  • Get the environment all set up for normal use in Eclipse.
  • Do git init to create the Git repository at the root of the project.
  • Create the .gitignore file and set it to ignore compiled code output folders, among other things. (Note: It should also ignore all CVS folders.)
  • Do git add . to stage everything that isn't ignored, making sure you aren't picking up anything you don't want.
  • Do the initial git commit.
  • From there, use Git normally... create a development branch, and branches off that for features, experiments, etc.
  • When you're ready to commit something to CVS, merge the changes all the way down to the master branch and commit in CVS.
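
The steps above could look roughly like this on the command line. This is only a sketch: the temporary directory, the file names, the bin output folder, and the branch names development and feature-x are all made up for illustration, and a fake CVS folder stands in for a real checkout so the script runs without CVS installed.

```shell
#!/bin/sh
set -e

# Hypothetical stand-in for a clean CVS checkout of the project.
work=$(mktemp -d)
cd "$work"
mkdir -p src/CVS bin
echo 'class Main {}' > src/Main.java
echo 'placeholder' > src/CVS/Entries     # CVS bookkeeping file
echo 'compiled' > bin/Main.class         # compiled output

# Create the Git repository at the root of the project.
git init -q
git config user.email "dev@example.com"  # identity for this repo only
git config user.name "Dev"

# Ignore CVS folders and compiled output (folder names are assumptions).
printf 'CVS/\nbin/\n' > .gitignore

# Stage everything that isn't ignored, review it, then commit.
git add .
git status --short
git commit -q -m "Initial import from CVS checkout"

# The initial branch is "master" in older Git setups; record whatever it is.
trunk=$(git symbolic-ref --short HEAD)

# Branch off for development, then for a feature.
git checkout -q -b development
git checkout -q -b feature-x
echo '// feature work' >> src/Main.java
git commit -q -am "Work on feature x"

# Merge all the way back down before committing to CVS.
git checkout -q development
git merge -q feature-x
git checkout -q "$trunk"
git merge -q development
# At this point the real workflow would commit to CVS from here.
```

In a real checkout only the git commands matter; the lines that create files are just there to make the sketch self-contained.
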
When I switch branches, I just refresh the project in Eclipse and it immediately reflects the branch I'm now on. Since the files that exist may differ between branches, I also like to use tasks in Mylyn. I have tasks that are specific to whatever I'm doing on a specific branch, and I activate them while working on them. I'd done a little of that before, but it's even more useful in this situation. If I'm working on some files and need to switch to another branch that doesn't have them, I just deactivate the task and the editor tabs go away. I switch branches, do whatever I need, and then switch back and reactivate the task. Despite the fact that the files ceased to exist on the file system for a while, when the task is reactivated the files are right back there like before!

This approach has made it far easier and more fun to work on multiple separate things at once, and it's easy to do even though the central repository is the ancient CVS! Thanks for being awesome, Git!

Update:

How you interact with CVS when doing this is important. First, the .gitignore file should be set to ignore all CVS folders! You don't want changes to the files in the CVS folders to have to be committed to your Git repository. You want to ignore the fact that you're also using CVS as much as possible. Given that, you should also only ever update from CVS or commit to CVS from your master branch! Things can get really weird if you update or commit in a branch you've created.
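
One way to make the "only touch CVS from master" rule harder to break is a small wrapper function. This is just a sketch I'd consider, not something built into Git or CVS: the name cvs_on_master is made up, and it assumes the trunk branch is called master.

```shell
# Run a CVS command only when the Git working tree is on master.
# (cvs_on_master is a hypothetical helper name.)
cvs_on_master() {
    current=$(git symbolic-ref --short HEAD)
    if [ "$current" != "master" ]; then
        echo "Refusing to run cvs on branch '$current'; switch to master first." >&2
        return 1
    fi
    cvs "$@"
}
```

With that defined, cvs_on_master update -dP or cvs_on_master commit -m "message" would go through on master and refuse to run anywhere else.
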

Wednesday, July 8, 2009

Learning Git

One thing that I've been realizing more and more recently is the importance of keeping up with advancements in developer tools. I'm a strong believer that if you aren't using the best tools available for the job you're doing, you aren't being as effective as you could be, regardless of how much effort you put into it. Plus, as a lazy programmer I certainly don't want to put more effort into something when a better tool could save me from that.

Up until recently, I'd only used CVS and Subversion for source control... CVS at work and a little in college, and Subversion at home because I knew it was more recent than CVS. As I started to pay more attention to what I was seeing around the web on blogs and Hacker News, I began to notice a lot of positive references to Git. It seemed a little intimidating to me, at first... largely command line based, with no full-featured shell integration like Tortoise for Windows? I didn't know about that... I like just right-clicking on files for diffs, commits, etc.

Tales of the wonders of branching in Git were probably what brought me around to finally try it out this weekend... well, that and GitHub, which is awesome. Anyway, on sitting down and starting to learn it I discovered that it really wasn't hard at all to pick up. A quick git init, git add . and git commit and I had a small existing project that I hadn't put in version control yet committed to a Git repository. And from there, pushing it to a repo on GitHub was incredibly easy as well. I then spent some time experimenting with branching and merging and all that fun stuff. I'll admit that I certainly still don't know all the intricacies of it, but the basic process was as easy and wonderful and rainbows as advertised.
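
For reference, the whole sequence is only a handful of commands. In this sketch a local bare repository stands in for the GitHub repo so that it runs without a network connection or an account; with GitHub, the remote would instead be the URL shown on the repository page. The directory and file names are made up.

```shell
#!/bin/sh
set -e

base=$(mktemp -d)

# Stand-in for the remote repo (assumption: with GitHub this is a URL).
git init -q --bare "$base/remote.git"

# A small existing project that isn't under version control yet.
mkdir "$base/project"
cd "$base/project"
echo 'hello' > README

git init -q
git config user.email "dev@example.com"  # identity for this repo only
git config user.name "Dev"
git add .
git commit -q -m "Initial commit"

# Point the repository at the remote and push the current branch.
git remote add origin "$base/remote.git"
branch=$(git symbolic-ref --short HEAD)
git push -q origin "$branch"
```
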

One of the things I really love about Git is the fact that it uses a local repository in addition to a remote one (and that only if you want/need it to). Branches are such a great way of keeping logically separate units of work... well, separate, and that includes experimental changes. It's nice to be able to have experimental branches without them needing to be stored in the central repository that everyone on a project uses. Also great is the concept of the working directory, and how easily and quickly the files in it change as you change branches. Having one directory you can work from with all the different branches of your project has many advantages. I think I'll get into that and some details of how I'm starting to make use of Git at work in my next post, which will probably be related to using Git locally with a central CVS repository.
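
To see the working directory change with the branch, something like the following small experiment works. It's only a sketch with made-up file and branch names (shared.txt, experiment.txt, experiment), run in a throwaway directory.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"  # identity for this repo only
git config user.name "Dev"

# One file that exists on every branch.
echo 'shared' > shared.txt
git add shared.txt
git commit -q -m "Base commit"
trunk=$(git symbolic-ref --short HEAD)

# A file that only exists on the experimental branch.
git checkout -q -b experiment
echo 'try something' > experiment.txt
git add experiment.txt
git commit -q -m "Experimental file"

# Switching branches swaps the files in the same directory.
git checkout -q "$trunk"
ls      # experiment.txt is not listed here
git checkout -q experiment
ls      # experiment.txt is listed again
```
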