Work on an open-source project from anywhere

Latest revision as of 16:41, 6 September 2010

Here are quick and easy steps to set up git to overlay your changes on an open-source project that uses svn.

We'll jump right in. If you want more of an introduction, or if the language just doesn't click, read through the overview, as well as this more-detailed approach, first.

Requirements

Here's the software you need:

git
Any current project accessible via subversion

Make sure you have git configured properly, including assigning your name and defining your editor. It's probably worthwhile to grab the code directly from svn and make sure you can compile it, if you have the time or need.
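As a minimal sketch of that configuration step: the name, email address and editor below are placeholders, not values from this guide. The first line points HOME at a scratch directory so you can try the commands without touching your real settings; drop it when you want the change to stick.

```shell
# Try the settings against a throwaway HOME first; remove this line to
# change your real ~/.gitconfig.
export HOME="$(mktemp -d)"

# Placeholder identity and editor: substitute your own values.
git config --global user.name "Your Name"
git config --global user.email "you@example.com"
git config --global core.editor "vim"

# Show what git will now use.
git config --global --list
```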

Here are our goals:

Set up one repository for interaction with svn:
- Get latest svn code
- Merge differences between new svn code and your overlay
- Commit your changes to svn repository
Set up any number of additional repositories to work on your code at any location

Setting up your own copy of svn

NOTE: I'm going to use ampache in this example, because that's what I wanted to work on.

Create a local git repository of the project's subversion repository, using [git svn]. First, browse to the project's svn repo to find the current "revision number". By default, git will pull down every revision ever made in svn. It's very likely you will actually only want the most recent (or at least fairly recent) revision; use the current revision number to make your decision. Next:

ssh client1
cd ampache_ext
git svn clone -r2451 https://svn.ampache.org/ trunk

We now have a repo of the project under git control, in the new trunk directory. Let's move into it and immediately update it, so we know we have the latest revision:

cd trunk
git svn rebase

You can actually start editing on the project immediately, if you have an itch to scratch. Note that git has placed the code under the default "master" branch; that's going to be good enough for us in this approach.

As you edit, you're only changing local files. git calls these checked-out files your "working tree". git doesn't record anything you do in the working tree until you tell it to, so you're free to abuse it as you see fit. If you want to keep changes you've made, you need to add each file you've changed or created to the git "index" (also called the "cache" or "staging area"). Then, when you're ready, you commit all the indexed changes to the repository in one single commit. So git provides the index as a space in which you can manage your next commit.

Let's say you've changed play.php. Here's how to commit your change to your git repo (but not svn yet):

git add play.php
git commit -m "I updated the play page."
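If you'd like to see the index at work before trying it on real code, here is a small sketch in a scratch repository. It needs no svn; index.php is just a hypothetical second file, and the exported identity is only there so the sketch runs where git isn't configured yet.

```shell
set -e
# Identity for the scratch commits, so this runs even without git config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work="$(mktemp -d)"
cd "$work"
git init -q .

# Two tracked files (index.php is a hypothetical second file).
echo "v1" > play.php
echo "v1" > index.php
git add play.php index.php
git commit -q -m "Initial import."

# Edit both files, but stage only play.php.
echo "v2" > play.php
echo "v2" > index.php
git add play.php

git diff --cached --name-only   # staged for the next commit
git diff --name-only            # modified but not staged

git commit -q -m "I updated the play page."

# index.php is still modified in the working tree and was not committed.
git status --short
```

Only play.php lands in the commit; the edit to index.php stays in the working tree until you add it.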

The index is an extra layer that you don't usually have with cvs/svn. It's totally a benefit, but you can ignore it by doing the add and commit in one step. This adds every modified file that is already in the repo to the index and then commits, all at once ("cvs/svn style"):

git commit -a -m "I changed this to that."

Note that this will not add new files you've created to the repo; you'll need to specifically use [git add] for those.
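A quick sketch of that difference, again in a scratch repository. stream.php is a hypothetical new file, and the exported identity is only there so the commits succeed on an unconfigured machine.

```shell
set -e
# Identity for the scratch commits, so this runs even without git config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work="$(mktemp -d)"
cd "$work"
git init -q .

echo "v1" > play.php
git add play.php
git commit -q -m "Initial import."

# Modify a tracked file and create a brand-new one (hypothetical name).
echo "v2" > play.php
echo "new" > stream.php

# -a stages modified tracked files only; stream.php is left out.
git commit -q -a -m "I changed this to that."
after_commit_a="$(git status --short)"
echo "$after_commit_a"          # stream.php still shows as untracked (??)

# New files need an explicit add before they can be committed.
git add stream.php
git commit -q -m "Add the stream page."
git status --short              # prints nothing: the tree is clean
```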

So over time, you'll have a series of commits. Later, you'll want to get the latest svn changes, and combine them with your changes. Here's what is going to happen:

git saves off your commits, rewinding the repository to the last svn revision you grabbed
git grabs the latest svn changes (from the last svn revision you grabbed to the latest available revision)
git applies those svn changes to your repo - they fit perfectly, because your own commits are out of the way
git replays your commits, one at a time, on top of the latest svn

This is as smooth as it could possibly be. Here's the command - yes it's really this simple:

git svn rebase
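You can watch the same rewind-and-replay behaviour without an svn server. The sketch below uses plain git fetch and git rebase against a local "upstream" repository that stands in for svn; it is an analogy under that assumption, not what git svn runs internally. The repository names, file names and commit messages are all made up.

```shell
set -e
# Identity for the scratch commits, so this runs even without git config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

base="$(mktemp -d)"
cd "$base"

# "upstream" stands in for the project's svn repository.
git init -q upstream
( cd upstream
  git symbolic-ref HEAD refs/heads/master
  echo "r1" > README
  git add README
  git commit -q -m "Upstream revision 1" )

# "overlay" is your working repository with one local commit.
git clone -q upstream overlay
( cd overlay
  echo "mine" > local.txt
  git add local.txt
  git commit -q -m "My local change" )

# Meanwhile, upstream moves ahead.
( cd upstream
  echo "r2" >> README
  git commit -q -a -m "Upstream revision 2" )

# Fetch the new upstream history and replay the local commit on top of it.
cd overlay
git fetch -q origin
git rebase -q origin/master
git log --format=%s
```

The log ends up with the local commit sitting on top of both upstream revisions, which is the shape git svn rebase leaves you with.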

Now, that doesn't mean it will always be easy. If there is any significant change to the same file in both the latest svn changes and your changes, you are obviously going to need to merge the two. This is a basic truth. :>
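Here is a sketch of what such a collision looks like and one way through it, once more with a local "upstream" repository standing in for svn and plain git rebase standing in for git svn rebase. The file contents and the "keep both lines" resolution are made up for the illustration; in real life you decide what the merged file should say.

```shell
set -e
# Identity for the scratch commits, so this runs even without git config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"
export GIT_EDITOR=true   # accept default commit messages without an editor

base="$(mktemp -d)"
cd "$base"
git init -q upstream
( cd upstream
  git symbolic-ref HEAD refs/heads/master
  echo "original" > play.php
  git add play.php
  git commit -q -m "Upstream revision 1" )

git clone -q upstream overlay
( cd overlay
  echo "my version" > play.php
  git commit -q -a -m "My local change" )
( cd upstream
  echo "their version" > play.php
  git commit -q -a -m "Upstream revision 2" )

cd overlay
git fetch -q origin
# The rebase stops because both sides changed the same line.
git rebase origin/master || echo "rebase stopped on a conflict"
git status --short               # play.php is listed as unmerged

# Resolve by editing the file (here: keep both lines), then continue.
printf 'their version\nmy version\n' > play.php
git add play.php
git rebase --continue
git log --format=%s
```

If the conflict looks hopeless, git rebase --abort puts everything back the way it was before the rebase started.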

Committing to the svn repo

Now you are keeping your personal masterpiece updated with the latest changes from the open-source developers. But we want to contribute, right? Well, once you have permission to do so, git makes it pretty easy, once again.

A minor bit of advice: no matter what language or tech you are using, try to extend others' code with grace. Try not to jump into the middle and start refactoring like a banshee. You will pay the piper when it comes time to merge. Try to extend what is there through derived or extension classes, try to keep your code in separate modules, etc. Anything you can do to separate your work will be worth the effort.

If you are like me, you will have a series of commits, some major, some just being fixes for something stupid you did. We all have our egos to deal with - that's partly why we're in this to begin with - so what can we do to clean up before we post everything for the world to see? git to the rescue once again. We're going to rebase our changes again, packaging them up into something more attractive. Viva la rebase!

First, make sure you have grabbed the latest svn revisions, to keep things as simple as possible. See the previous section if you have ADD and can't remember how. You may still collide with changes coming in (which you can handle), but this will reduce the chances.

Next, we're going to check on the history of the repo. We're specifically looking at our most recent set of changes that have not yet been published. We want to find the exact point where they begin. Try this:

git log

You will see a list of commits. Navigate down the list until you find the first local commit for your latest set of changes. Copy the hash that identifies the commit JUST BEFORE your first new commit. That is, the last known commit from svn. Now paste it into this command:

git rebase -i #hash_number#

This is going to work some magic for you. It will pull together all your recent commits and display them, one per line, in your editor. Here's what you get to do in this kind of rebase:

- You can remove a commit (probably not a good idea).
- You can pick a commit (this just keeps the commit as-is).
- And, the most cool of all, you can squash a commit. Squashing a commit combines it with the previous commit.

So, before I drop my cruft on some other kind soul's svn repo, I like to squash all my local commits into one. To do so, replace "pick" with "s" (or "squash") in front of every commit after the first one. Once you save and exit the rebase list, your editor will open again, displaying all the messages for the commits that were squashed together. Now you can rewrite history! What a beautiful thing. Simply delete all the embarrassing commit messages (ha!) and rewrite the message as you need to sound like you knew what you were doing all along. How sweet is that. Once you exit the editor, git will squash the commits into one, and apply the new commit message. You are now a hero, take a bow.
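To rehearse the squash before doing it for real, here is a sketch in a scratch repository. The two editor sessions are replaced by stand-ins so it runs unattended: GIT_SEQUENCE_EDITOR rewrites the commit list with sed (assuming a sed that supports -i), and GIT_EDITOR=true accepts the combined message unchanged. The commit messages and the base_hash variable are invented for the example.

```shell
set -e
# Identity for the scratch commits, so this runs even without git config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work="$(mktemp -d)"
cd "$work"
git init -q .

echo "upstream" > play.php
git add play.php
git commit -q -m "Last commit from svn"
base_hash="$(git rev-parse HEAD)"    # the commit JUST BEFORE your own work

# Three local commits, two of them embarrassing fixes.
for msg in "Add shuffle option" "Oops, fix typo" "Really fix typo"; do
  echo "$msg" >> play.php
  git commit -q -a -m "$msg"
done

git log --oneline "$base_hash"..HEAD   # the commits about to be squashed

# Stand-ins for the two editor sessions:
#  1. the commit list: keep the first "pick", turn the rest into "squash"
#  2. the combined message: accepted unchanged (edit it for real work)
GIT_SEQUENCE_EDITOR="sed -i -e '2,\$s/^pick/squash/'" GIT_EDITOR=true \
  git rebase -i "$base_hash"

git log --oneline
```

After the rebase, the three local commits have collapsed into one that sits directly on top of the base commit.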

The commit should now go smooth as butta. The commit and commit message are already queued up, so this just seals the deal:

git svn dcommit

Set up a second client

We now have powerful direct two-way interaction with someone else's svn repository. The client you have already configured will remain the primary means by which you rebase from and commit to svn. But you're not much of a developer if you're chained to one machine. git makes it very easy to clone your repo at other locations. Let's try it out. For this approach, I'm assuming you have ssh access to the other locations.

ssh client2
cd ampache_ext
git clone git+ssh://client1/ampache_ext/trunk ampache_ext

Wow, seriously that's it? Yep.
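If you want to try the clone step without a second machine, this sketch does the same thing between two local directories. client1_trunk is a hypothetical stand-in for the primary repo on client1; on a real second client you would pass the ssh URL instead of a path.

```shell
set -e
# Identity for the scratch commits, so this runs even without git config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

base="$(mktemp -d)"
cd "$base"

# "client1_trunk" stands in for the primary repo on client1.
git init -q client1_trunk
( cd client1_trunk
  echo "v1" > play.php
  git add play.php
  git commit -q -m "Imported from svn" )

# On the second client you would pass the ssh URL here instead of a path.
git clone -q client1_trunk ampache_ext
cd ampache_ext
git remote -v            # origin points back at the primary repository
git log --oneline        # the full history came along with the clone
```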

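Another bit of advice to ignore at your own peril. :> Always try to push changes on these secondary clients back to the primary repo after each session. You'll go crazy trying to track your changes if you don't. You can DO that with git, if you need to, but you're just making your life more difficult. Be aware that recent versions of git, by default, refuse a push into the branch that is currently checked out on the receiving repo; if your push is rejected for that reason, push to a separate branch on the primary or pull from the primary side instead. Otherwise, committing and pushing is as simple as this: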

git commit -a -m "My changes"
git push

I would make it even easier, by putting this in a commit script that you can fire off in an instant.
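Here is one way such a script could look, as a sketch rather than a finished tool. sync_changes is a hypothetical helper name. To keep the demonstration self-contained, a bare repository stands in for the primary repo, because git by default rejects pushes into a branch that is checked out on the receiving side.

```shell
set -e
# Identity for the scratch commits, so this runs even without git config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

# Hypothetical helper: commit every tracked change and push it in one step.
sync_changes() {
  git commit -q -a -m "${1:-My changes}" && git push -q
}

base="$(mktemp -d)"
cd "$base"

# Build a stand-in primary: a seed repo, published as a bare repository.
git init -q seed
( cd seed
  echo "v1" > play.php
  git add play.php
  git commit -q -m "Imported from svn" )
git clone -q --bare seed primary.git

# The secondary client clones the primary, edits, then syncs back.
git clone -q primary.git client2
cd client2
echo "v2" > play.php
sync_changes "Tweak the play page"

git --git-dir="$base/primary.git" log --oneline   # the new commit arrived
```

Called without an argument, the helper falls back to the generic "My changes" message.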

There are a couple of other tools to mention at this time. On Linux, the command line rules and git is your friend. Elsewhere (read Mac and Windows), I recommend setting up Eclipse and EGit. EGit will provide you with a really nice GUI to help you survive in those hostile environments :>, and Eclipse is actually a good IDE to use there, as well. All IMHO. Wait, that wasn't fair to OS X; git works just fine from the command line there. And the Eclipse solution works just fine on Linux. So, there you have it - choice. Yay.

That's the quick basic rundown; you'll learn the rest as you go. The biggest challenge will be resolving those conflicts, but keep a cool head and you'll be a pro in no time.

Comments welcome on the blog...