Track your changes to an open-source project with git

From Bitpost wiki
Revision as of 16:15, 17 March 2008

Here are quick and easy steps to track your changes to an open-source project.

Once you get a handle on git, you'll find it very powerful. But getting started can be the biggest challenge. Distributed source control requires a different way of thinking than using a central repository. git's terminology is a little bent from what you're used to. Use this guide to get past the first hurdles.

Requirements

Getting git on gentoo:

emacs /etc/portage/package.keywords
dev-util/git
dev-util/tig
dev-util/qgit
emacs /etc/portage/package.use
dev-util/git tk
emerge -Davu git tig qgit

Before you can start making changes to a project, you should get the code and make sure you can compile and run it. In this guide, we assume that the project is currently under Subversion (svn) control. Check out a copy and get it up and running.

The next set of requirements is pretty simple:

  • I need my own branch off of the svn branch.
  • I need the ability to merge the latest svn changes, over time, with the changes in my branch.

Notice how simple this is - we're not even asking to commit anything yet. These are fundamental requirements that any developer on a non-trivial open-source project would have. You need to do this if you are going to bang on an open-source project.

Now let's add one more requirement. I'm going to have this code on at least three different machines, so I need to be able to have a common repository for my changes.

  • ability to work from several locations, pushing/pulling my changes to a central repo

Another basic source control requirement, and we're done.

But try to meet this set of requirements with cvs or svn. No longer simple. Enter git. It was born to do this job. If you're with me so far, and it sounds like we're on the right track, I promise you: step through the rest of this guide and you'll be happy you did. Let's get to it.

Setting up your own branch off svn

NOTE: I'm going to use mythtv in this example, because that's what I wanted to work on.

Create a local git repository of the project's subversion repository, using git-svn. There are two ways to do this. If you do not specify an svn revision number, git will grab the entire history of the project as it is available in the svn repo. WARNING: this may be HUGE for older, bigger projects!

ssh client1
cd my_git_repos
git svn clone http://svn.mythtv.org/svn/trunk/mythtv mythtv
cd mythtv

If you specify a revision number, git will grab just that version, and then we can grab all changes after that version, too. It's probably worth digging into the project history to find a reasonable revision number.

git svn clone -r15502 http://svn.mythtv.org/svn/trunk/mythtv mythtv
cd mythtv

We now have a repo of the project under git control. Simple, eh? git uses branches, and named our svn grab the "master" branch in this repo. See the list of branches:

git branch

Note that you don't want to change anything in the master branch. It will remain a "clean" copy of what's in svn. Now we can easily update our master branch with the latest svn commits at any time. Do this now to make sure you're completely up-to-date:

git svn rebase

Now we create a branch of our own, derived from the svn base. git loves branches!

git checkout -b mybranch
git branch

Now that you have your branch checked out, you can bang away and change anything you want to (except the top-level .git directory).

Eventually you'll want to commit your changes. Time for some terminology (I swear I'll be brief - on a "need to know" basis). As you edit, you're only changing local files. This guide refers to the locally modified files as your "working set" (git's own documentation calls it the "working tree"). git doesn't record anything from your working set until you tell it to - you're free to abuse it as you see fit. If you want to keep changes you've made, you need to add each file you've changed or created to the git "index" (also called the "cache" or "staging area"). Then, when you're ready, you commit all the indexed changes to the repository in one single commit. So git provides the "index" as a space in which you can manage your next commit.

Let's say you've changed the configure script. Here's how to commit the change to your git branch:

git add configure
git commit -m "I updated the configure script."

The index is an extra layer that you don't usually have with cvs/svn. It's a real benefit, but you can bypass it by doing the add and commit in one step. This adds every modified file that git already tracks to the index, then commits, in one step ("cvs/svn style"). Brand-new files still need an explicit git add first:

git commit -a -m "I changed this to that."
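To see the index at work, here is a minimal sketch that stages one of two modified files and commits only that one. It runs in a throwaway directory; the README file, the demo identity, and the temporary paths are assumptions made up for illustration, not part of the mythtv setup.

```shell
#!/bin/sh
# Sketch: two files are modified, but only one is added to the index.
set -e
demo=$(mktemp -d)                        # throwaway sandbox (hypothetical)
cd "$demo"
git init -q repo
cd repo
git config user.name "Demo User"         # placeholder identity for the sandbox
git config user.email "demo@example.com"

echo "v1" > configure
echo "v1" > README
git add configure README
git commit -q -m "Initial import."

# Change both files, then stage only configure.
echo "v2" > configure
echo "v2" > README
git add configure

git diff --cached --name-only            # what the next commit will contain
git diff --name-only                     # modified, but not in the index

git commit -q -m "I updated the configure script."
git status --short                       # README is still waiting in the working set
```

The two diff commands are the quickest way to check what is in the index before you commit.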

So over time, you'll have a series of commits on your branch. Later, you'll want to get the latest svn changes and incorporate them. We're going to use rebase again. Previously, we used it to update master from svn, so let's do that first. Make sure you switch to the svn branch (master) before you do.

git checkout master
git svn rebase

Now it gets interesting. We're going to switch back to our branch, and once again use rebase. To be more specific, rebase takes every commit you've made on your branch since it split from master, REMOVES them from the branch, moves the branch's base to the tip of master, and reapplies your commits on top, one by one. This is good stuff!

git checkout mybranch
git rebase master

git has now reapplied your changes on top of the latest svn. Updating master never produces conflicts, because the latest svn changes were applied directly to the untouched svn base. Any conflicts can only show up here, as your commits are reapplied, so you get to concentrate on what's important: overlaying your modifications on top of the latest svn changes. I told you it was cool!
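If you'd like to watch a rebase happen without touching a real svn server, the sketch below fakes it in a throwaway repo. An ordinary commit on master stands in for "git svn rebase"; the file names and demo identity are hypothetical.

```shell
#!/bin/sh
# Sketch: rebase mybranch onto a master that has moved ahead.
set -e
demo=$(mktemp -d)                        # throwaway sandbox (hypothetical)
cd "$demo"
git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master  # make sure the first branch is named master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "upstream 1" > upstream.txt
git add upstream.txt
git commit -q -m "upstream: r1"

# Our own branch, with one commit of our own.
git checkout -q -b mybranch
echo "my change" > mine.txt
git add mine.txt
git commit -q -m "mybranch: my change"

# Stand-in for "git svn rebase": master gains a new upstream commit.
git checkout -q master
echo "upstream 2" >> upstream.txt
git commit -q -a -m "upstream: r2"

# Replay our commit on top of the new master.
git checkout -q mybranch
git rebase -q master

git log --oneline --graph master mybranch
```

After the rebase, the log shows your commit sitting directly on top of the newest upstream commit, which is the shape you want before pushing or submitting anything.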

We've now handled our first set of requirements. Next, we'll tackle setting up a public git repo. If you don't need that, you might want to skip to Managing your next commit with git, or Submitting your commits with git.

Making a public repo

git wants you to share. You can share your repo using ssh, http or the git protocol (most efficient but requires port 9418 to be available). We'll keep it simple and secure here and use ssh. Assuming you can access your public server with ssh, you can use it for your git public repo.

Next, we take our current repo, with the "master" svn-based branch and the "mybranch" derivative, and clone it into a special repo marked as public. The --bare option tells git that this repo will just track changes from others, but not have its own working set.

ssh server
cd git_public

If the first repo was created on a client, not the publicly-accessible server:

git clone --bare ssh://client1/path/to/git/mythtv mythtv

If the first repo was on the server already, you can just use a local path instead of the ssh url:

git clone --bare /path/to/repo1/mythtv mythtv

Now you have a public shared repository!
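To check what a bare clone actually contains, here is a small local sketch. It uses temporary directories in place of the real server and client paths (repo1 and git_public below are stand-ins), so treat it as an illustration rather than the exact setup.

```shell
#!/bin/sh
# Sketch: make a bare clone of a local repo and inspect it.
set -e
demo=$(mktemp -d)                        # throwaway sandbox (hypothetical)
cd "$demo"

# Stand-in for the first repo, with master and mybranch.
git init -q repo1
cd repo1
git symbolic-ref HEAD refs/heads/master  # make sure the first branch is named master
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "upstream" > upstream.txt
git add upstream.txt
git commit -q -m "upstream: r1"
git checkout -q -b mybranch
echo "my change" > mine.txt
git add mine.txt
git commit -q -m "mybranch: my change"

# Same kind of command as above, with a local path as the source.
cd "$demo"
git clone -q --bare "$demo/repo1" "$demo/git_public"

# The bare repo carries the branches and history, but no checked-out files.
git --git-dir="$demo/git_public" rev-parse --is-bare-repository
git --git-dir="$demo/git_public" branch
ls "$demo/git_public"
```

The listing shows git's internal files (HEAD, refs, objects and so on) at the top level of the directory, which is why nobody edits project files there directly.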

Sharing changes on a public repo

Let's set up a repo on a second client that uses the public repo. Originally, I thought there might be two ways to go: (1) set up just like we did the first client, by starting with a grab of the svn repo and branching it, then connecting it to the public repo; or (2) set it up using the public repo, then connect it to svn. The second method would be nice, as we wouldn't have to hammer the svn repo as much, but unfortunately it didn't work as easily as I had hoped (see How (not) to retrofit a repo connection to svn).

So we start out by setting up client2 just like we did with client1:

ssh client2
cd my_git_repos
git svn clone -r15502 http://svn.mythtv.org/svn/trunk/mythtv mythtv
cd mythtv
git svn rebase
git checkout -b mybranch

Now we can retrofit the dependency on the public repo by editing the .git/config file (don't worry, you're supposed to! and it's easy...). Add this to the bottom:

[remote "origin"]
    url = ssh://server/git_public/mythtv
    fetch = +refs/heads/*:refs/remotes/origin/*
[branch "master"]
    remote = origin
    merge = refs/heads/master
[branch "mybranch"]
    remote = origin
    merge = refs/heads/mybranch
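If you'd rather not hand-edit the file, the same settings can be written with git's own commands. This is a sketch run in an empty throwaway repo so it is safe to try; the URL is the example one from above, and nothing is contacted over the network.

```shell
#!/bin/sh
# Sketch: create the [remote] and [branch] config entries from the command line.
set -e
demo=$(mktemp -d)                        # throwaway sandbox (hypothetical)
cd "$demo"
git init -q client2
cd client2

# Writes remote.origin.url and the default fetch refspec.
git remote add origin ssh://server/git_public/mythtv

# Tell each local branch which remote branch it follows.
git config branch.master.remote origin
git config branch.master.merge refs/heads/master
git config branch.mybranch.remote origin
git config branch.mybranch.merge refs/heads/mybranch

# Show what ended up in .git/config.
git config --get-regexp '^remote\.'
git config --get-regexp '^branch\.'
```

In your real repo you would run only the "git remote add" and "git config" lines from inside the mythtv directory.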

We're all set now. Pull the latest changes from the public repo if you like:

git pull

Over time, as you rebase your svn branches, it is unlikely that all of them will remain synced, and you'll start to get this error when you push to your public repo (which, in git versions of this era, pushes all matching branches by default):

error: remote 'refs/heads/master' is not a strict subset of local ref 'refs/heads/master'.
maybe you are not up-to-date and need to pull first?

That's because rebasing changes the master branches in the various repos so they no longer have common histories. But do we really need to track master in the public repo? No, it's just a copy of the svn repo. So let's kill the master branch in the public repo (yes, sounds scary, but master is just another branch).

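I found that I needed to pull off a little "trick" to do this. You can't just switch away from the master branch in the bare public repo, because it doesn't have a working set to check out into. Instead, directly edit the repo's HEAD file. In a bare repo there is no .git subdirectory; HEAD sits at the top level of the repo directory.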

ssh server
cd git_public/mythtv

Now edit HEAD, changing from:

ref: refs/heads/master

to:

ref: refs/heads/mybranch

Now you can delete master.

git branch # to verify that mybranch is active
git branch -d master
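The HEAD edit can also be done without a text editor: git symbolic-ref rewrites HEAD for you. Here is a sketch of the whole sequence against a throwaway bare repo; the repo1 and public directories are stand-ins for the real client and server paths.

```shell
#!/bin/sh
# Sketch: repoint HEAD in a bare repo, then delete master.
set -e
demo=$(mktemp -d)                        # throwaway sandbox (hypothetical)
cd "$demo"

# Stand-in for the first repo, with master and mybranch.
git init -q repo1
cd repo1
git symbolic-ref HEAD refs/heads/master  # make sure the first branch is named master
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "upstream" > upstream.txt
git add upstream.txt
git commit -q -m "upstream: r1"
git checkout -q -b mybranch
echo "my change" > mine.txt
git add mine.txt
git commit -q -m "mybranch: my change"
git checkout -q master                   # so the bare clone starts out on master

git clone -q --bare "$demo/repo1" "$demo/public"
cd "$demo/public"
cat HEAD                                 # starts as: ref: refs/heads/master

# Equivalent to editing HEAD by hand.
git symbolic-ref HEAD refs/heads/mybranch
git branch                               # mybranch is now the active branch
git branch -d master                     # allowed: master is fully contained in mybranch
cat HEAD
```

The -d delete works here because every commit on master is also reachable from mybranch.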

You can now push to and pull from your public repo to your heart's content.

You're now a git pro. Finally, as the reward for slogging through all this, here's a diagram of what you've got - seriously cool!  :>

                   repo1
            mybranch<->master
           /                 \
public repo                   svn repo
           \                 /
            mybranch<->master
                   repo2

Check out Managing your next commit with git, or Submitting your commits with git, for more juicy stuff.


Managing changes

Two caveats to keeping things running smoothly:

  1. Use git push -f when pushing to the public repo from a client with a fresh svn rebase, to "reset" the public repo.
  2. Before pulling when public has an updated svn rebase, remove all local mybranch commits so that the pull is a fast-forward merge.

So you can rebase to the latest svn commits on any client, but then you should be careful about pushing those changes to public and then pulling them to other clients.
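Here is a sketch of that round trip using throwaway local repos: client1 rebases and force-pushes, and client2 catches up. The reset --hard step on client2 is my assumption of one simple way to get back in sync when client2 has no unpushed work of its own; it discards local commits on the branch, so don't use it if you have any. All paths and names are hypothetical.

```shell
#!/bin/sh
# Sketch: force-push a rebased branch, then resync a second client.
set -e
demo=$(mktemp -d)                        # throwaway sandbox (hypothetical)
cd "$demo"

git init -q client1
cd client1
git symbolic-ref HEAD refs/heads/master  # make sure the first branch is named master
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "upstream 1" > upstream.txt
git add upstream.txt
git commit -q -m "upstream: r1"
git checkout -q -b mybranch
echo "my change" > mine.txt
git add mine.txt
git commit -q -m "mybranch: my change"

# Public bare repo, plus a second client cloned from it.
git clone -q --bare "$demo/client1" "$demo/public"
git clone -q "$demo/public" "$demo/client2"
git remote add origin "$demo/public"

# Stand-in for "git svn rebase" on master, then rebase mybranch.
git checkout -q master
echo "upstream 2" >> upstream.txt
git commit -q -a -m "upstream: r2"
git checkout -q mybranch
git rebase -q master

# A plain push is refused because mybranch was rewritten.
if git push -q origin mybranch 2>/dev/null; then
    echo "plain push succeeded (not expected in this sketch)"
else
    echo "plain push rejected; forcing"
fi
git push -q -f origin mybranch           # "reset" the public repo

# On client2: fetch the rewritten branch and move to it.
cd "$demo/client2"
git fetch -q origin
git reset -q --hard origin/mybranch      # assumption: no local work to keep
git log --oneline
```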