Work on an open-source project from anywhere

From Bitpost wiki

Revision as of 14:01, 6 September 2010

Here are quick and easy steps to set up git to overlay your changes on an open-source project that uses svn.

We'll jump right in. If you want more of an introduction, or if the language just doesn't click, read through the overview, as well as this more-detailed approach, first.

Requirements

Here's the software you need:

Here are our goals:

  • Set up one repository for interaction with svn:
    • Get latest svn code
    • Merge differences between new svn code and your overlay
    • Commit your changes to svn repository
  • Set up any number of additional repositories to work on your code at any location

Setting up your own branch off svn

NOTE: I'm going to use ampache in this example, because that's what I wanted to work on.

Create a local git repository of the project's subversion repository, using [git svn]. First, browse to the project's svn repo to find the current "revision number". By default, git will pull down every revision ever made in svn. It's very likely you will actually only want the most recent (or at least fairly recent) revision; use the current revision number to make your decision. Next:

ssh client1
cd ampache_ext
git svn clone -r2451 https://svn.ampache.org/ trunk

We now have a repo of the project under git control. You can actually start editing on the project immediately, if you have an itch to scratch. Note that git has placed the code under the default "master" branch; that's going to be good enough for us in this approach.

As you edit, you're only changing local files. git refers to the locally modified files as your "working set" (git's own documentation calls this the "working tree"). git doesn't record anything about your working set until you tell it to, so you're free to abuse it as you see fit. If you want to keep changes you've made, you need to add each file you've changed or created to the git "index" (also called the "cache"). Then, when you're ready, you commit all the indexed changes to the repository in one single commit. So git provides the "index" as a space in which you can manage your next commit.

Let's say you've changed play.php. Here's how to commit the change to your git branch:

git add play.php
git commit -m "I updated the play page."
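
To see the index at work, here is a minimal sketch in a throwaway repo. The temp directory, the second file config.php, and the demo identity are assumptions for illustration; none of them come from the ampache tree. Two files are edited, but only play.php is staged and committed.

```shell
# Throwaway repo in a temp directory (assumption: plain git, no svn needed here).
demo=$(mktemp -d)
cd "$demo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"

# Demo-only identity so the commits work on a machine with no git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Starting point: two tracked files.
echo 'v1' > play.php
echo 'v1' > config.php
git add play.php config.php
git commit -q -m "Initial import."

# Edit both files, but stage only one of them.
echo 'v2' > play.php
echo 'v2' > config.php
git add play.php

# What is in the index, waiting for the next commit?
staged=$(git diff --cached --name-only)

git commit -q -m "I updated the play page."

# What is still modified but not committed?
unstaged=$(git status --porcelain)

echo "staged before commit: $staged"
echo "left after commit:    $unstaged"
```

Only play.php goes into the commit; config.php stays modified in the working set until you add it yourself.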

The index is an extra layer that you don't usually have with cvs/svn. It's totally a benefit, and you can ignore it by doing the add and commit in one step. This will add every file that is already in the repo and has been modified to the index, then commit, in one step ("cvs/svn style"):

git commit -a -m "I changed this to that."

Note that this will not add new files you've created to the repo; you'll always need to use [git add] for those. Read this paragraph again until it makes sense.  :>
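
A quick way to convince yourself is a scratch repo like the one sketched below. The temp directory, the new file stats.php, and the demo identity are made up for this example. After the one-step commit, the brand-new file is still untracked.

```shell
# Throwaway repo in a temp directory (assumption: plain git only).
demo=$(mktemp -d)
cd "$demo"
git init -q
git symbolic-ref HEAD refs/heads/master

# Demo-only identity so the commits work on a machine with no git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo 'v1' > play.php
git add play.php
git commit -q -m "Initial import."

# Change a tracked file and create a brand-new one (stats.php is hypothetical).
echo 'v2' > play.php
echo 'new' > stats.php

# "cvs/svn style" one-step commit: picks up play.php, ignores stats.php.
git commit -q -a -m "I changed this to that."

leftover=$(git status --porcelain)
echo "$leftover"
```

The "??" marker in the status output is git's way of saying the file is untracked; a [git add] followed by another commit brings it in.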

So over time, you'll have a series of commits to your branch. Later, you'll want to get the latest svn changes and incorporate them.





To pick up the latest svn changes, we use [git svn rebase]. Make sure you switch to the svn branch (master) first.

git checkout master
git svn rebase

Now it gets interesting. We're going to switch back to our branch, and once again use rebase. To be more specific, rebase takes every commit on your branch that isn't already in the branch you name, sets those commits aside, moves your branch to the tip of the named branch, and then reapplies your commits one at a time on top. This is good stuff!

git checkout mybranch
git rebase master

git has now reapplied your changes on top of the latest svn. You won't get any svn conflicts, because the latest svn changes were applied directly to the original svn base. You get to concentrate on what's important: overlaying your modifications on top of the latest svn changes. I told you it was cool!
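
If you'd like to watch a rebase happen without touching a real svn server, the sketch below fakes it with plain git. The commits labelled "svn r1" and "svn r2" are stand-ins for what [git svn rebase] would bring into master; the file names and demo identity are likewise assumptions.

```shell
# Throwaway repo in a temp directory (assumption: no svn involved).
demo=$(mktemp -d)
cd "$demo"
git init -q
git symbolic-ref HEAD refs/heads/master

# Demo-only identity so the commits work on a machine with no git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# master plays the role of the clean svn copy.
echo 'base' > upstream.txt
git add upstream.txt
git commit -q -m "svn r1"

# Our own work lives on mybranch.
git checkout -q -b mybranch
echo 'mine' > mine.txt
git add mine.txt
git commit -q -m "My change."

# Meanwhile "svn" moves on (stand-in for: git checkout master; git svn rebase).
git checkout -q master
echo 'newer' >> upstream.txt
git commit -q -a -m "svn r2"

# Replay our commits on top of the updated master.
git checkout -q mybranch
git rebase -q master

fork_point=$(git merge-base mybranch master)
master_tip=$(git rev-parse master)
ahead=$(git rev-list --count master..mybranch)
echo "mybranch now forks from master's tip: $fork_point"
echo "commits of ours on top: $ahead"
```

After the rebase, mybranch forks from the very tip of master, with our single commit sitting on top of it.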

We've now handled our first set of requirements. Next, we'll tackle setting up a public git repo.

Making a public repo

git wants you to share. You can share your repo using ssh, http or the git protocol (most efficient but requires port 9418 to be available). We'll keep it simple and secure here and use ssh. Assuming you can access your public server with ssh, you can use it for your git public repo.

Next, we take our current repo, with the "master" svn-based branch and the "mybranch" derivative, and clone it into a special repo marked as public. The --bare option tells git that this repo will just track changes from others, but not have its own working set.

ssh server
cd git_public/mythtv_svn

If the first repo was created on a client, not the publicly-accessible server:

git clone --bare ssh://client1/path/to/git/trunk trunk

If the first repo was on the server already, you can just use a local path instead of the ssh url:

git clone --bare /path/to/repo1/trunk trunk

Now you have a public shared repository. You can now change mybranch and push to and pull from your public repo to your heart's content.
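
Here is a small local sketch of what [git clone --bare] produces, assuming a scratch "work" repo in a temp directory in place of the real trunk, and a demo identity for the commit. It checks that the clone is bare and that both branches came across.

```shell
# Scratch area (assumption: everything local, no ssh involved).
demo=$(mktemp -d)
cd "$demo"

# Demo-only identity so the commit works on a machine with no git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# A stand-in for the first repo, with master and mybranch.
git init -q work
cd work
git symbolic-ref HEAD refs/heads/master
echo 'hello' > README
git add README
git commit -q -m "Initial import."
git branch mybranch
cd ..

# The "public" repo: a bare clone using a local path.
git clone -q --bare work public.git

is_bare=$(git --git-dir=public.git rev-parse --is-bare-repository)
branches=$(git --git-dir=public.git for-each-ref --format='%(refname:short)' refs/heads)
echo "bare: $is_bare"
echo "$branches"
```

A bare clone copies the branches straight into its own refs/heads, which is why both master and mybranch show up in the public repo.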

Sharing changes on a public repo

We're now sharing two branches on the public repo. Do we need to share the svn branch? We could, but things will get complicated fast. It is a big step to get the latest svn code and rebase your changes to it. We really don't want to have to go through that on every repo we have. The more efficient way is to perform the rebase on one repo, and force-push it through to every other repo. Note: if you're not convinced, check out some of the difficulties I had trying to sync the svn branch.

So we can kill the master branch in the public repo (yes, sounds scary, but master is just another branch). You'll need to pull off a little trick to do this. You can't just check out a different branch in the bare public repo, because it doesn't have a working set. Instead, directly edit the repo's HEAD file. (A bare repo has no .git subdirectory; HEAD sits at the top level of the repo directory.)

ssh server
cd git_public/mythtv_svn/trunk

Now edit HEAD, changing from:

ref: refs/heads/master

to:

ref: refs/heads/mybranch

You've just manually changed the active branch on a 'bare' repository. Now you can delete master.

git branch # to verify that mybranch is active
git branch -d master
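
If you'd rather not edit HEAD by hand, [git symbolic-ref] writes the same line for you. The sketch below tries it on a scratch bare repo; the temp directory, repo names, and demo identity are assumptions for illustration.

```shell
# Scratch area (assumption: local stand-ins for the real repos).
demo=$(mktemp -d)
cd "$demo"

# Demo-only identity so the commit works on a machine with no git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# A working repo with master and mybranch, then a bare clone of it.
git init -q work
cd work
git symbolic-ref HEAD refs/heads/master
echo 'hello' > README
git add README
git commit -q -m "Initial import."
git branch mybranch
cd ..
git clone -q --bare work public.git

cd public.git
# Point HEAD at mybranch (same effect as editing the HEAD file).
git symbolic-ref HEAD refs/heads/mybranch
# master is no longer the active branch, so it can go.
git branch -q -d master

head_ref=$(git symbolic-ref HEAD)
remaining=$(git for-each-ref --format='%(refname:short)' refs/heads)
echo "HEAD -> $head_ref"
echo "branches left: $remaining"
```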

Set up a second client

Let's set up a repo on a second client that uses the public repo. We start out by setting up client2 just like we did with client1:

ssh client2
cd my_git_repos/mythtv_svn
git svn clone -r15502 http://svn.mythtv.org/svn/trunk trunk
cd trunk
git svn rebase
git checkout -b mybranch

Now we connect it to the public repo. We retrofit the dependency on the public repo by editing the .git/config file (don't worry, you're supposed to! and it's easy...). It should look something like this:

[core]
	repositoryformatversion = 0
	filemode = true
	bare = false
	logallrefupdates = true
[svn-remote "svn"]
	url = http://svn.mythtv.org/svn/trunk
	fetch = :refs/remotes/git-svn

Add this to the bottom:

[remote "origin"]
	url = server.com:/git_public/mythtv_svn/trunk
	fetch = +refs/heads/*:refs/remotes/origin/*
[branch "mybranch"]
	remote = origin
	merge = refs/heads/mybranch

We're all set now. Pull the latest changes from the public repo if you like:

git pull
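
The same wiring can also be done from the command line instead of a text editor. This is a sketch run in an empty scratch repo (an assumption, so nothing real gets touched); the URL is copied from the config fragment above and is never contacted here.

```shell
# Empty scratch repo (assumption: stands in for the client2 trunk).
demo=$(mktemp -d)
cd "$demo"
git init -q

# Writes the [remote "origin"] section, including the default fetch line.
git remote add origin server.com:/git_public/mythtv_svn/trunk

# Writes the [branch "mybranch"] section.
git config branch.mybranch.remote origin
git config branch.mybranch.merge refs/heads/mybranch

origin_url=$(git config --get remote.origin.url)
origin_fetch=$(git config --get remote.origin.fetch)
branch_merge=$(git config --get branch.mybranch.merge)
echo "url:   $origin_url"
echo "fetch: $origin_fetch"
echo "merge: $branch_merge"
```

Either route ends with the same entries in .git/config, so pick whichever you find easier to script.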

Rebasing from SVN across your repos

When you're ready to grab the latest SVN changes and rebase your code on top, make sure your working repos are all committed and updated. Then sync each of them to the same SVN revision:

ssh client1
git checkout master
git svn rebase
ssh client2
git checkout master
git svn rebase

Now you're ready to rebase. Note that you will have to work through any conflicts during the last step.

ssh client1
git checkout mybranch
git rebase --interactive master

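Once you're happy with the rebase, force-push the result to the public repository. We have to force it because the rebase rewrote mybranch's history on top of a new SVN head, so the public repo would reject an ordinary push as a non-fast-forward.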

git commit -a 
git push -f
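
To see why the -f is needed, here is a local sketch with a scratch bare repo standing in for the public server. The paths and demo identity are assumptions, and an amended commit stands in for the rebase, since both rewrite history that was already pushed.

```shell
# Scratch area (assumption: local bare repo instead of the ssh server).
demo=$(mktemp -d)
cd "$demo"

# Demo-only identity so the commits work on a machine with no git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q --bare public.git

git init -q client1
cd client1
git symbolic-ref HEAD refs/heads/mybranch
echo 'one' > file.txt
git add file.txt
git commit -q -m "My change."
git remote add origin "$demo/public.git"
git push -q origin mybranch

# Rewrite the already-pushed commit (stand-in for: git rebase master).
git commit -q --amend -m "My change, replayed."

# An ordinary push is refused, because the public branch can't fast-forward.
if git push -q origin mybranch 2>/dev/null; then
    plain_push=accepted
else
    plain_push=rejected
fi

# The forced push replaces the public branch with our rewritten one.
git push -q -f origin mybranch

local_tip=$(git rev-parse mybranch)
remote_tip=$(git --git-dir="$demo/public.git" rev-parse mybranch)
echo "plain push: $plain_push"
echo "public repo matches client1: $local_tip = $remote_tip"
```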

Now pull to the other working repositories. First, we flush our changes in mybranch, since we'll be repulling them. This is essential, or git will try to preserve two sets of the same changes. Yuck. Be careful here: you're basically trashing ALL your local work in mybranch, so make sure that's what you want!  :>

ssh client2
git checkout mybranch
git reset --hard master   # mybranch now looks like master!

Now pull. I've found that the [--rebase] option reduces chances of conflicts.

git pull --rebase

If you broke the rule about keeping your repos in sync before attempting this, it will leave you with a mess of conflicts. It happens. But it's easy enough to reset. Make sure client2 doesn't have any new changes, because we're going to force it to match client1 now. We'll back up the config file to preserve our branch connections:

cp .git/config .git/config_backup
git checkout mybranch
git reset --hard
git checkout master
git branch -D mybranch
git branch mybranch
cp .git/config_backup .git/config
git checkout mybranch
git pull --rebase

Setting up a third client at a later date

You can easily set up another client at a later date. Just follow the same directions for setting up the second client, and then rebase across your repos. The best part is that you can pull down only the newest svn revision(s) - you don't have to pull down all the revisions on the other clients. As long as all the clients are synced up to the same "most-recent" revision, they'll sync perfectly to each other. Sweet!

You can use this strategy to clean out older SVN revisions if you've been pulling down from SVN for a while. Just re-set up an existing client as if it were new, following the instructions above.

Result

You're now a git pro. You're ready to script all those steps and get up to top speed. Finally, as the reward for slogging through all this, here's a diagram of what you've got - seriously cool!  :>

                   repo1
            mybranch<->master
           /                 \
public repo                   svn repo
           \                 /
            mybranch<->master
                   repo2

Comments welcome on the blog...