pg TUTORIAL
===========

INTRODUCTION
------------

pg is a set of tools for managing a collection of patches. pg is a
GIT porcelain (a higher-level user interface than the raw GIT command
set).

MAJOR FEATURES
--------------

- Maximum compatibility with other GIT porcelains.
- Simplified command line user interface.
- Preserves change history of patches.
- Mixing and matching of changes (bug fixes/features/whatever).
- Fast!(ish)

CLONING AN EXISTING REPOSITORY
------------------------------

Before working with any GIT repository you should create your own
local clone of it so you can edit and test your changes in isolation:

------------------------------------------------
$ pg-clone /source/path mydir
------------------------------------------------

This will copy the repository currently located in directory
`/source/path` to the directory `mydir`. It will also arrange for a
remote entry called `origin` to be set up for `/source/path`, allowing
you to pull updated revisions when they are made available.

[NOTE]
Advanced GIT users: no special setup or initialization step is
required for pg. pg will automatically perform any setup tasks when
it is first required to do so. Therefore you are free to construct a
clone in whatever method suits you, or to just start using pg on an
existing repository. Using `pg-clone` isn't required.

MAKING CHANGES
--------------

Before you start making any changes to files you should use `pg-new`
to create a patch to contain those changes. The name you pass to
`pg-new` should be a short (<100 character) description of the
changes you intend to make. (Don't worry, you can easily change this
name in the future if you don't like it.)

------------------------------------------------
$ pg-new My-First-Change
------------------------------------------------

At this point the patch `My-First-Change` will become the top most
applied patch.
You can check this with `pg-series`:

------------------------------------------------
$ pg-series
+> My-First-Change
------------------------------------------------

The '+' indicates the patch `My-First-Change` is currently applied to
this working directory. The '>' indicates that `My-First-Change` is
the top most applied patch, which is where changes are always
recorded.

At this point you should feel free to start changing files. Use any
editor/tools you like or need to use.

------------------------------------------------
(assume the following edit sequence below)
$ rm src/big-nasty-bug.c       # delete bug
$ vi src/hello.c               # fix code
$ cp ~/new.c src/new.c         # add a file
------------------------------------------------

At any point you can see which files you have modified with
`pg-status`:

------------------------------------------------
$ pg-status
M src/hello.c
g src/big-nasty-bug.c
x src/new.c
------------------------------------------------

In the example above the file `src/hello.c` has been locally
modified: it has changes which still need to be recorded in the
patch. `src/big-nasty-bug.c` is gone; we must have fixed the bug we
are working on by deleting this file. `src/new.c` is an extra file,
that is, pg doesn't know about this file yet.

Assuming we want to keep `src/new.c` as part of our project we can
add it with `pg-add`:

------------------------------------------------
$ pg-add src/new.c
------------------------------------------------

and assuming we really want to remove `big-nasty-bug.c`:

------------------------------------------------
$ pg-rm src/big-nasty-bug.c
------------------------------------------------

but I actually prefer using the '-a' switch with these commands as it
means that I don't have to enter the file paths. (Note that using the
-a switch implies that your .gitignore file(s) are set up properly and
that gone files really should be gone.)

------------------------------------------------
$ pg-rm -a       # delete all 'g' (gone) files
$ pg-add -a      # add all 'x' (extra/new) files
------------------------------------------------

Now I'm ready to record my changes with `pg-ci`:

------------------------------------------------
$ pg-ci
------------------------------------------------

With no parameters `pg-ci` will open your editor ($VISUAL, $EDITOR or
vi) and allow you to type in a commit message. This message will be
recorded with all file modifications made.

You can also use the -m switch to pass the message on the command
line and bypass the editor:

------------------------------------------------
$ pg-ci -m"Fixed feature wizbang." -m"It was broken when ..."
------------------------------------------------

Each occurrence of the -m switch will place its argument into a new
paragraph within the commit message. Messages supplied on the command
line will be line-wrapped automatically by `pg-ci`.

VIEWING/RENAMING PATCHES
------------------------

To view the patches currently known to pg use `pg-series`:

------------------------------------------------
$ pg-series
+ WhosWho
+> BugReports
- StopBuild
- HeyYou
------------------------------------------------

Here we have 4 patches:

- The first two (WhosWho and BugReports) are currently applied (hence
  the '+' prefix). This means the changes contained within these
  patches are currently present in our working directory.
- BugReports is the top-most applied patch (hence the '>' in front of
  it). Any changes committed through `pg-ci` (or any other reasonable
  GIT porcelain) will be added to that patch.
- Two other patches (StopBuild and HeyYou) are not currently applied
  (hence the '-' prefix). These patches have changes stored within
  this repository, but those changes are not currently part of this
  working directory.
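
If you want to script against the markers described above, a small filter is enough. The following is only a sketch: the helper names (`top_patch`, `applied_patches`, `unapplied_patches`) are made up for this tutorial and are not part of pg, and it assumes `pg-series` prints one patch per line as a marker, some whitespace, then the patch name. The sample text is the listing from the example above; in real use you would pipe `pg-series` into the helpers instead.

```shell
# Hypothetical helpers (not part of pg): classify patches from
# `pg-series`-style output read on stdin.
# Assumption: one patch per line, "<marker> <name>", marker is +, +> or -.

top_patch() {
    # The '+>' marker flags the top most applied patch.
    awk '$1 == "+>" { print $2 }'
}

applied_patches() {
    # Both '+' and '+>' mean the patch is applied.
    awk '$1 == "+" || $1 == "+>" { print $2 }'
}

unapplied_patches() {
    # '-' means the patch is stored but not applied.
    awk '$1 == "-" { print $2 }'
}

# Sample listing copied from the example above.
sample='+ WhosWho
+> BugReports
- StopBuild
- HeyYou'

printf '%s\n' "$sample" | top_patch            # BugReports
printf '%s\n' "$sample" | applied_patches      # WhosWho and BugReports
printf '%s\n' "$sample" | unapplied_patches    # StopBuild and HeyYou
```

Matching on the first whitespace-separated field keeps the filter independent of how many spaces follow the marker, which this sketch does not try to guess.
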

To rename an existing patch use `pg-rename`:

------------------------------------------------
$ pg-rename WhosWho Fix-Credits
$ pg-series
+ Fix-Credits
+> BugReports
- StopBuild
- HeyYou
------------------------------------------------

PUSHING/POPPING PATCHES
-----------------------

Each patch should be a single logical change to your project. For
example:

- one patch per bug fix
- one patch per new feature
- one patch per feature enhancement
- one patch for each non-critical change
- one patch for each critical (or possibly critical) change

Why is this good?

I often need to be able to mix-and-match current ongoing bug fixes
and feature changes as the work day progresses. Or I get asked by
management to selectively include some changes into the next build
being sent to testing while excluding other (perceived) higher-risk
changes until a future date.

If I keep each logical change in its own patch I can easily add or
remove *that single change* from my work directory with `pg-push` and
`pg-pop`. But if the changes were all made in the same patch I can't
do this.

Given this, what does a typical workday look like?

------------------------------------------------
$ pg-new Spiffy-Risky-Feature
$ ...hack hack...
$ pg-ci -m"Cool widgets dude."
$ ...hack hack...
$ pg-ci -m"And awesome colors too."
$ ...hack hack...discover bug in code...
$ pg-ci -m"Checkpoint"
$ pg-new Fix-Just-Discovered-Bug
$ ...fix bug...
$ pg-ci -m"Fixed possibly critical bug."
$ pg-series
+ Spiffy-Risky-Feature
+> Fix-Just-Discovered-Bug
$ pg-pop -a
$ pg-push Fix-Just-Discovered-Bug
$ pg-push Spiffy-Risky-Feature
$ pg-series
+ Fix-Just-Discovered-Bug
+> Spiffy-Risky-Feature
$ ...hack hack...
------------------------------------------------

What happened here?

I started working on a really cool new feature, so I created a patch
Spiffy-Risky-Feature to contain this development effort. While
working on this patch I checked in changes every so often with
`pg-ci`.
And at some point I started looking at a chunk of code, scratched my
head and said "Hmmm... That's a bug! I'll fix that real quick..."

But I don't want the bug fix to be tied to Spiffy-Risky-Feature if I
can avoid it, because it is highly likely that management will ask me
to send this bug fix to testing (and eventually release) before
Spiffy-Risky-Feature is finished. But I also need this bug fixed to
continue development of Spiffy-Risky-Feature as it's preventing me
from getting work done there.

So I don't fix the bug just yet. I need to do a few things in pg
first:

- I use `pg-ci` to checkpoint my changes thus far, because `pg-pop`
  won't operate if there are any changes which aren't checked in.
- I push down a new patch for the bug fix with `pg-new`. This tells
  pg that any changes made from here on are part of the bug fix, not
  Spiffy-Risky-Feature.

Now I fix the bug, but I'm doing so with Spiffy-Risky-Feature's
changes still present in my working directory, so I can test the fix
against those changes. (Remember I found this bug while testing these
changes so it's good to work with them.)

When I'm done with the bug fix I record the changes with `pg-ci`. The
changes will be automatically associated with
Fix-Just-Discovered-Bug, not Spiffy-Risky-Feature.

So now that the bug is fixed I want to continue on with
Spiffy-Risky-Feature's development. If I start making changes again
pg will associate these newest changes with the bug fix, not
Spiffy-Risky-Feature, as the bug fix is the top most applied patch.
(This can be seen by looking at `pg-series`. The '+>' indicator is
pointing at Fix-Just-Discovered-Bug and not Spiffy-Risky-Feature.)

To get pg refocused I use `pg-pop -a` to pop all patches off the
stack. Now I'm back at my base; this should be the version of code
currently in testing. I push down the bug fix with
`pg-push Fix-Just-Discovered-Bug`. Then I push down the new feature
patch.
At this point I have the bug fix in place, and I can start making
changes again to the new feature.

How long did the above process take? Not very long. Here are some
basic execution times when running against the Linux kernel
repository (18,742 files) on my 1.5 GHz PowerBook (Mac OS 10.4):

    pg-ci    : 2.132s (8.79 files/ms)
    pg-new   : 5.971s
    pg-pop -a: 5.949s (3.15 files/ms)
    pg-push  : 3.011s (6.22 files/ms)

So jumping into that new bug fix patch (`pg-ci`; `pg-new`) would cost
me approximately 8 seconds. Refocusing back to the new feature patch
would probably cost me another 12 seconds (`pg-pop -a`;
`pg-push ...`; `pg-push ...`). But you can expect future improvements
here as I'd like to streamline this even more.

REBASING (TRACKING REMOTE CHANGES)
----------------------------------

If you want to follow an upstream development repository you can
update your local repository and reapply all currently applied
patches by rebasing:

------------------------------------------------
$ pg-rebase
------------------------------------------------

By default `pg-rebase` will attempt to rebase your existing
repository to `origin`. This default behavior is probably what you
want as `origin` is an alias for the repository that your current
repository was cloned from (by `pg-clone` or `git-clone`).

A rebase will:

- temporarily pop all of your currently applied patches,
- make your working directory look exactly like the repository you
  have chosen to rebase against,
- then try to push the patches it popped.

The end result of a successful rebase is a working directory which
contains all changes available in the selected remote repository plus
any changes contained within your applied patches.

If an applied patch was received from the remote repository during
`pg-rebase` then it will be deleted from the local directory rather
than being pushed back onto the stack. This is usually the right
thing to do as the upstream maintainer has accepted your change and
you have just received it back from them.
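
The rebase steps described above can be pictured with a toy model. This is only an illustration and makes some loud assumptions: patches are plain names in a space-separated list, "received from the remote repository" is modeled as a simple name match, and no GIT or pg command is actually run. The function name `rebase_model` and its two arguments are invented for this sketch; how pg really detects an accepted patch is not shown here.

```shell
# Toy model only: no GIT or pg commands are run.
# $1 = applied patches, bottom of the stack first (hypothetical input)
# $2 = patches the remote already contains (hypothetical input)

rebase_model() {
    applied=$1
    upstream=$2

    # Step 1: temporarily pop every applied patch.
    popped=$applied
    stack=""

    # Step 2: the working directory would now be made to match the
    # remote repository; there is nothing to model for that here.

    # Step 3: push the popped patches back in their original order,
    # dropping any patch that came back from the remote.
    for p in $popped; do
        case " $upstream " in
            *" $p "*) echo "dropped $p (received from upstream)" ;;
            *)        stack="$stack $p"; echo "pushed $p" ;;
        esac
    done
    echo "stack:$stack"
}

rebase_model "Fix-Just-Discovered-Bug Spiffy-Risky-Feature" \
             "Fix-Just-Discovered-Bug"
```

With the inputs shown, the bug fix is reported as dropped and only Spiffy-Risky-Feature ends up back on the stack, which mirrors the case where the upstream maintainer has accepted the fix.
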

A `pg-rebase` can always be undone with `pg-rebase --undo`. Undo is
useful if your patches won't apply cleanly and you don't have time to
fix them up by hand right now, or if the code you received from the
remote repository doesn't function as expected and you want to
continue working with your last good state. Because of this undo
feature `pg-rebase` is generally a pretty safe operation.

PERFORMANCE
-----------

pg is pretty fast. Or at least it tries to be. In most cases it's
just as fast as the raw GIT commands are, as pg is nothing but a few
lines of shell wrapping those raw GIT commands.

There is of course still some room for improvement, but I think that
pg is already at the point where user functionality would have to be
sacrificed to obtain better performance.

- `pg-new`
  `pg-new` is *almost* constant time. If no patches are currently
  applied or the -f switch is given then it runs in constant time;
  otherwise its time is proportional to your project size as it tries
  to verify that the output of `pg-status` is empty (there are no
  uncommitted changes).

- `pg-series`
  The time required to run `pg-series` is proportional to the number
  of patches (both applied and unapplied) within this repository. The
  main time killer here appears to be some shell code that is
  executed for each patch. Future versions of pg may improve this
  time significantly, with the goal of making it nearly a constant
  time operation.

- `pg-status`
  The time for `pg-status` is proportional to the project size as it
  must check each file to see if it is modified, extra, or gone. To
  improve performance `pg-status` only compares the file modification
  date with what is in the GIT index. If you perform an operation on
  a file which changes its modification date but not its content
  `pg-status` will still report it as modified ('M ') due to this
  optimization. `pg-status -q` takes slightly longer as it compares
  file contents, not file modification dates.
  If the modification date is different but the content is the same
  `pg-status -q` will update the GIT index with the new modification
  date so future uses of `pg-status` won't show the file.

- `pg-ci`
  The time for `pg-ci` is proportional both to the size of the
  project and the number of files being checked in. The size of the
  project matters here as `pg-ci` must scan each file to see if it
  has been modified, and if so update the GIT index file so it will
  become part of the new commit object. This project wide scan is the
  primary reason `pg-ci` is slower than say `git-write-tree`. :-)
  Note that using the -m switch is faster than launching the editor
  as `pg-ci` can eliminate one (of two) scans over the project
  directory by being slightly less user friendly.

- `pg-pop`
  The time for `pg-pop` is proportional to the number of files in the
  project. A pop is quite simply a rewind of `HEAD` followed by an
  update of the working directory to match; this means GIT must check
  each file in the project to see if it needs to be reverted (and if
  so revert it). `pg-pop` will nearly halve its running time if the
  -f switch is used as it doesn't bother to check for uncommitted
  changes before rewinding. Unfortunately this comes at a price: if
  you didn't commit something and didn't realize it when you popped
  the patch, you have lost those changes.

- `pg-push`
  The time for `pg-push` can vary. If the push is trivial then it's
  pretty much the same as `pg-pop`: it's simply a fast-forward of
  `HEAD` and an update of the working directory. If however the push
  is non-trivial then a patch must be generated and applied; this
  takes time proportional to the number of commits contained within
  the patch and the number of files modified by those commits.

BRANCHING
---------

pg maintains different patch stacks for each GIT branch (where a
branch is a file in the `refs/heads` directory). If another porcelain
changes the current branch by changing the `HEAD` link pg will
automatically follow to the new branch.

Branch management is currently outside of the scope of pg. See
`git-branch` as one way to manage branches.

PORCELAIN COMPATIBILITY
-----------------------

pg is written to be as compatible as possible with other GIT
porcelains. This was actually a major design goal.

For the most part pg doesn't care what other porcelains do to the
repository. However creating a new GIT commit object (and thus
advancing `HEAD`) in another porcelain will cause pg to behave as
follows:

- If no patches are applied:
  pg will behave as though the base is changing. When a patch is
  pushed down with `pg-push` it will be merged onto this new base.
  This may cause the patch to not push down cleanly if conflicts
  occur. This would be similar to what `pg-rebase` does.

- If one or more patches are applied:
  pg will behave as though the new commit is a new revision of the
  top most applied patch (the patch shown with leader '+>' by
  `pg-series`). Popping this top most patch with `pg-pop` will cause
  the commit to be removed from the working directory; repushing the
  patch will cause the commit to return.

[WARNING]
If `HEAD` is rewound by another porcelain (or human!) to a commit
which occurs before the first commit of the top most applied patch
`pg-pop` will record the wrong `last` ref for the patch, but will pop
cleanly. During a future `pg-push` the patch will not push correctly
as it will attempt to generate and apply a patch which is traveling
backwards in time. It is unlikely that `pg-pop` will ever check for
this error condition. So just don't do funny things with `HEAD` (like
suddenly rewind it) while you have any pg managed patches applied.

If another porcelain changes the current branch by changing the
`HEAD` link pg will automatically operate on the newly selected
branch, possibly creating a new `refs/pg-patches/` subdirectory if
necessary. Switching branches is really the only way that `HEAD`
should be rewound back in time, as each branch has its own patch
stack.

CREATING A NEW REPOSITORY (REALLY NOT SUPPORTED)
------------------------------------------------

You probably don't want to create a new repository. A new repository
should only be created if this is the first time you are putting
these files under GIT's control. If the files already exist in a GIT
repository *somewhere* then you really want to clone that repository
instead. See 'CLONING AN EXISTING REPOSITORY' above.

So you really want to create a new repository for a bunch of files?
Here is the process:

------------------------------------------------
$ mkdir my-new-project
$ cd my-new-project
$ git-init-db
$ git-update-ref HEAD \
    $(echo "Creating empty repository." \
    | git-commit-tree $(git-write-tree))
$ pg-new Initial-Creation
$ touch .gitignore
$ pg-add .gitignore
$ cp ~/thefiles/* .
$ pg-status
$ pg-add -a
$ pg-ci -m"Adding initial set of files."
$ pg-fold -ay
------------------------------------------------

Why do I create a .gitignore file? Because `pg-add -a` is going to
add *every* extra file (files with a status of x) to the repository.
This may not be what you want. If there are files which shouldn't be
part of the repository you should edit .gitignore to mask them out,
so they don't appear in the output of `pg-status`.

The `pg-fold -ay` after `pg-ci` will remove the single initial
creation patch from the stack. This is probably a good idea as you
won't want (or need) to pop the Initial-Creation patch.

[NOTE]
Advanced GIT users: no special setup or initialization step is
required for pg. pg will automatically perform any setup tasks when
it is first required to do so. Therefore you are free to construct a
repository in whatever method suits you, or to just start using pg on
an existing repository. pg requires that HEAD be a valid commit,
which is why you can't use it to create the first commit.

THE MAGIC OF git-am
-------------------

If you are already very reliant on tools such as `git-am` (to process
a mailbox of patches) you can still make use of pg. Remember that pg
will always assume that any changes committed after `pg-new` or
`pg-push` are to be part of the top most applied patch, even if the
commit doesn't occur through `pg-ci`.

This lets you do something such as the following:

------------------------------------------------
$ pg-new Memory-Stuff
$ git-am mbox1
$ git-am mbox2
$ ...crap...fixup merge...
$ git-am --resolved
$ git-am mbox3
$ pg-pop
$ pg-push Other-Stuff
$ pg-push Memory-Stuff
------------------------------------------------

Any commits created by any GIT tool between `pg-new` and `pg-pop`
will be kept as part of the patch Memory-Stuff. Later, when you push
down Memory-Stuff on top of a patch which it hadn't been created on
top of, a two-way merge will be generated and written as a new
commit.

In this way the GIT project history will preserve all of the patch
records which came in through the mboxes, as well as the larger
maintainer patches. Contrast this with StGIT which (at least in Jan.
2006) won't keep the commit history or let you use git-am.