<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Insufficiently Random</title>
	<atom:link href="http://www.spearce.org/feed" rel="self" type="application/rss+xml" />
	<link>http://www.spearce.org</link>
	<description>The lonely musings of a loosely connected software developer.</description>
	<lastBuildDate>Wed, 10 Feb 2010 16:47:48 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.1</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>The Eclipse.org JGit follies continue&#8230;</title>
		<link>http://www.spearce.org/2010/02/the-eclipse-org-jgit-follies-continue.html</link>
		<comments>http://www.spearce.org/2010/02/the-eclipse-org-jgit-follies-continue.html#comments</comments>
		<pubDate>Wed, 10 Feb 2010 16:47:48 +0000</pubDate>
		<dc:creator>spearce</dc:creator>
				<category><![CDATA[Random Musings]]></category>
		<category><![CDATA[egit]]></category>
		<category><![CDATA[jgit]]></category>

		<guid isPermaLink="false">http://www.spearce.org/?p=144</guid>
		<description><![CDATA[Another day.  Another complaint from me about running a project at Eclipse.org.  This time it wound up in the jgit-dev mailing list archives, as replies to a thread that I think started from my blog post on the tragedy of Eclipse.
Instead of reposting the whole thing, I&#8217;ll just point to my two messages [...]]]></description>
			<content:encoded><![CDATA[<p>Another day.  Another complaint from me about running a project at Eclipse.org.  This time it wound up in the <a href="http://dev.eclipse.org/mhonarc/lists/jgit-dev">jgit-dev mailing list archives</a>, as replies to <a href="http://dev.eclipse.org/mhonarc/lists/jgit-dev/msg00103.html">a thread</a> that I think started from my blog post on <a href="http://www.spearce.org/2010/02/the-tragedy-of-eclipse-org.html">the tragedy of Eclipse</a>.</p>
<p>Instead of reposting the whole thing, I&#8217;ll just point to my two messages in context:</p>
<p><a href="http://dev.eclipse.org/mhonarc/lists/jgit-dev/msg00109.html">why do I need to spend my time on this crap?</a><br />
<a href="http://dev.eclipse.org/mhonarc/lists/jgit-dev/msg00111.html">why is the new file header what it is?</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.spearce.org/2010/02/the-eclipse-org-jgit-follies-continue.html/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>The tragedy of Eclipse.org</title>
		<link>http://www.spearce.org/2010/02/the-tragedy-of-eclipse-org.html</link>
		<comments>http://www.spearce.org/2010/02/the-tragedy-of-eclipse-org.html#comments</comments>
		<pubDate>Tue, 09 Feb 2010 02:28:17 +0000</pubDate>
		<dc:creator>spearce</dc:creator>
				<category><![CDATA[Random Musings]]></category>
		<category><![CDATA[egit]]></category>
		<category><![CDATA[jgit]]></category>

		<guid isPermaLink="false">http://www.spearce.org/?p=124</guid>
		<description><![CDATA[I&#8217;ve probably posted something about this before.  But I&#8217;m really getting fed up with the Eclipse Development Process.  It&#8217;s a frelling nightmare for a committer to work with.  I&#8217;m really starting to regret moving JGit there.
Right now, if I have X hours to work on a project, I seem to be averaging [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve probably posted something about this before.  But I&#8217;m really getting fed up with the <a href="http://www.eclipse.org/projects/dev_process/development_process.php">Eclipse Development Process</a>.  It&#8217;s a frelling nightmare for a committer to work with.  I&#8217;m really starting to regret moving JGit there.</p>
<p>Right now, if I have X hours to work on a project, I seem to be averaging what feels like X/2 hours in paperwork and other &#8220;important steps&#8221; of the development process.  None of which have helped my project to ship higher quality, or more feature complete code.  Which means either my or my employer&#8217;s time is being wasted.  I don&#8217;t have time to waste when I have 108 bugs open in <a href="http://code.google.com/p/gerrit/">Gerrit Code Review</a>, and 64 bugs open in <a href="http://www.eclipse.org/EGit">EGit</a> and <a href="http://www.eclipse.org/jgit/">JGit</a>.</p>
<p>Based on a private email chain I&#8217;m having with the Eclipse IP review team, it looks like the initial EGit code contribution was bungled not just by myself, but also by the foundation&#8217;s IP review process.  Which means I probably have to run EGit back through IP review, almost from scratch.  But only after I write a script to datamine contributors out of the <a href="http://repo.or.cz/w/egit.git/shortlog/refs/heads/historical/pre-eclipse">old EGit history</a> and inject a complete, per-file <code>git shortlog</code> into each file header.  It&#8217;s a good thing I have an <a href="http://git-scm.com/">awesome version control system like Git</a> to keep these records for me.  Too bad nobody else on the planet can use it to obtain information they might want to know about our source code.  I guess running software to read information about a file is too scary for some individuals.  So I have to do it for them.  Now, and for every change we make in the future.  Yay.  <img src='http://www.spearce.org/wordpress/wp-includes/images/smilies/icon_sad.gif' alt=':-(' class='wp-smiley' /> </p>
<p>The astute reader may notice that, in the paragraph above, &#8220;private email chain&#8221; doesn&#8217;t jibe with other publications from the Eclipse Foundation demanding that projects be run in an open and transparent manner (see &#8220;How do I start a project&#8221; on <a href="http://www.eclipse.org/home/newcomers.php">Eclipse Newcomers</a>).  I really do feel like JGit is a less open project now that it has moved to Eclipse.org.  Conversations with the Eclipse IP team about the legal status of any contribution always happen by private email.  These things never make it to the project mailing list.  The IPzilla database is closed to everyone but committers.  There are backroom deals going on about what our file headers should look like in order to sufficiently convey that the source code is under the <a href="http://www.opensource.org/licenses/bsd-license.php">new-style BSD</a> license.  The discussion that led to the approval of the <a href="http://www.eclipse.org/egit/iplog/v0.7.0.pdf">EGit IP log for 0.7.0</a>, approved despite what appears to be an error in the initial review, also happened by private email.</p>
<p>It took a significant amount of effort on my part to even get JGit hosted at Eclipse.org.  Originally, the new-style BSD license wasn&#8217;t permissible for a hosted project, and I had to seek a special exemption from the Eclipse Board of Directors.  A process that required significant backroom conversations, over at least 6 months.  Again, not exactly open.  The only reason I think I haven&#8217;t pulled the project back is because of the huge initial investment I&#8217;ve already made in this.</p>
<p>Maybe JGit and EGit are just unique projects.  But in my experience, I am not a unique snowflake, and neither is my work.  I&#8217;m not as special as I might seem at first glance.</p>
<p>I wouldn&#8217;t be surprised if I&#8217;ve lost at least 2 days every month to paperwork.  That&#8217;s about 30 days, or 1.5 person-months since the project really started this move in January 2009.  1/12 of my time over the past year has just gone to catering to the Eclipse development process.  Food for thought.  Join Eclipse&#8230; make sure you pick up at least 1/12 of another full-time developer just to deal with the red tape.</p>
<p>The part that really troubles me with the red tape isn&#8217;t so much that it is there, but that committers bear the brunt of the effort, while large corporations that are <a href="http://www.eclipse.org/membership/showMembersWithTag.php?TagID=strategic">strategic members</a> reap the benefits of having a concise change history listed inside of each source code file, or knowing that every contributor who ever touched this source code has been <a href="https://bugs.eclipse.org/bugs/show_bug.cgi?id=300397#c4">grilled in detail on a bug tracker</a>.</p>
<p>So back to my post title.  The real tragedy is, these corporations who sell commercial products built on top of Eclipse.org distributions are pushing not just the open source development work, but also a whole ton of onerous legal and reporting constraints back onto their project committers.  It&#8217;s enough to make this committer start to reconsider things.  I wish I had been using a time clock this past year, to accurately record how many days the Eclipse development process has robbed me of since the start of all of this.  It feels significant enough that if I went to my manager with it, I think he&#8217;d go ballistic.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.spearce.org/2010/02/the-tragedy-of-eclipse-org.html/feed</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Why commit messages matter</title>
		<link>http://www.spearce.org/2010/02/why-commit-messages-matter.html</link>
		<comments>http://www.spearce.org/2010/02/why-commit-messages-matter.html#comments</comments>
		<pubDate>Thu, 04 Feb 2010 19:04:12 +0000</pubDate>
		<dc:creator>spearce</dc:creator>
				<category><![CDATA[Random Musings]]></category>

		<guid isPermaLink="false">http://www.spearce.org/?p=112</guid>
		<description><![CDATA[Some folks wonder why I want longer, detailed commit messages in a project.  Often other people claim &#8220;Fix the frobinator bug when it frobs too slow&#8221; might be sufficiently detailed to cover a change.  But it&#8217;s usually not.

As you explored the issue and tried to understand the problem, you filled your head up [...]]]></description>
			<content:encoded><![CDATA[<p>Some folks wonder why I want longer, detailed commit messages in a project.  Often other people claim &#8220;Fix the frobinator bug when it frobs too slow&#8221; might be sufficiently detailed to cover a change.  But it&#8217;s usually not.<br />
<span id="more-112"></span><br />
As you explored the issue and tried to understand the problem, you filled your head up with important details about how the frobinator works, what a frob even is, what a slow frob looks like, and why a slow frob shouldn&#8217;t be permitted in this context.  All of this information is necessary for you to understand the problem and code a patch that resolves it.  Moreover, if this detail wasn&#8217;t necessary for you to code the patch, you wouldn&#8217;t have had the slow frobbing in the first place.  It would have been fairly obvious at the time of original development.</p>
<p>Commit messages, when combined with a powerful blame engine in your version control, can give you really powerful insight into what you were thinking at the time.  This can be incredibly handy when someone asks a question later.</p>
<p>Yesterday, <a href="http://gitster.livejournal.com/">Junio Hamano</a>, git maintainer extraordinaire, asked me why <a href="http://git.spearce.org/?p=git-gui.git;a=summary">git-gui</a> implements its own clone function.  When I wrote this code, it must have been really obvious to me why it needed to reimplement the same logic as <a href="http://www.kernel.org/pub/software/scm/git/docs/git-clone.html"><code>git clone</code></a>.  But I wrote it back in 2007.  I&#8217;ve done a ton of things since then.  There&#8217;s no way I can remember what I was doing, or why I was doing it.  I do however remember thinking, &#8220;this code is done, it works, I&#8217;ll never have to look at or think about it again&#8221;.  Famous last words.</p>
<p>When Junio asked this question&#8230; I honestly couldn&#8217;t remember what I was doing.  I&#8217;m usually somewhat against reinventing the wheel, and I try to avoid rewriting something unless I seem to have a good reason for it.  So I really was looking at his question saying, &#8220;yea, why did I do that there&#8230;&#8221;.</p>
<p>Fortunately, I write fairly detailed commit messages, and <a href="http://www.kernel.org/pub/software/scm/git/docs/git-blame.html"><code>git blame</code></a> is an incredible tool:</p>
<blockquote><pre>
  $ git blame lib/choose_repository.tcl
  ...
  81d4d3dd (Shawn O. Pearce     2007-09-24 08:40:44 -0400  633)
  81d4d3dd (Shawn O. Pearce     2007-09-24 08:40:44 -0400  634)           $o_cons start \
  81d4d3dd (Shawn O. Pearce     2007-09-24 08:40:44 -0400  635)                   [mc "Counting objects"] \
  81d4d3dd (Shawn O. Pearce     2007-09-24 08:40:44 -0400  636)                   [mc "buckets"]
  ...
  81d4d3dd (Shawn O. Pearce     2007-09-24 08:40:44 -0400  673)           update
  81d4d3dd (Shawn O. Pearce     2007-09-24 08:40:44 -0400  674)
  ab08b363 (Shawn O. Pearce     2007-09-22 03:47:43 -0400  675)           file mkdir [file join .git objects pack]
  ab08b363 (Shawn O. Pearce     2007-09-22 03:47:43 -0400  676)           foreach i [glob -tails -nocomplain \
  ab08b363 (Shawn O. Pearce     2007-09-22 03:47:43 -0400  677)                   -directory [file join $objdir pack] *] {
  ab08b363 (Shawn O. Pearce     2007-09-22 03:47:43 -0400  678)                   lappend tolink [file join pack $i]
  ab08b363 (Shawn O. Pearce     2007-09-22 03:47:43 -0400  679)           }
  81d4d3dd (Shawn O. Pearce     2007-09-24 08:40:44 -0400  680)           $o_cons update [incr bcur] $bcnt
  81d4d3dd (Shawn O. Pearce     2007-09-24 08:40:44 -0400  681)           update
  81d4d3dd (Shawn O. Pearce     2007-09-24 08:40:44 -0400  682)
  81d4d3dd (Shawn O. Pearce     2007-09-24 08:40:44 -0400  683)           foreach i $buckets {
  ab08b363 (Shawn O. Pearce     2007-09-22 03:47:43 -0400  684)                   file mkdir [file join .git objects $i]
  ab08b363 (Shawn O. Pearce     2007-09-22 03:47:43 -0400  685)                   foreach j [glob -tails -nocomplain \
  ab08b363 (Shawn O. Pearce     2007-09-22 03:47:43 -0400  686)                           -directory [file join $objdir $i] *] {
  ab08b363 (Shawn O. Pearce     2007-09-22 03:47:43 -0400  687)                           lappend tolink [file join $i $j]
  ab08b363 (Shawn O. Pearce     2007-09-22 03:47:43 -0400  688)                   }
  81d4d3dd (Shawn O. Pearce     2007-09-24 08:40:44 -0400  689)                   $o_cons update [incr bcur] $bcnt
  81d4d3dd (Shawn O. Pearce     2007-09-24 08:40:44 -0400  690)                   update
  ab08b363 (Shawn O. Pearce     2007-09-22 03:47:43 -0400  691)           }
</pre>
</blockquote>
<p>It would seem that <a href="http://git.spearce.org/?p=git-gui.git;a=commit;h=81d4d3dddc5e96aea45a2623c9b1840491348b92"><code>81d4d3dd</code></a> and <a href="http://git.spearce.org/?p=git-gui.git;a=commit;h=ab08b3630414dfb867825c4a5828438e1c69199d"><code>ab08b363</code></a> are the commits that added the clone code.</p>
<blockquote><pre>
  $ git show 81d4d3dd
  commit 81d4d3dddc5e96aea45a2623c9b1840491348b92
  Author: Shawn O. Pearce &lt;spearce &lt;at&gt; spearce.org&gt;
  Date:   Mon Sep 24 08:40:44 2007 -0400

    git-gui: Keep the UI responsive while counting objects in clone

    If we are doing a "standard" clone by way of hardlinking the
    objects (or copying them if hardlinks are not available) the
    UI can freeze up for a good few seconds while Tcl scans all
    of the object directories.  This is espeically noticed on a
    Windows system when you are working off network shares and
    need to wait for both the NT overheads and the network.

    We now show a progress bar as we count the objects and build
    our list of things to copy.  This keeps the user amused and
    also makes sure we run the Tk event loop often enough that
    the window can still be dragged around the desktop.

    Signed-off-by: Shawn O. Pearce &lt;spearce &lt;at&gt; spearce.org&gt;

  $ git show ab08b363
  commit ab08b3630414dfb867825c4a5828438e1c69199d
  Author: Shawn O. Pearce &lt;spearce &lt;at&gt; spearce.org&gt;
  Date:   Sat Sep 22 03:47:43 2007 -0400

    git-gui: Allow users to choose/create/clone a repository
  &hellip;
    Rather than relying on the git-clone Porcelain that ships with
    git we build the new repository ourselves and then obtain content
    by git-fetch.  This technique simplifies the entire clone process
    to roughly: `git init &#038;&#038; git fetch &#038;&#038; git pull`.  Today we use
    three passes with git-fetch; the first pass gets us the bulk of
    the objects and the branches, the second pass gets us the tags,
    and the final pass gets us the current value of HEAD to initialize
    the default branch.

    If the source repository is on the local disk we try to use a
    hardlink to connect the objects into the new clone as this can
    be many times faster than copying the objects or packing them and
    passing the data through a pipe to index-pack.  Unlike git-clone
    we stick to pure Tcl [file link -hard] operation thus avoiding the
    need to fork a cpio process to setup the hardlinks.  If hardlinks
    do not appear to be supported (e.g. filesystem doesn't allow them or
    we are crossing filesystem boundaries) we use file copying instead.

    Signed-off-by: Shawn O. Pearce &lt;spearce &lt;at&gt; spearce.org&gt;
</pre>
</blockquote>
<p>So 30 seconds after being asked, I&#8217;ve managed to remember this was mostly about git-gui on Windows, where Cygwin can be pretty slow for file operations, and hardlinks are available on NTFS if your application knows how to make them.  By doing the clone logic within Tcl, which is a native Win32 application, we can bypass Cygwin overheads, including the need to fork and execute a bunch of commands from the <code>git-clone.sh</code> shell script.  Because, back in 2007, git-clone was still just a shell script.</p>
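<p>A minimal sketch of the hardlink-or-copy idea from the second commit message, written in Python rather than git-gui&#8217;s Tcl.  The names <code>link_or_copy</code> and <code>clone_objects</code> are hypothetical; this illustrates the technique under those assumptions and is not the code git-gui actually runs:</p>
<blockquote><pre>
```python
import os
import shutil


def link_or_copy(src, dst):
    """Hardlink src to dst; fall back to a plain copy if linking fails."""
    try:
        os.link(src, dst)
        return "linked"
    except OSError:
        # The filesystem may not support hardlinks, or src and dst may
        # live on different filesystems.  Copy the bytes instead.
        shutil.copy2(src, dst)
        return "copied"


def clone_objects(src_objdir, dst_objdir):
    """Mirror every file under src_objdir into dst_objdir.

    Returns a dict mapping each relative path to "linked" or "copied".
    """
    results = {}
    for root, _dirs, files in os.walk(src_objdir):
        rel = os.path.relpath(root, src_objdir)
        os.makedirs(os.path.join(dst_objdir, rel), exist_ok=True)
        for name in files:
            key = os.path.normpath(os.path.join(rel, name))
            results[key] = link_or_copy(
                os.path.join(root, name),
                os.path.join(dst_objdir, key),
            )
    return results
```
</pre>
</blockquote>
<p>Everything happens inside one process, so nothing has to fork a helper such as cpio just to set up the links.</p>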
<p>In hindsight, that paragraph above should also be in the commit messages.  And I probably should have ported git clone to C instead.  It&#8217;s C now, but not because of my efforts.  And now git-gui maybe should just call it.  It would have made git-gui a whole lot smaller.</p>
<p>You can follow the <a href="http://thread.gmane.org/gmane.comp.version-control.git/138612/focus=138874">rest of the thread</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.spearce.org/2010/02/why-commit-messages-matter.html/feed</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>How class names can go horribly wrong</title>
		<link>http://www.spearce.org/2010/01/how-class-names-can-go-horribly-wrong.html</link>
		<comments>http://www.spearce.org/2010/01/how-class-names-can-go-horribly-wrong.html#comments</comments>
		<pubDate>Tue, 05 Jan 2010 01:28:33 +0000</pubDate>
		<dc:creator>spearce</dc:creator>
				<category><![CDATA[Random Musings]]></category>

		<guid isPermaLink="false">http://www.spearce.org/?p=108</guid>
		<description><![CDATA[Somehow I found myself writing this in a JGit test case:


assertTrue("isa TransportHttp", t instanceof TransportHttp);
assertTrue("isa HttpTransport", t instanceof HttpTransport);


What is wrong with me&#8230;
]]></description>
			<content:encoded><![CDATA[<p>Somehow I found myself writing this in a JGit test case:</p>
<blockquote>
<pre>
assertTrue("isa TransportHttp", t instanceof TransportHttp);
assertTrue("isa HttpTransport", t instanceof HttpTransport);
</pre>
</blockquote>
<p>What is wrong with me&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.spearce.org/2010/01/how-class-names-can-go-horribly-wrong.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Eclipse is the new RPM hell</title>
		<link>http://www.spearce.org/2009/12/eclipse-is-the-new-rpm-hell.html</link>
		<comments>http://www.spearce.org/2009/12/eclipse-is-the-new-rpm-hell.html#comments</comments>
		<pubDate>Wed, 16 Dec 2009 00:57:50 +0000</pubDate>
		<dc:creator>spearce</dc:creator>
				<category><![CDATA[Random Musings]]></category>

		<guid isPermaLink="false">http://www.spearce.org/?p=105</guid>
		<description><![CDATA[Remember back when RedHat was the best Linux distribution?  If not, let me remind you of RPM hell.  A situation where you can&#8217;t install Foo 1.X, because it needs Bar 2.3.Y, but you also have Zidget 8.Z installed and that needs Bar 2.2.Y.  Long story short, you can have either Foo or [...]]]></description>
			<content:encoded><![CDATA[<p>Remember back when RedHat was the best Linux distribution?  If not, let me remind you of <a href="http://www.germane-software.com/~ser/Files/Essays/RPM_Hell.html">RPM hell</a>.  A situation where you can&#8217;t install Foo 1.X, because it needs Bar 2.3.Y, but you also have Zidget 8.Z installed and that needs Bar 2.2.Y.  Long story short, you can have either Foo or Zidget on your system, but not both.</p>
<p>Today I just tried to install the <a href="http://www.eclipse.org/tptp/">Eclipse Test &amp; Performance Tools</a>, so I could try to find out why <a href="http://mina.apache.org/sshd/">Apache MINA SSHD</a> has such poor throughput during uploads into the server.  Unfortunately I can&#8217;t run version 4.5 because it <a href="https://bugs.eclipse.org/bugs/show_bug.cgi?id=240677#c2">depends on a decade-old version of libstdc++</a>.  But I can&#8217;t install version 4.6, which supposedly has a newer linkage, because 4.6 requires SWT 3.4.0.  But I&#8217;m running Eclipse 3.4.2, which apparently does not have a new enough SWT.</p>
<p>Folks.  Seriously?</p>
<p>The only major consumer of SWT that really matters is the Eclipse SDK.  And the SDK platform version numbers don&#8217;t match SWT version numbers.  And the test and performance tools require a decade-old shared library which isn&#8217;t even distributed with <a href="http://releases.ubuntu.com/hardy/">Ubuntu Hardy</a>.</p>
<p>I guess since it&#8217;s Java, it&#8217;s OK to repeat decade-old mistakes, because it&#8217;s in a different programming language.</p>
<p> <img src='http://www.spearce.org/wordpress/wp-includes/images/smilies/icon_sad.gif' alt=':-(' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://www.spearce.org/2009/12/eclipse-is-the-new-rpm-hell.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>My ancient history and the art of software</title>
		<link>http://www.spearce.org/2009/12/my-ancient-history-and-the-art-of-software.html</link>
		<comments>http://www.spearce.org/2009/12/my-ancient-history-and-the-art-of-software.html#comments</comments>
		<pubDate>Wed, 09 Dec 2009 08:59:09 +0000</pubDate>
		<dc:creator>spearce</dc:creator>
				<category><![CDATA[Random Musings]]></category>

		<guid isPermaLink="false">http://www.spearce.org/?p=96</guid>
		<description><![CDATA[This week I&#8217;m traveling.  I&#8217;m in Miami for the 2 day Eclipse board of directors meeting.  When I&#8217;m forced to fly, which I usually try to avoid doing, I tend to take a stack of books with me to try and pass the time on the airplane.  Unfortunately, airline flight speeds haven&#8217;t quite caught up [...]]]></description>
			<content:encoded><![CDATA[<p>This week I&#8217;m traveling.  I&#8217;m in Miami for the 2 day Eclipse board of directors meeting.  When I&#8217;m forced to fly, which I usually try to avoid doing, I tend to take a stack of books with me to try and pass the time on the airplane.  Unfortunately, airline flight speeds haven&#8217;t quite caught up with Moore&#8217;s law as it relates to the power draw of modern laptops.  Nor has the airline seat space kept up with the size of my 15&#8243; PowerBook and my increasing girth.  So books it is.</p>
<p>So I&#8217;m currently reading Peter Seibel&#8217;s <a href="http://www.codersatwork.com/">Coders at Work</a>.  Tonight, while reading through Brendan Eich&#8217;s interview and his history at Netscape, it reminded me of my own development history around the same time period.  I really don&#8217;t talk about my past very much, so before I get to be too old to remember it, I might as well write some of it down.  :-)</p>
<p><span id="more-96"></span>During the mid-to-late 90&#8217;s I was still in high school, but was working part-time in the afternoons after school as a software developer (err, script monkey) for a now defunct website and ISP called <a href="http://web.archive.org/web/19961106235534/http://www.injersey.com/">INJersey</a>.  INJersey was the aspiring, and perhaps too early for its time, online arm of the local daily print paper, the <a href="http://www.app.com/">Asbury Park Press</a>.  Somehow its owners realized this Internet thing was worth investing in, before a lot of users figured it out and actually got online.</p>
<p>Back then, we didn&#8217;t have <a href="http://httpd.apache.org/">Apache</a> and all of its module glory.  We had a pile of patch files you had to apply to NCSA httpd to get <strong><a href="http://httpd.apache.org/docs/1.3/misc/FAQ.html#name">A PA</a></strong><a href="http://httpd.apache.org/docs/1.3/misc/FAQ.html#name">t</a><strong><a href="http://httpd.apache.org/docs/1.3/misc/FAQ.html#name">CH</a></strong><a href="http://httpd.apache.org/docs/1.3/misc/FAQ.html#name">y server</a>.  We didn&#8217;t have FastCGI, we had plain old CGI scripts, often written in Perl 4, where <code>&amp;article'render()</code> was an actual function call and not some syntax error brought on by a lack of caffeine.  We didn&#8217;t even have JavaScript or animated images.  <a href="http://www.google.com/search?q=netscape+blink+tag">&lt;blink&gt;</a> was as good as it got.</p>
<p>My job back then was really simple.  Take Microsoft Office documents from writers who couldn&#8217;t be bothered to learn HTML, and put them online in HTML.  I wrote a lot of Perl and AppleScript to rip the stuff apart and put it back together again as plain HTML files that we could serve to web users.  This was out of sheer laziness, the <a href="http://c2.com/cgi/wiki?LazinessImpatienceHubris">first great virtue of a programmer</a>.  INJersey hired me not as a software developer, but as a data entry monkey to copy and paste the text from Office into an HTML document, and stick in the &lt;td&gt; or &lt;b&gt; tag when necessary.  I quickly grew bored with that task and found it much more fun to write scripts to do my job for me, while I explored the wonders of a <a href="http://en.wikipedia.org/wiki/T-carrier">T1</a> internet connection.</p>
<p>My managers quickly realized I was able to do more than just copy and paste text with a computer, so they started giving me simple programming assignments.  When Netscape 2.0 launched, one of our real programmers figured out we could do <a href="http://oreilly.com/openbook/cgi/ch11_01.html">server-push based animated images</a>.  This is one of those monstrously stupid ideas that I&#8217;m glad has died on the web.  I&#8217;m quite happy that I actually can&#8217;t come up with a great reference link for it anymore.  Anyway, INJersey&#8217;s website was just awesome one day, because we had animated images, and others didn&#8217;t.</p>
<p>About this time one of my managers saw this website called JangaChat.  It was a free online web based chat room system.  No IRC client needed.  No Java applets.  No plugins.  Just Netscape 2.0.  Even better, they allowed HTML to be entered without escaping, so you could write fancy messages like &#8220;Bob, that was the best fish &lt;b&gt;ever&lt;/b&gt;!&#8221; and actually have it come out in bold.  Their site was more awesome than our animated images.  We somehow had to out-awesome them again.</p>
<p>Remember, this is like 1995.  We just got <a href="http://www.antipope.org/charlie/journo/netscape.html">Netscape 2.0</a>, and <a href="http://inventors.about.com/od/jstartinventions/a/JavaScript.htm">Brendan Eich</a> had just unleashed this JavaScript thing on the world.  We didn&#8217;t know what we could do with it&#8230; or the damage it could cause.  Cross site scripting hadn&#8217;t even been invented yet.</p>
<p>My managers gave me a simple task: create an INJersey version of JangaChat that we could run on our own servers, so our users could chat online in real time about whatever they felt like, without needing to first install an IRC client.</p>
<p>At the time, I only really knew Perl, and was only self-taught at that.  So I wrote the first version as a Perl CGI.  I remember abusing the server-based multipart push feature we used for image animation to allow the Perl CGI to stream new messages to the browser as they arrived in the chat room.  JangaChat, and this system I worked on at INJersey, may have been some of the first uses of <a href="http://alex.dojotoolkit.org/2006/03/comet-low-latency-data-for-the-browser/">hanging GET</a>s.</p>
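<p>A rough sketch of what such a server-push stream looks like on the wire, assuming the <code>multipart/x-mixed-replace</code> content type that Netscape used for server push.  The boundary string and the function names are made up for illustration, and this is Python, not a reconstruction of the original Perl CGI:</p>
<blockquote><pre>
```python
BOUNDARY = "chatterbox-part"  # hypothetical boundary string


def cgi_header(boundary=BOUNDARY):
    """The one-time CGI header that opens the never-ending response."""
    line = "Content-Type: multipart/x-mixed-replace;boundary={}\r\n\r\n"
    return line.format(boundary).encode("ascii")


def encode_part(text, boundary=BOUNDARY):
    """Wrap one chat update as a single part of the multipart stream."""
    head = "--{}\r\nContent-Type: text/plain\r\n\r\n".format(boundary)
    return head.encode("ascii") + text.encode("utf-8") + b"\r\n"


def stream(messages, boundary=BOUNDARY):
    """Yield the byte chunks a hanging GET would write, one per update."""
    yield cgi_header(boundary)
    for text in messages:
        yield encode_part(text, boundary)
```
</pre>
</blockquote>
<p>The connection stays open between parts, which is exactly why every connected user ties up one CGI process for as long as they are in the room.</p>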
<p>Unfortunately, not only did I not know C, I also didn&#8217;t know how to do proper interprocess communication on UNIX.  So I implemented it with what I did know:  each chat room was assigned a local file.  New messages were appended onto the end of the file using a POST CGI, and messages were read out to active users by their hanging GET CGIs, which were continuously trying to read the tail of the room&#8217;s file.  Over a decade later, I can&#8217;t believe I was once foolish enough to believe this was a good idea.</p>
<p>We launched under the name <a href="http://web.archive.org/web/19980111133651/http://chat.injersey.com/">ChatterBox</a>.  We quickly stole most of the users from JangaChat, plus picked up our own users, and our little Pentium 133 server could barely keep up with all of those Perl based hanging GET CGI processes demanding resources.  Messages were also often truncated or otherwise badly mangled, as I had no file locking, and no way to ensure a full message was written before being read and sent to a browser.</p>
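<p>One way the mangled-message problem could have been avoided, sketched in Python under the assumption that each message is stored as a length-prefixed record.  The 4-byte header and both function names are hypothetical, not what ChatterBox did.  A reader that finds a partial record at the tail simply stops and tries again later:</p>
<blockquote><pre>
```python
import struct

HEADER = struct.Struct(">I")  # 4-byte big-endian payload length


def encode_message(text):
    """Frame one chat message as a length header followed by its bytes."""
    payload = text.encode("utf-8")
    return HEADER.pack(len(payload)) + payload


def read_complete_messages(log_bytes, start=0):
    """Return (messages, next_offset), stopping before any partial record."""
    messages = []
    pos = start
    while len(log_bytes) - pos >= HEADER.size:
        (length,) = HEADER.unpack_from(log_bytes, pos)
        end = pos + HEADER.size + length
        if end > len(log_bytes):
            break  # the writer has not finished this record yet
        messages.append(log_bytes[pos + HEADER.size:end].decode("utf-8"))
        pos = end
    return messages, pos
```
</pre>
</blockquote>
<p>This only protects readers from half-written records.  Concurrent writers would still need a lock, or some other form of serialization, to keep their appends from interleaving.</p>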
<p>Management really wanted this product to work, because the demo was flashy, and their local advertising customers were wowed by it.  I rewrote the Perl code into C, but kept the basic design of individual hanging GET CGIs per user, and a single log file per chat room.  Even in C, we still couldn&#8217;t keep up with traffic.  Somehow I convinced management that buying some &#8220;real server hardware&#8221;, rather than our tiny single processor Pentium 133, would solve the scaling problems, and they went out and purchased a pair of <a href="http://en.wikipedia.org/wiki/SGI_Origin_200">SGI Origin 200</a> servers running Irix.  Today of course, all 3 of you reading this are screaming &#8220;but it&#8217;s the software that wasn&#8217;t scalable!&#8221;.  That&#8217;s what a decade of learning gets you.  :-)</p>
<p>So upon getting this shiny new server hardware, I start thinking about how I might have done things differently.  I knew I couldn&#8217;t rely on the (then pretty crappy and unscalable) dbm library for user data management, like I had with Perl.  So I started poking around at things like <a href="http://www.hughes.com.au/products/msql/">mSQL</a>, <a href="http://www.postgresql.org/">PostgreSQL</a>, and <a href="http://www.mysql.com/">MySQL</a>.  Back in 1996, mSQL was the best there was, but it was single threaded.  PostgreSQL was pretty slow and barely ran; it was more of a research project than it was a production database engine.  MySQL crashed more than it stayed up, severely lacked features compared to mSQL, and required nasty pthread libraries which weren&#8217;t exactly standard or well supported on UNIX systems back then.</p>
<p>So I toyed with writing my own.  As it turned out, Irix had a decent pthread implementation at the time.  I actually ended up with a thread safe, balanced B*-tree that stored user record data in fixed size leaf nodes, and used an arbitrary byte sequence as the record key.  To make the implementation easier on myself, I made every block of the file the same size (I think I used 2048 bytes, but I can&#8217;t remember), and the parent that pointed to the block told you the type using the lower bits of the file offset.  E.g. <code>(offset &amp; 1) == 0</code> meant the block was another btree intermediate node, while <code>(offset &amp; 1) == 1</code> meant it was an application data record.</p>
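<p>The low-bit trick works because every block starts on a multiple of the block size, so the bottom bits of a real offset are always zero and can carry a type flag.  A small Python sketch, assuming the 2048-byte size recalled above; the constant and function names are hypothetical:</p>
<blockquote><pre>
```python
BLOCK_SIZE = 2048  # size recalled in the post; any even block size works
TAG_NODE = 0       # low bit clear: intermediate B-tree node
TAG_RECORD = 1     # low bit set: application data record


def tag_offset(offset, is_record):
    """Fold the block type into the otherwise unused low bit of an offset."""
    if offset % BLOCK_SIZE != 0:
        raise ValueError("block offsets must be aligned to BLOCK_SIZE")
    return offset | (TAG_RECORD if is_record else TAG_NODE)


def untag_offset(tagged):
    """Split a stored pointer back into (file offset, block kind)."""
    kind = "record" if (tagged & 1) == 1 else "node"
    return tagged & ~1, kind
```
</pre>
</blockquote>
<p>The parent node stores only the tagged value, so the reader knows how to interpret a block before it has read a single byte of it.</p>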
<p>Within the user data record, I didn&#8217;t want to write all of the glue required for a formal DDL like a proper SQL server would support.  After all, I just had to store a couple of values for each user account, like their email address, date of last login, and a handful of preference settings.  So I basically did what <a href="http://code.google.com/apis/protocolbuffers/docs/overview.html">Google protocol buffers</a> does, and stored a flexible bag of key/value pairs in binary form.  Schema changes were accomplished incrementally, as users were updated during normal application processing, rather than all up front when the software changed.</p>
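<p>Here is a rough sketch of that kind of key/value bag, again in Python instead of C.  The byte layout (two 16-bit lengths, then key, then value) is an assumption made up for this example; it is neither the format I actually used nor the protocol buffers wire format.  The <code>upgrade</code> function and its <code>theme</code> field are hypothetical too, standing in for a preference added after older records were written.</p>

```python
import struct


def encode_record(fields):
    """Serialize {str: bytes} as repeated (key_len, value_len, key, value)."""
    out = bytearray()
    for key, value in fields.items():
        k = key.encode("utf-8")
        out += struct.pack("!HH", len(k), len(value))  # two unsigned 16-bit lengths
        out += k + value
    return bytes(out)


def decode_record(blob):
    """Parse the bytes produced by encode_record back into a dict."""
    fields, pos = {}, 0
    while pos < len(blob):
        klen, vlen = struct.unpack_from("!HH", blob, pos)
        pos += 4
        key = blob[pos:pos + klen].decode("utf-8")
        pos += klen
        fields[key] = blob[pos:pos + vlen]
        pos += vlen
    return fields


def upgrade(fields):
    """Lazy schema change: fill in a field that older records never stored."""
    fields.setdefault("theme", b"default")  # hypothetical preference added later
    return fields
```

<p>Because unknown or missing keys are not an error, an old record is decoded, passed through <code>upgrade</code>, and written back in the new shape the next time that user is touched.  No bulk migration is needed.</p>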
<p>So, I solved my data storage problem by just rolling my own system.  Looking back on it, it&#8217;s not too different from the approach Google takes with <a href="http://labs.google.com/papers/bigtable.html">BigTable</a>.  Only they scale across machines by partitioning the key space, while I just tossed the entire key space onto a single node.</p>
<p>For the interactive web components, I finally got a clue and realized that the hanging GET CGI processes were my real scaling bottleneck.  They consumed too much of the system&#8217;s resources per process.  My local UNIX system administrator finally got around to telling me about sar and ps and how to use them to figure out that I was being a moron.</p>
<p>Since we still didn&#8217;t have FastCGI, I wrote my own HTTP server.  From scratch.  In C.  First and last time I&#8217;ve ever done that.  Lesson learned.  There&#8217;s a reason I haven&#8217;t been involved in the Apache HTTPd project.  My scars still haven&#8217;t healed.</p>
<p>Initially, my HTTP server was using pthreads, spawning a pthread per browser request.  Which meant that my server used 1 thread per hanging GET.  At first I thought this would be OK; threads were lightweight compared to a process, right?  Until my local UNIX administrator brought out the clue stick and made me think about it for a few minutes.  In the process-per-connection implementation we had very little allocated memory, just a kilobyte or so of global state, and the entire program executable was mapped as shared memory between all of the instances.  The real memory cost was in the program&#8217;s stack, which the OS had a minimum size on.  With pthreads we didn&#8217;t gain much resource savings over the process-per-connection approach because we still had the same per-connection state, and the same minimum thread stack size.</p>
<p>I started digging around the Irix manual pages, and one day found this nifty thing called <a href="http://linux.die.net/man/2/select">select()</a>.  I wrote a quick simple server using it, and realized non-blocking asynchronous IO was the best thing since sliced bread.  That night I completely tore apart the pthread based HTTP server and rewrote it as a non-blocking, asynchronous IO server.</p>
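<p>The core of a select() loop is small.  The sketch below uses Python&#8217;s <code>select</code> module, which wraps the same system call, rather than the C I used; <code>relay_once</code> is a made-up name, and the whole thing is a toy chat relay under those assumptions, not a reconstruction of my server.  One thread watches every connection and only touches the ones that have data waiting.</p>

```python
import select
import socket


def relay_once(connections, timeout=1.0):
    """One pass of a select() loop: read ready sockets, fan out to the others."""
    readable, _, _ = select.select(connections, [], [], timeout)
    for sock in readable:
        data = sock.recv(4096)
        if not data:  # an empty read means the peer closed its end
            connections.remove(sock)
            sock.close()
            continue
        for other in connections:
            if other is not sock:
                # Fine for tiny messages; a real server would queue unsent
                # bytes and wait for the socket to become writable.
                other.sendall(data)
    return len(readable)
```

<p>Calling <code>relay_once</code> in a loop serves any number of idle connections with a single stack, which is exactly the saving that neither a process nor a thread per hanging GET could give me.</p>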
<p>By now I had completely abandoned the idea of the message distribution going through a local log file, and instead stored it in shared memory protected by pthread mutexes and condition variables.  Threads posting messages into a chat room appended their message object onto a distribution queue and signaled one of the IO threads to start streaming the messages out using asynchronous IO to each connected browser.</p>
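<p>The mutex and condition variable pattern looks roughly like this.  It is a Python sketch using <code>threading.Condition</code> in place of raw pthread calls, and the <code>ChatRoom</code> class and its method names are invented for the example.</p>

```python
import threading
from collections import deque


class ChatRoom:
    """Shared in-memory message queue guarded by one condition variable."""

    def __init__(self):
        self._cond = threading.Condition()  # a mutex plus a wait/notify queue
        self._queue = deque()
        self._closed = False

    def post(self, message):
        """Called by a posting thread: append, then wake one IO thread."""
        with self._cond:
            self._queue.append(message)
            self._cond.notify()

    def close(self):
        """Wake every waiting IO thread so it can drain the queue and exit."""
        with self._cond:
            self._closed = True
            self._cond.notify_all()

    def take(self):
        """Called by an IO thread: block until a message or shutdown arrives."""
        with self._cond:
            while not self._queue and not self._closed:
                self._cond.wait()
            return self._queue.popleft() if self._queue else None
```

<p>The <code>while</code> loop around <code>wait()</code> matters: a woken thread rechecks the queue under the lock, so spurious wakeups and races with other IO threads are harmless.</p>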
<p>Along the way I also had to write an HTML parser, because we wanted to allow messages like &#8220;Here is a &lt;font color=red&gt;red rose&lt;/font&gt;&#8221; to render as intended by the author, but we didn&#8217;t want to have unclosed tags like &#8220;&lt;h1&gt;Hi!&#8221; ruin the formatting for every subsequent message on the page.  End users also figured out cross site scripting before we did, with plenty of &#8220;&lt;script&gt;window.close()&lt;/script&gt;&#8221; nonsense being pasted into rooms and closing hundreds of browsers at once.  Yes, we learned about cross site scripting attacks the hard way.  Back in 1996/97, nobody really thought about this sort of stuff.</p>
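<p>A sketch of both jobs, tag balancing and script stripping, is below.  It leans on Python&#8217;s standard <code>html.parser.HTMLParser</code> instead of a hand-written C parser, and the whitelist in <code>ALLOWED</code> is a guess at the sort of tags we permitted, not the real list.  Treat it as an illustration of the idea rather than a complete defense against cross site scripting.</p>

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist: tag name -> attribute names that may be kept.
ALLOWED = {"b": set(), "i": set(), "h1": set(), "font": {"color"}}


class MessageCleaner(HTMLParser):
    """Keep whitelisted tags, drop everything else, close what was left open."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.open_tags = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED:
            return  # e.g. <script>: the tag vanishes, its body becomes plain text
        kept = "".join(
            f' {name}="{escape(value)}"'
            for name, value in attrs
            if name in ALLOWED[tag] and value is not None
        )
        self.out.append(f"<{tag}{kept}>")
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in self.open_tags:
            # Close inner tags first so the output stays properly nested.
            while True:
                inner = self.open_tags.pop()
                self.out.append(f"</{inner}>")
                if inner == tag:
                    break

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def clean_message(text):
    """Return text with only whitelisted, properly closed tags remaining."""
    cleaner = MessageCleaner()
    cleaner.feed(text)
    cleaner.close()  # flush any buffered trailing text
    closers = [f"</{tag}>" for tag in reversed(cleaner.open_tags)]
    return "".join(cleaner.out + closers)
```

<p>With this, the unclosed &lt;h1&gt; example gets its closing tag appended, and the &lt;script&gt; wrapper is dropped so its body shows up as inert text instead of running in every browser in the room.</p>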
<p>We relaunched in the winter of 1997.  The Wayback Machine has a snapshot of the rewritten <a href="http://web.archive.org/web/19980111133651/http://chat.injersey.com/">homepage dated April 13, 1997</a>.</p>
<p>Of all of that though, the thing I&#8217;m still most excited about is the &#8220;non-Shockwave&#8221; version of the ChatterBox website.  Back then we didn&#8217;t have AJAX and XMLHttpRequest.  Heck, we didn&#8217;t even have iframe.  We had straight up boring &lt;frameset&gt;.  I rolled my own AJAX system from scratch using JavaScript, a hand rolled message queue, and a 1 pixel high frame in a frameset that performed POSTs driven by JavaScript.  In very early 1997.  Apparently I discovered parts of the web that we didn&#8217;t see again until XMLHttpRequest was widely supported.</p>
<p>Unfortunately management pulled the plug a year or so later, and that neat little HTTP server with its pre-AJAX AJAX and pre-Google Talk hanging GETs disappeared into the sands of time.</p>
<p>So, tonight, while reading Brendan Eich&#8217;s interview talking about building JavaScript in two weeks, I remembered just how much I learned that year in my part time job.  I went from being a Perl script monkey who couldn&#8217;t even compile a C program, to having written an asynchronous IO HTTP server from the ground up, mastered multi-threading, mutexes, condition variables, AJAX, cross site scripting, and a balanced B*-tree implementation that bears a lot of resemblance to BigTable.  All without taking any CS classes in high school.</p>
<p>Maybe that&#8217;s why college, and my later jobs, were all so boring for me.  I&#8217;d done all of it already.  Maybe that&#8217;s why I was less amazed than my peers when Google Maps launched.  I didn&#8217;t make something nearly as awesome, or as empowering to so many people around the world as Lars and the Maps team did.  But I had certainly seen, worked with, and discovered much of today&#8217;s web.  In 1997.</p>
<p>What is <a href="http://www.whatwg.org/">HTML 5</a> going to bring us that we haven&#8217;t discovered yet?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.spearce.org/2009/12/my-ancient-history-and-the-art-of-software.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Introduction to Gerrit Code Review</title>
		<link>http://www.spearce.org/2009/12/introduction-to-gerrit-code-review.html</link>
		<comments>http://www.spearce.org/2009/12/introduction-to-gerrit-code-review.html#comments</comments>
		<pubDate>Tue, 08 Dec 2009 08:19:02 +0000</pubDate>
		<dc:creator>spearce</dc:creator>
				<category><![CDATA[Random Musings]]></category>
		<category><![CDATA[gerrit]]></category>

		<guid isPermaLink="false">http://www.spearce.org/?p=94</guid>
		<description><![CDATA[Yesterday R. Tyler Ballance (aka rtyler on #git) started poking me about Gerrit Code Review.  One day later, he&#8217;s writing an amazing blog post describing how to install the latest development version, and why it&#8217;s so awesome to use for team development.  He even has screenshots!  :-)
Thanks rtyler.
]]></description>
			<content:encoded><![CDATA[<p>Yesterday R. Tyler Ballance (aka rtyler on <a href="irc://irc.freenode.net/%26git">#git</a>) started poking me about <a href="http://code.google.com/p/gerrit/">Gerrit Code Review</a>.  One day later, he&#8217;s writing an <a href="http://unethicalblogger.com/posts/2009/12/code_review_gerrit_mostly_visual_guide">amazing blog post</a> describing how to install the latest development version, and why it&#8217;s so awesome to use for team development.  He even has screenshots!  :-)</p>
<p>Thanks rtyler.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.spearce.org/2009/12/introduction-to-gerrit-code-review.html/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>EGit in 2010?</title>
		<link>http://www.spearce.org/2009/12/egit-in-2010.html</link>
		<comments>http://www.spearce.org/2009/12/egit-in-2010.html#comments</comments>
		<pubDate>Mon, 07 Dec 2009 21:37:42 +0000</pubDate>
		<dc:creator>spearce</dc:creator>
				<category><![CDATA[Random Musings]]></category>

		<guid isPermaLink="false">http://www.spearce.org/?p=92</guid>
		<description><![CDATA[Mike just described some changes coming to Eclipse in 2010, including Git for projects.  Yay!
However, he&#8217;s right.  EGit has to get better, fast.  We need more contributors who know and love the SWT/JFace/Resource APIs and can crank out the UI improvements necessary to bring the Git team provider up to the same level as the [...]]]></description>
			<content:encoded><![CDATA[<p>Mike just described some changes coming to Eclipse in 2010, <a href="http://dev.eclipse.org/blogs/mike/2009/12/07/project-community-enhancements-for-2010/">including Git for projects</a>.  Yay!</p>
<p>However, he&#8217;s right.  EGit has to get better, fast.  We need more contributors who know and love the SWT/JFace/Resource APIs and can crank out the UI improvements necessary to bring the Git team provider up to the same level as the CVS team provider.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.spearce.org/2009/12/egit-in-2010.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Moving my git repositories</title>
		<link>http://www.spearce.org/2009/12/moving-my-git-repositories.html</link>
		<comments>http://www.spearce.org/2009/12/moving-my-git-repositories.html#comments</comments>
		<pubDate>Sat, 05 Dec 2009 22:01:27 +0000</pubDate>
		<dc:creator>spearce</dc:creator>
				<category><![CDATA[Random Musings]]></category>
		<category><![CDATA[SCM]]></category>

		<guid isPermaLink="false">http://www.spearce.org/?p=88</guid>
		<description><![CDATA[I finally got around to creating git.spearce.org.  I&#8217;ve been contributing to Git since Feb 17 2006 and yet I couldn&#8217;t be bothered to set up my own Git host for my repositories.  For the past 3 years I&#8217;ve primarily leaned on Pasky&#8217;s excellent repo.or.cz service.  But I&#8217;ve always wanted a more permanent home for my projects.
Last week I [...]]]></description>
			<content:encoded><![CDATA[<p>I finally got around to creating <a href="http://git.spearce.org/">git.spearce.org</a>.  I&#8217;ve been contributing to Git since <a href="http://git.kernel.org/?p=git/git.git;a=commit;h=772d8a3b63ff669c285edb8aff0c63b609614933">Feb 17 2006</a> and yet I couldn&#8217;t be bothered to set up my own Git host for my repositories.  For the past 3 years I&#8217;ve primarily leaned on Pasky&#8217;s excellent <a href="http://repo.or.cz/">repo.or.cz</a> service.  But I&#8217;ve always wanted a more permanent home for my projects.</p>
<p>Last week I threw <a href="http://spamassassin.apache.org/">SpamAssassin</a> off my domain&#8217;s server, which meant I finally had virtual memory free to run <a href="http://www.kernel.org/pub/software/scm/git/docs/git-daemon.html"><code>git daemon</code></a>.  So you can now find most of my projects at <a href="http://git.spearce.org/">git.spearce.org</a>, with proper git:// URLs available for efficient cloning.  Maybe sometime later this month I&#8217;ll get smart HTTP enabled as well.</p>
<p>I&#8217;ll continue to update the repositories on repo.or.cz, but I&#8217;ll primarily be using the ones hosted on git.spearce.org.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.spearce.org/2009/12/moving-my-git-repositories.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>EGit at Eclipse</title>
		<link>http://www.spearce.org/2009/12/egit-at-eclipse.html</link>
		<comments>http://www.spearce.org/2009/12/egit-at-eclipse.html#comments</comments>
		<pubDate>Thu, 03 Dec 2009 22:36:34 +0000</pubDate>
		<dc:creator>spearce</dc:creator>
				<category><![CDATA[egit]]></category>
		<category><![CDATA[jgit]]></category>

		<guid isPermaLink="false">http://www.spearce.org/?p=81</guid>
		<description><![CDATA[A few months ago we moved EGit, the Git team provider for Eclipse, over to the Eclipse Foundation.  Along the way we decided to try out some new development techniques, like taking advantage of my day-job project Gerrit Code Review to help us discuss pending changes.  This led us down the road of not paying [...]]]></description>
			<content:encoded><![CDATA[<p>A few months ago we moved <a href="http://www.eclipse.org/egit/">EGit</a>, the Git team provider for Eclipse, over to the <a href="http://www.eclipse.org/org/">Eclipse Foundation</a>.  Along the way we decided to try out some new development techniques, like taking advantage of my day-job project <a href="http://code.google.com/p/gerrit/">Gerrit Code Review</a> to help us discuss pending changes.  This led us down the road of not paying too much attention to the <a href="http://www.eclipse.org/projects/dev_process/ip-process-in-cartoons.php">Eclipse IP process</a>, and failing to tag all contributed patches with the +iplog flag in Bugzilla.</p>
<p>Fortunately Wayne Beaton helped us get Gerrit configured in a way that meets the foundation&#8217;s IP process guidelines, and has <a href="http://dev.eclipse.org/blogs/wayne/2009/12/01/git-at-eclipse/">encouraged us to continue forward</a>.  This is great, because it means we can rely on Git for attribution tracking, rather than Bugzilla.</p>
<p>Also, since the project moved homes we picked up 3 prolific contributors, and all of them have turned into committers on the project.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.spearce.org/2009/12/egit-at-eclipse.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
