Files changed (13)
-2008 is a leap year. That means that three hundred and sixty-six days ago, almost to the minute, I was sitting alone in a booth at Zeke's Sports Bar and Grill on 3rd Street in San Francisco. I wouldn't normally hang out at a sports bar, let alone a sports bar in SOMA, but back then Thursday was "I Can Has Ruby" night. I guess back then "I can has _______" was also a reasonable moniker to attach to pretty much anything. ICHR was a semi-private meeting of like-minded Ruby hackers that generally and willingly devolved into late-night drinking sessions. Normally these nights would fade away like my hangover the next morning, but this night was different. This was the night that "GitHub":http://github.com/ was born.
-I think I was sitting at the booth alone because I'd just ordered a fresh Fat Tire and needed a short break from the socializing that was happening over at the long tables in the dimly lit aft portion of the bar. On the fifth or sixth sip, Chris Wanstrath walked in. I have trouble remembering now if I'd even classify Chris and me as "friends" at the time. We knew each other through Ruby meetups and conferences, but only casually. Like a mutual "hey, I think your code is awesome" kind of thing. I'm not sure what made me do it, but I gestured him over to the booth and said "dude, check this out." About a week earlier I'd started work on a project called "Grit":http://github.com/mojombo/grit that allowed me to access Git repositories in an object-oriented manner via Ruby code. Chris was one of only a handful of Rubyists at the time who were starting to become serious about Git. He sat down and I started showing him what I had. It wasn't much, but it was enough to see that it had sparked something in Chris. Sensing this, I launched into my half-baked idea for some sort of website that acted as a hub for coders to share their Git repositories. I even had a name: GitHub. I may be paraphrasing, but his response was along the lines of a very emphatic "I'm in. Let's do it!"
-The next night, Friday, October 19, 2007 at 10:24pm Chris made the first commit to the GitHub repository and sealed in digital stone the beginning of our joint venture. There were, so far, no agreements of any kind regarding how things would proceed. Just two guys that decided to hack together on something that sounded cool.
-Remember those amazing few minutes in Karate Kid where Daniel is training to become a martial arts expert? Remember the music? Well, you should probably go buy and listen to "You're The Best":http://phobos.apple.com/WebObjects/MZStore.woa/wa/viewAlbum?i=260417864&id=260417040&s=143441 by Joe Esposito in iTunes because I'm about to hit you with a montage.
-For the next three months Chris and I spent ridiculous hours planning and coding GitHub. I kept going with Grit and designed the UI. Chris built out the Rails app. We met in person every Saturday to make design decisions and try to figure out what the hell our pricing plan would look like. I remember one very rainy day we talked for a good two hours about various pricing strategies over some of the best Vietnamese egg rolls in the city. All of this we did while holding other engagements. I, for one, was employed full time at Powerset as a tools developer for the Ranking and Relevance team.
-In mid January, after three months of nights and weekends, we launched into private beta mode, sending invites to our friends. In mid February PJ Hyett joined in and made us three-strong. We publicly launched the site on April 10th. TechCrunch was not invited. At this point it was still just three 20-somethings without a single penny of outside investment.
-I was still working full time at Powerset on July 1, 2008 when we learned that Powerset had just been acquired by Microsoft for around $100 million. This was interesting timing. With the acquisition, I was going to be faced with a choice sooner than I had anticipated. I could either sign on as a Microsoft employee or quit and go GitHub full time. At 29 years old, I was the oldest of the three GitHubbers, and had accumulated a proportionally larger amount of debt and monthly expenditure. I was used to my six digit lifestyle. Further confounding the issue was the imminent return of my wife, Theresa, from her PhD fieldwork in Costa Rica. I would soon be transitioning from make-believe bachelor back to married man.
-To muddy the waters of decision even more, the Microsoft employment offer was juicy. Salary + $300k over three years juicy. That's enough money to make anybody think twice about anything. So I was faced with this: a safe job with lots of guaranteed money as a Microsoft man –or– a risky job with unknown amounts of money as an entrepreneur. I knew things with the other GitHub guys would become extremely strained if I stayed on at Powerset much longer. Having saved up some money and become freelancers some time ago, they had both started dedicating full time effort to GitHub. It was do or die time. Either pick GitHub and go for it, or make the safe choice and quit GitHub to make wheelbarrows full of cash at Microsoft.
-If you want a recipe for restless sleep, I can give you one. Add one part "what will my wife think" with 3,000 parts Benjamin Franklin; stir in a "beer anytime you damn well please" and top with a chance at financial independence.
-I've become pretty good at giving my employers the bad news that I'm leaving the company to go do something cooler. I broke the news to my boss at Powerset on the day the employment offer was due. I told him I was quitting to go work full time on GitHub. Like any great boss, he was bummed, but understanding. He didn't try to tempt me with a bigger bonus or anything. I think deep down he knew I was going to leave. I may have even received a larger incentive to stay than others, on account of my being a flight risk. Those Microsoft managers are crafty, I tell you. They've got retention bonuses down to a science. Well, except when you throw an entrepreneur, the singularity of the business world, into the mix. Everything goes wacky when you've got one of those around.
-In the end, just as Indiana Jones could never turn down the opportunity to search for the Holy Grail, I could no more turn down the chance to work for myself on something I truly love, no matter how safe the alternative might be. When I'm old and dying, I plan to look back on my life and say "wow, that was an adventure," not "wow, I sure felt safe."
-For an entrepreneur, the line between horrible mistake and runaway success can be so thin that even Kate Moss would be envious. I lived with "Gravatar":http://gravatar.com/ for nearly four years before that line even became thick enough to measure.
-As it's become one of my favorite parables, I'll save the details of how I came up with the idea for Gravatar for a future post. What's important to know is that the idea was spawned not from a business perspective, but from a desperate desire to create something new in the world of blogging.
-Spin the clock back four years and you'll find me sitting at my Windows desktop machine in my underwear with a box of Life cereal to my left and a day old Coke to my right. Since I'd been laid off from my job as a Java developer some months earlier, I'd decided to take the entrepreneurial plunge doing what I knew best: web design. Working in a cone of isolation, I'd become accustomed to waking up late, swinging my legs over the right side of the bed, and in one fluid movement sliding over to the ratty chair I stole from my old college dorm room. I'd spend most of the day working on client projects in ColdFusion or PHP. It was hard work and could become a bit tiresome. I needed an outlet, something that didn't have a suit on the other end of a telephone telling me how blue was the wrong color and things would be so much better if only the photo had a slightly bigger border. Gravatar would become that outlet for me.
-I was really big into web standards at the time, having recently read Zeldman's seminal work and become a true believer. Eric Meyer, Dan Cederholm, and Jon Hicks became like gods to me. I worked very hard at making relevant and witty comments around the right kinds of blogs. Being a part of that movement became a significant goal for me. My Movable Type weblog rarely went more than two days without a post on design or standards.
-Two weeks after I had the idea for Gravatar the first version was written and deployed. Every request hit the database and dynamically generated a properly sized gravatar via PHP's gd2 api. Premature optimization and all that, right? The first thing I did after getting the system to a workable state was email all the bloggers I looked up to (and that had no idea who I was). Blog comments at the time were a pretty dreary affair and I guess Cederholm was intrigued enough by my idea that he linked to it in a sidebar micro-post on simplebits.com.
-That single mention kicked off a slow but steady trickle of interest in the system. A few blogs here and there installed the plugin and the world started seeing avatars that mysteriously followed them around. At the same time, just as people must have thought Cheez Whiz was a stupid idea when it first came out, some bloggers started railing against Gravatar, calling it frivolous, inefficient, and "an abomination." This was my first nibble at the smorgasbord of what was to become the "horrible mistake" aspect of Gravatar.
-Due to the inherently self-advertising nature of gravatars (the "what the hell is that and how do I get one?" brand of advertising), Gravatar adoption increased at a rapid rate. Having crafted the idea for Gravatar without any semblance of business model or growth projection or build-out strategy, things took a rather dramatic dive away from "runaway success" as my server (yes, singular) buckled under the pressure of tens of requests per second! As it turns out, regenerating a gravatar on every request is not very CPU efficient. Gravatars worldwide suddenly turned into little red Xs. Then, in what has become known as the Twitter Effect, a barrage of emails hit me complaining about how the free service on which they had come to depend was down, and how this would adversely impact my well-being.
-I fixed the code. Gravatar came back online with caching. All the while I'd had the bright idea that gravatars would be rated for content, MPAA-style. Because users clearly were not fit to rate their own images, I was manually rating 400 or more avatars each day. If I missed a day, I'd have damn near a thousand waiting for me the next day. In addition to the angry mob, I was very fortunate to have an amazingly supportive group of users that volunteered to help me rate images. I owe them my sanity, and it freed up enough time for me to work on the next iteration of the site.
-G2, as I called it, would be written in Rails and use lighttpd plus a convoluted directory structure of symlinks to enable me to pre-render every gravatar (1x1 up to 80x80) and serve only static images. I did this to avoid having to rent or buy the kind of hardware necessary to hook up a properly scaled system. Up until the end I ran Gravatar on a maximum of two rented commodity servers that set me back a mere $300/month, a pittance for the kind of traffic I was serving. I say it was a pittance, but that's not really true. Donations didn't even come close to covering that cost.
-At some point early in the development of G2, Toni Schneider, the CEO of Automattic (the company behind WordPress.com and Akismet), contacted me after hearing my "interview about the future of Gravatar":http://blog.gravatar.com/2007/10/18/automattic-gravatar/ on the WordPress podcast. This was exciting news! How perfect a fit would it be for Gravatar to be bought by Automattic? I was already planning a trip up to San Francisco to meet with the Powerset guys, so the timing worked out perfectly for me to meet in person with Toni. I ended up having lunch with Toni and Matt Mullenweg at 21st Amendment on 2nd Street. It was a bit intimidating to come to the mecca of tech startups to meet with such huge players in the blogging community. Turned out that both Matt and Toni are great guys and so we had drinks for about two hours, talking about my ideas for Gravatar and how we might be able to work together. Everything seemed great (I was jazzed and they seemed excited), but a few weeks later Toni let me know that the timing was wrong and they couldn't make a play at that time. He suggested I proceed with G2 and they'd proceed with their own avatar system. I was pretty bummed about the outcome, but I took their advice and kept going.
-A few weeks before G2 was finished, the site imploded in a big way. One machine. Hundreds of requests per second. That poor CPU must have thought it was the End Times. Instead of wasting time on getting the existing system back up, I put on my headphones, turned it up to eleven, and got back to work on G2.
-I'm not sure if you've ever had to work on a project while your users are publicly skewering you on your blog for allowing the service to go down, but it's as close to depression as I've ever come. If I'd had less pride, I would have popped everyone a huge middle finger and let the service die, but instead I waded through the hundreds of comments and deleted the ones with threats, hatred, or my favorite, the words "fuck you" repeated 600 times. It wasn't fair, I told myself, that I should be sitting here with high blood pressure trying to raise Gravatar from the dead while the unappreciative masses do what they do best on the internet. The only thing that kept me going was never being able to tell which side of the mistake/success boundary I was sitting on. It was hard to think of the situation as anything but a huge failure, but the shitstorm the downtime was causing indicated that people found the service valuable. I want to say that Twitter went through the same thing, but they suffered their downtime with millions of dollars in the bank. The only thing I had was a full time job unrelated to Gravatar and a credit card that reminded me every month of my bad judgment.
-Finally G2 was done, deployed, and fully operational. I have no idea how many users I lost due to the several weeks of downtime, but I don't think it was very many. There seems to be a corollary to the Twitter Effect that I'd call the Forgiveness Effect. It dictates that if a user enjoys a free service and that service is currently up, all past atrocities will be easily and quickly forgiven. With the site running again, things looked to be shifting back towards a success.
-Things were not all rainbows and unicorns though. My Rube Goldbergian architecture had a few quirks that needed to be dealt with. The site still had some elusive bugs from the overly-rapid development cycle. And just like new lanes on freeways always fill up immediately, the two new servers I was running started causing expensive bandwidth overages. I had taken a job at Powerset at this point and the combined pressures of these two commitments started to weigh me down. Once again I started feeling like all the effort I put into Gravatar was for nothing. Like I would never benefit from any of it.
-In a last-ditch effort to save Gravatar from final doom, I emailed Toni and pitched him again on Gravatar. I figured it was a long shot, but what the hell, couldn't hurt. Things must have changed in the prior 6 months because Toni was very receptive to the idea. We met again, at 21st Amendment, and hashed out a tentative deal over drinks. I'd never sold anything like this before, so my technique was probably very amateurish. I'm almost certain I could have gotten a better deal out of it, but I had the smell of desperation about me and I really did want to see Gravatar end up in Automattic's hands.
-Four days later, Automattic made their official offer. On September 21st, 2007 we inked the deal and Gravatar became both the first company that I ever sold and the first company that Automattic ever acquired.
-I am quite satisfied with the sale to Automattic. Some will say that I should have pursued VC funding. Indeed, I was contacted by several firms but never travelled very far down that road. I always felt like Gravatar was a feature, and I wasn't comfortable building a company on such a tiny foundation. Reinforcing this decision, no viable business model ever coalesced during the time I was building the site. It was also made clear by Toni that Automattic would maintain Gravatar as a separate brand and continue its evolution (instead of just absorbing it into WordPress). This appealed to my ego. Most companies kill or maim everything they acquire, but here was a chance for Gravatar to carry forward with all of Automattic's resources behind it (instead of two measly servers). Part of me just wanted to see what Gravatar could become with time, money, and man-power moving it forward.
-Over the last few months I've seen a number of people looking for cofounders on Hacker News or via their own personal blogs. I think this is, at best, a highly inefficient way to find a cofounder and, at worst, a way to fool yourself into finding the *wrong* cofounder. In any case, it's a naive approach to finding the person that will need to stand by your side in the coming storm that we call "running a startup."
-Don't get me wrong, the internet is an amazing tool for meeting people. The wider the net you cast, the more likely you are to find the perfect match. But the internet has its limitations. I've had internet friends that were engaging, witty, and brilliant online, but in person felt awkward and boring. Conversely, I know people that are volatile and inflammatory online, but present an attitude of friendliness and caring in person. This phenomenon makes it difficult to gauge an individual's personality from online interaction alone.
-A far better use of the internet is to find groups of people that share your interests. Track down the local users group for your language or technology of choice. The simple fact that members of these groups take time out of their day to show up means that they're more motivated and driven than the average person. Even if it's a bit of a commute to get to the meetings, start showing up regularly. Prepare a few presentations on topics that you're passionate about. Bonus points if you present on ideas related to your potential startup. Don't worry about revealing your game-changing secrets; stealth mode is bullshit. Talk to everyone. Steer the conversation toward your interests and if someone there is excited about the same things, it will be clear.
-It may take weeks or months, but in a good group you'll find a handful of people that you really like. If at all possible, go out drinking with these people after the meetups. This is one of the easiest ways to go from "acquaintance" to "friend" and gives you free license to bring up your craziest of ideas without sounding like too much of a nutjob.
-Of the people that you like, several may make excellent candidates for cofounders. Do a little research on these individuals. What does their code look like? Have they done much open source? Do they demonstrate an entrepreneurial spirit? Can they stick with a single project for a long time? Have they been loyal to their friends and companies in the past? A good cofounder should be someone with whom you feel privileged to work. And they should feel privileged to work with you. The two of you should be on very solid ground before you begin your startup adventure, because once you do, the impact of every argument is going to feel like it's been multiplied by a thousand.
-This all sounds like a lot of hard work. Maybe you're wondering if it would be better to just go solo. I did that with Gravatar, and, in retrospect, it's painfully obvious that I made a lot of stupid mistakes. When it's just you and your thoughts it becomes too easy to pick the first thing that pops into your head. We're programmed to think all of our ideas are good, but reality tells a different story. Truly good decisions are forged from the furnace of argument, not plucked like daisies from the pasture of a peaceful mind. A good cofounder tells you when your ideas are half-baked and ensures that your good ideas actually get implemented.
-The second biggest danger with going solo is the loss of motivation. Solipsism might make you feel important at first, but the constant lack of feedback and the absence of support during tough times can easily lead to a premature end to your adventure. Cofounders are like workout buddies. Just when you think there's no possible way you can do another rep, there they are, rooting you on toward an achievement that wouldn't be possible without them.
-Your choice of cofounder will affect everything you do in your startup. They'll share every defeat with you and celebrate every success. They'll help you understand your own ideas better by offering a different perspective. Picking them will be the single most important decision you make during the tenure of your startup, so choose wisely and with extreme care.
-Back in 2000, when I thought I was going to be a professional writer, I spent hours a day on LiveJournal doing writing practice with other aspiring poets and authors. Since then I've blogged at three different domains about web standards, print design, photography, Flash, illustration, information architecture, ColdFusion, package management, PHP, CSS, advertising, Ruby, Rails, and Erlang.
-I love writing. I get a kick out of sharing my thoughts with others. The act of transforming ideas into words is an amazingly efficient way to solidify and refine your thoughts about a given topic. But as much as I enjoy blogging, I seem to be stuck in a cycle of quitting and starting over. Before starting the current iteration, I resolved to do some introspection to determine the factors that were leading to this destructive pattern.
-I already knew a lot about what I *didn't* want. I was tired of complicated blogging engines like WordPress and Mephisto. I wanted to write great posts, not style a zillion template pages, moderate comments all day long, and constantly lag behind the latest software release. Something like Posterous looked attractive, but I wanted to style my blog, and it needed to be hosted at the domain of my choosing. For the same reason, other hosted sites (wordpress.com, blogger.com) were disqualified. There are a few people directly using GitHub as a blog (which is very cool), but that's a bit too much of an impedance mismatch for my tastes.
-On Sunday, October 19th, I sat down in my San Francisco apartment with a glass of apple cider and a clear mind. After a period of reflection, I had an idea. While I'm not specifically trained as an author of prose, I *am* trained as an author of code. What would happen if I approached blogging from a software development perspective? What would that look like?
-First, all my writing would be stored in a Git repository. This would ensure that I could try out different ideas and explore a variety of posts all from the comfort of my preferred editor and the command line. I'd be able to publish a post via a simple deploy script or post-commit hook. Complexity would be kept to an absolute minimum, so a static site would be preferable to a dynamic site that required ongoing maintenance. My blog would need to be easily customizable; coming from a graphic design background means I'll always be tweaking the site's appearance and layout.
-Over the last month I've brought these concepts to fruition and I'm pleased to announce "Jekyll":http://github.com/mojombo/jekyll. Jekyll is a simple, blog-aware, static site generator. It takes a template directory (representing the raw form of a website), runs it through Textile and Liquid converters, and spits out a complete, static website suitable for serving with Apache or your favorite web server. If you're reading this on the website (http://tom.preston-werner.com), you're seeing a Jekyll-generated blog!
-To understand how this all works, open up my "TPW":http://github.com/mojombo/tpw repo in a new browser window. I'll be referencing the code there.
-Take a look at "index.html":http://github.com/mojombo/tpw/tree/master/index.html. This file represents the homepage of the site. At the top of the file is a chunk of YAML that contains metadata about the file. This data tells Jekyll what layout to give the file, what the page's title should be, etc. In this case, I specify that the "default" template should be used. You can find the layout files in the "_layouts":http://github.com/mojombo/tpw/tree/master/_layouts directory. If you open "default.html":http://github.com/mojombo/tpw/tree/master/_layouts/default.html you can see that the homepage is constructed by wrapping index.html with this layout.
-You'll also notice Liquid templating code in these files. "Liquid":http://www.liquidmarkup.org/ is a simple, extensible templating language that makes it easy to embed data in your templates. For my homepage I wanted to have a list of all my blog posts. Jekyll hands me a Hash containing various data about my site. A reverse chronological list of all my blog posts can be found in <code>site.posts</code>. Each post, in turn, contains various fields such as <code>title</code> and <code>date</code>.
-Jekyll gets the list of blog posts by parsing the files in the "_posts":http://github.com/mojombo/tpw/tree/master/_posts directory. Each post's filename contains the publishing date and slug (what shows up in the URL) that the final HTML file should have. Open up the file corresponding to this blog post: "2008-11-17-blogging-like-a-hacker.textile":http://github.com/mojombo/tpw/tree/master/_posts/2008-11-17-blogging-like-a-hacker.textile. GitHub renders Textile files by default, so to better understand the file, click on the "raw":http://github.com/mojombo/tpw/tree/master/_posts/2008-11-17-blogging-like-a-hacker.textile?raw=true view to see the original file. Here I've specified the <code>post</code> layout. If you look at that file you'll see an example of a nested layout. Layouts can contain other layouts, allowing you a great deal of flexibility in how pages are assembled. In my case I use a nested layout in order to show related posts for each blog entry. The YAML also specifies the post's title, which is then embedded in the post's body via Liquid.
-Posts are handled in a special way by Jekyll. The date you specify in the filename is used to construct the URL in the generated site. This post, for instance, ends up at <code>http://tom.preston-werner.com/2008/11/17/blogging-like-a-hacker.html</code>.
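The filename-to-URL rule described above can be sketched in a few lines. This is only an illustration written in Python, not Jekyll's own code (Jekyll is written in Ruby); the helper name <code>post_url</code> and the regular expression are assumptions made for the example.

```python
import re

# Hypothetical pattern for "YYYY-MM-DD-slug.ext"; an assumption for this
# sketch, not taken from Jekyll's source.
POST_NAME = re.compile(r"^(\d{4})-(\d{2})-(\d{2})-(.+)\.[^.]+$")


def post_url(filename):
    """Map a post filename to the path it would get in the generated site."""
    match = POST_NAME.match(filename)
    if match is None:
        raise ValueError(f"not a post filename: {filename!r}")
    year, month, day, slug = match.groups()
    # The date parts become directories and the slug becomes the HTML file.
    return f"/{year}/{month}/{day}/{slug}.html"


print(post_url("2008-11-17-blogging-like-a-hacker.textile"))
# prints /2008/11/17/blogging-like-a-hacker.html
```

Keeping the date and slug in the filename means the URL can be worked out without opening the file at all.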
-Files that do not reside in directories prefixed with an underscore are mirrored into a corresponding directory structure in the generated site. If a file does not have a YAML preface, it is not run through the Liquid interpreter. Binary files are copied over unmodified.
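As a rough sketch of the mirroring rule, the check below decides whether a path would be copied into the generated site. It is written in Python for illustration and is not how Jekyll itself is implemented; the function name <code>is_mirrored</code> and the <code>images/logo.png</code> path are made up for the example.

```python
from pathlib import PurePosixPath


def is_mirrored(relative_path):
    """Return True if no directory in the path starts with an underscore."""
    # parts[:-1] drops the filename so only the directories are inspected.
    directories = PurePosixPath(relative_path).parts[:-1]
    return not any(name.startswith("_") for name in directories)


for path in ["index.html", "images/logo.png", "_layouts/default.html"]:
    print(path, is_mirrored(path))
```

Under this rule <code>index.html</code> and <code>images/logo.png</code> are mirrored, while anything under <code>_layouts</code> or <code>_posts</code> is left for special handling.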
-Jekyll is still a very young project. I've only developed the exact functionality that I've needed. As time goes on I'd like to see the project mature and support additional features. If you end up using Jekyll for your own blog, drop me a line and let me know what you'd like to see in future versions. Better yet, fork the project over at GitHub and hack in the features yourself!
-I've been living with Jekyll for just over a month now. I love it. Driving the development of Jekyll based on the needs of my blog has been very rewarding. I can edit my posts in TextMate, giving me automatic and competent spell checking. I have immediate and first class access to the CSS and page templates. Everything is backed up on GitHub. I feel a lightness now when I'm writing a post. The system is simple enough that I can keep the entire conversion process in my head. The distance from my brain to my blog has shrunk, and, in the end, I think that will make me a better author.
-Git is a simple but extremely powerful system. Most people try to teach Git by demonstrating a few dozen commands and then yelling "tadaaaaa." I believe this method is flawed. Such a treatment may leave you with the ability to use Git to perform simple tasks, but the Git commands will still feel like magical incantations. Doing anything out of the ordinary will be terrifying. Until you understand the concepts upon which Git is built, you'll feel like a stranger in a foreign land.
-The following parable will take you on a journey through the creation of a Git-like system from the ground up. Understanding the concepts presented here will be the most valuable thing you can do to prepare yourself to harness the full power of Git. The concepts themselves are quite simple, but allow for an amazing wealth of functionality to spring into existence. Read this parable all the way through and you should have very little trouble mastering the various Git commands and wielding the awesome power that Git makes available to you.
-Imagine that you have a computer that has nothing on it but a text editor and a few file system commands. Now imagine that you have decided to write a large software program on this system. Because you're a responsible software developer, you decide that you need to invent some sort of method for keeping track of versions of your software so that you can retrieve code that you previously changed or deleted. What follows is a story about how you might design one such version control system (VCS) and the reasoning behind those design choices.
-Alfred is a friend of yours that works down at the mall as a photographer in one of those "Special Moments" photo boutiques. All day long he takes photos of little kids posing awkwardly in front of jungle or ocean backdrops. During one of your frequent lunches at the pretzel stand, Alfred tells you a story about a woman named Hazel who brings her daughter in for a portrait every year on the same day. "She brings the photos from all the past years with her," Alfred tells you. "She likes to remember what her daughter was like at each different stage, as if the snapshots really let her move back and forth in time to those saved memories."
-Like some sort of formulaic plot device, Alfred's innocent statement acts as a catalyst for you to see the ideal solution to your version control dilemma. Snapshots, like save points in a video game, are really what you care about when you need to interact with a VCS. What if you could take snapshots of your codebase at any time and resurrect that code on demand? Alfred reads the dawning realization spreading across your face and knows you're about to leave him without another word to go back and implement whatever genius idea he just caused you to have. You do not disappoint him.
-You start your project in a directory named <code>working</code>. As you code, you try to write one feature at a time. When you complete a self-contained portion of a feature, you make sure that all your files are saved and then make a copy of the entire working directory, giving it the name <code>snapshot-0</code>. After you perform this copy operation, you make sure to never again change the code files in the new directory. After the next chunk of work, you perform another copy, only this time the new directory gets the name <code>snapshot-1</code>, and so on.
-To make it easy to remember what changes you made in each snapshot, you add a special file named <code>message</code> to each snapshot directory that contains a summary of the work that you did and the date of completion. By printing the contents of each message, it becomes easy to find a specific change that you made in the past, in case you need to resurrect some old code.
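The copy-and-message scheme so far could be sketched as follows. This is a minimal Python illustration of the parable's system, not of how Git stores anything; the function names, the layout of the <code>message</code> file (date on the first line, summary after it), and the demo file <code>app.txt</code> are all assumptions made for the example.

```python
import shutil
import tempfile
from datetime import date
from pathlib import Path


def take_snapshot(root, summary):
    """Copy root/working to the next root/snapshot-N and write its message."""
    root = Path(root)
    # The next number is one past the highest existing snapshot (0 if none).
    numbers = [int(p.name.split("-")[1]) for p in root.glob("snapshot-*")]
    target = root / f"snapshot-{max(numbers, default=-1) + 1}"
    shutil.copytree(root / "working", target)
    # Assumed message layout: completion date, then the summary.
    (target / "message").write_text(f"{date.today().isoformat()}\n{summary}\n")
    return target


def print_log(root):
    """Print every snapshot's message, oldest first."""
    snapshots = sorted(Path(root).glob("snapshot-*"),
                       key=lambda p: int(p.name.split("-")[1]))
    for snapshot in snapshots:
        text = (snapshot / "message").read_text().strip()
        print(snapshot.name, "|", text.replace("\n", " | "))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        working = Path(root) / "working"
        working.mkdir()
        (working / "app.txt").write_text("first feature\n")
        take_snapshot(root, "Add first feature")
        (working / "app.txt").write_text("first feature\nsecond feature\n")
        take_snapshot(root, "Add second feature")
        print_log(root)
```

Nothing here stops you from editing an old snapshot directory; as in the story, leaving snapshots untouched is a matter of discipline.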
-After a bit of time on the project, a candidate for release begins to emerge. Late nights at the keyboard finally yield <code>snapshot-99</code>, the nascent form of what will become Release Version 1.0. It comes to pass that this snapshot is packaged and distributed to the eagerly awaiting masses. Stoked by excellent response to your software, you push forward, determined to make the next version an even bigger success.
-Your VCS has so far been a faithful companion. Old versions of your code are there when you need them and can be accessed with ease. But not long after the release, bug reports start to come in. Nobody's perfect, you reassure yourself, and <code>snapshot-99</code> is readily retrievable, glad to be brought back to life for the purposes of applying bug fixes.
-Since the release, you've created 10 new snapshots. This new work must not be included in the 1.0.1 bug fix version you now need to create. To solve this, you copy <code>snapshot-99</code> to <code>working</code> so that your working directory is at exactly the point where Version 1.0 was released. A few swift lines of code and the bug is fixed in the working directory.
-It is here that a problem becomes apparent. The VCS deals very well with linear development, but for the first time ever, you need to create a new snapshot that is not a direct descendant of the preceding snapshot. If you create a <code>snapshot-110</code> (remember that you created 10 snapshots since the release), then you'll be interrupting the linear flow and will have no way of determining the ancestry of any given snapshot. Clearly, you need something more powerful than a linear system.
-Studies show that even short exposures to nature can help recharge the mind's creative potential. You've been sitting behind the artificially polarized light of your monitor for days on end. A walk through the woods in the brisk Autumn air will do you some good and with any luck, will help you arrive at an ideal solution to your problem.
-The great oaks that line the trail have always appealed to you. They seem to stand stark and proud against the perfectly blue sky. Half the ruddy leaves have departed, leaving an intricate pattern of branches in their wake. Fixating on one of the thousands of branch tips, you idly try to follow it back to the solitary trunk. This organically produced structure allows for such great complexity, but the rules for finding your way back to the trunk are so simple, and perfect for keeping track of multiple lines of development! It turns out that what they say about nature and creativity is true.
-By looking at your code history as a tree, solving the problem of ancestry becomes trivial. All you need to do is include the name of the parent snapshot in the <code>message</code> file you write for each snapshot. Adding just a single upstream pointer will enable you to easily and accurately trace the history of any given snapshot all the way back to the root.
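To see how little machinery the upstream pointer needs, here is a small Python sketch. The `messages` dictionary stands in for the <code>message</code> files on disk, and its keys, summaries, and the function name `ancestry` are made up for illustration.

```python
# Each entry stands in for one snapshot's message file (hypothetical data).
messages = {
    "snapshot-0": {"summary": "Initial work", "parent": None},
    "snapshot-1": {"summary": "More work", "parent": "snapshot-0"},
    "snapshot-2": {"summary": "Release 1.0", "parent": "snapshot-1"},
    "snapshot-3": {"summary": "New feature", "parent": "snapshot-2"},
    "snapshot-4": {"summary": "Bug fix for 1.0.1", "parent": "snapshot-2"},
}


def ancestry(name, messages):
    """Follow parent pointers from a snapshot back to the root."""
    chain = []
    while name is not None:
        chain.append(name)
        name = messages[name]["parent"]
    return chain


print(ancestry("snapshot-4", messages))
```

Note that `snapshot-3` and `snapshot-4` share a parent. The numbering no longer tells you anything about ancestry, but the pointers do.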
-Your code history is now a tree. Instead of having a single latest snapshot, you have two: one for each branch. With a linear system, your sequential numbering system let you easily identify the latest snapshot. Now, that ability is lost.
-Creating new development branches has become so simple that you'll want to take advantage of it all the time. You'll be creating branches for fixes to old releases, for experiments that may not pan out; indeed it becomes possible to create a new branch for every feature you begin!
-But like everything good in life, there is a price to be paid. Each time you create a new snapshot, you must remember that the new snapshot becomes the latest on its branch. Without this information, switching to a new branch would become a laborious process indeed.
-Every time you create a new branch you probably give it a name in your head. "This will be the Version 1.0 Maintenance Branch," you might say. Perhaps you refer to the former linear branch as the "master" branch.
-Think about this a little further, though. From the perspective of a tree, what does it mean to name a branch? Naming every snapshot that appears in the history of a branch would do the trick, but requires the storage of a potentially large amount of data. Additionally, it still wouldn't help you efficiently locate the latest snapshot on a branch.
-The least amount of information necessary to identify a branch is the location of the latest snapshot on that branch. If you need to know the list of snapshots that are part of the branch you can easily trace the parentage.
-Storing the branch names is trivial. In a file named <code>branches</code>, stored outside of any specific snapshot, you simply list the name/snapshot pairs that represent the tips of branches. To switch to a named branch you need only look up the snapshot for the corresponding name from this file.
-Because you're only storing the latest snapshot on each branch, creating a new snapshot now contains an additional step. If the new snapshot is being created as part of a branch, the <code>branches</code> file must be updated so that the name of the branch becomes associated with the new snapshot. A small price to pay for the benefit.
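A rough Python sketch of the <code>branches</code> file might look like this. The one-pair-per-line format, the helper names, and the restriction that branch names contain no spaces are all assumptions made to keep the example short.

```python
import tempfile
from pathlib import Path


def read_branches(path: Path) -> dict:
    """Parse lines of the form '<name> <snapshot>' into a dict."""
    if not path.exists():
        return {}
    pairs = (line.split() for line in path.read_text().splitlines() if line.strip())
    return {name: snapshot for name, snapshot in pairs}


def point_branch(path: Path, name: str, snapshot: str) -> None:
    """Associate a branch name with a snapshot, replacing any older entry."""
    branches = read_branches(path)
    branches[name] = snapshot
    path.write_text("".join(f"{n} {s}\n" for n, s in sorted(branches.items())))


# Usage: the maintenance branch starts at the release and then moves forward.
branches_file = Path(tempfile.mkdtemp()) / "branches"
point_branch(branches_file, "master", "snapshot-109")
point_branch(branches_file, "maintenance", "snapshot-99")
point_branch(branches_file, "maintenance", "snapshot-110")
print(read_branches(branches_file))
```

Switching branches is then a dictionary lookup, and creating a snapshot on a branch ends with one call to `point_branch`.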
-After using branches for a while you notice that they can serve two purposes. First, they can act as movable pointers to snapshots so that you can keep track of the branch tips. Second, they can be pointed at a single snapshot and never move.
-The first use case allows you to keep track of ongoing development, things like "Release Maintenance". The second case is useful for labeling points of interest, like "Version 1.0" and "Version 1.0.1".
-Mixing both of these uses into a single file feels messy. Both types are pointers to snapshots, but one moves and one doesn't. For the sake of clarity and elegance, you decide to create another file called <code>tags</code> to contain pointers of the second type.
-Keeping these two inherently different pointers in separate files will prevent you from accidentally treating a branch as a tag or vice versa.
-Working on your own gets pretty lonely. Wouldn't it be nice if you could invite a friend to work on your project with you? Well, you're in luck. Your friend Zoe has a computer setup just like yours and wants to help with the project. Because you've created such a great version control system, you tell her all about it and send her a copy of all your snapshots, branches, and tags so she can enjoy the same benefits of the code history.
-It's great to have Zoe on the team but she has a habit of taking long trips to far away places without internet access. As soon as she has the source code, she catches a flight to Patagonia and you don't hear from her for a week. In the meantime you both code up a storm. When she finally gets back, you discover a critical flaw in your VCS. Because you've both been using the same numbering system, you each have directories named 'snapshot-114', 'snapshot-115', and so on, but with different contents!
-To make matters worse, you don't even know who authored the changes in those new snapshots. Together, you devise a plan for dealing with these problems. First, snapshot messages will henceforth contain author name and email. Second, snapshots will no longer be named with simple numbers. Instead, you'll use the contents of the message file to produce a hash. This hash will be guaranteed to be unique to the snapshot since no two messages will ever have the same date, message, parent, and author. To make sure everything goes smoothly, you both agree to use the SHA1 hash algorithm that takes the contents of a file and produces a 40 character hexadecimal string. You both update your histories with the new technique and instead of clashing 'snapshot-114' directories, you now have distinct directories named '8ba3441b6b89cad23387ee875f2ae55069291f4b' and 'db9ecb5b5a6294a8733503ab57577db96ff2249e'.
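The naming scheme is easy to try out with Python's standard `hashlib` module. In this sketch the message layout, the example dates, names, and email addresses are all invented; only the idea of hashing the message contents with SHA1 comes from the story.

```python
import hashlib


def build_message(summary, date, parent, author, email):
    """Assemble the message text; this field layout is an assumption."""
    return (
        f"parent {parent}\n"
        f"author {author} <{email}>\n"
        f"date {date}\n"
        f"\n{summary}\n"
    )


def snapshot_name(message):
    """Name a snapshot after the SHA1 of its message contents."""
    return hashlib.sha1(message.encode("utf-8")).hexdigest()


# Two people create a snapshot on top of the same parent at the same moment.
yours = build_message("Add magic numbers", "2009-05-19T10:00", "snapshot-113",
                      "You", "you@example.com")
zoes = build_message("Add prime numbers", "2009-05-19T10:00", "snapshot-113",
                     "Zoe", "zoe@example.com")
print(snapshot_name(yours))
print(snapshot_name(zoes))
```

Different messages give different 40 character names, so the two directories can sit side by side, while hashing the same message on either computer always produces the same name.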
-With the updated naming scheme, it becomes trivial for you to fetch all the new snapshots from Zoe's computer and place them next to your existing snapshots. Because every snapshot specifies its parent, and identical messages (and therefore identical snapshots) have identical names no matter where they are created, the history of the codebase can still be drawn as a tree. Only now, the tree is composed of snapshots authored by both Zoe and you.
-This point is important enough to warrant repeating. A snapshot is identified by a SHA1 that uniquely identifies it (and its parent). These snapshots can be created and moved around between computers without losing their identity or where they belong in the history tree of a project. What's more, snapshots can be shared or kept private as you see fit. If you have some experimental snapshots that you want to keep to yourself, you can do so quite easily. Just don't make them available to Zoe!
-Zoe's travel habits cause her to spend countless hours on airplanes and boats. Most of the places she visits have no readily available internet access. At the end of the day, she spends more time offline than online.
-It's no surprise, then, that Zoe raves about your VCS. All of the day-to-day operations that she needs to do can be done locally. The only time she needs a network connection is when she's ready to share her snapshots with you.
-Before Zoe left on her trip, you had asked her to start working off of the branch named 'math' and to implement a function that generated prime numbers. Meanwhile, you were also developing off of the 'math' branch, only you were writing a function to generate magic numbers. Now that Zoe has returned, you are faced with the task of merging these two separate branches of development into a single snapshot. Since you both worked on separate tasks, the merge is simple. While constructing the snapshot message for the merge, you realize that this snapshot is special. Instead of just a single parent, this merge snapshot has two parents! The first parent is your latest on the 'math' branch and the second parent is Zoe's latest on her 'math' branch. The merge snapshot doesn't contain any changes beyond those necessary to merge the two disparate parents into a single codebase.
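Once a snapshot can have two parents, walking the history means following every parent rather than just one. The Python sketch below shows one way to do that; the snapshot names are short made-up labels standing in for SHA1s, and `history` is a hypothetical helper.

```python
# Parent lists for a tiny history with one merge (labels stand in for SHA1s).
parents = {
    "base": [],
    "yours": ["base"],          # your magic number work on 'math'
    "zoes": ["base"],           # Zoe's prime number work on her 'math'
    "merge": ["yours", "zoes"], # first parent is yours, second is Zoe's
}


def history(start, parents):
    """Collect every snapshot reachable from start by following all parents."""
    seen = []
    pending = [start]
    while pending:
        snapshot = pending.pop()
        if snapshot in seen:
            continue  # 'base' is reachable through both parents
        seen.append(snapshot)
        pending.extend(parents[snapshot])
    return seen


print(sorted(history("merge", parents)))
```

The check for already visited snapshots matters here, because after a merge the same ancestor can be reached along more than one path.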
-Once you complete the merge, Zoe fetches all the snapshots that you have that she does not, which include your development on the 'math' branch and your merge snapshot. Once she does this, both of your histories match exactly!
-Like many software developers you have a compulsion to keep your code clean and very well organized. This carries over into a desire to keep your code history well groomed. Last night you came home after having a few too many pints of Guinness at the local brewpub and started coding, producing a handful of snapshots along the way. This morning, a review of the code you wrote last night makes you cringe a little bit. The code is good overall, but you made a lot of mistakes early on that you corrected in later snapshots.
-Let's say the branch on which you did your drunken development is called 'drunk' and you made three snapshots after you got home from the bar. If the name 'drunk' points at the latest snapshot on that branch, then you can use a useful notation to refer to the parent of that snapshot. The notation 'drunk^' means the parent of the snapshot pointed to by the branch name 'drunk'. Similarly 'drunk^^' means the grandparent of the 'drunk' snapshot. So the three snapshots in chronological order are 'drunk^^', 'drunk^', and 'drunk'.
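Resolving this caret notation takes only a few lines. In the Python sketch below, the `branches` and `parents` dictionaries and the short snapshot labels are invented for illustration, and only single-parent snapshots are handled.

```python
# The 'drunk' branch tip and the parent of each snapshot (hypothetical labels).
branches = {"drunk": "d3"}
parents = {"d3": "d2", "d2": "d1", "d1": "sober"}


def resolve(ref, branches, parents):
    """Turn a reference like 'drunk^^' into the snapshot it names."""
    name = ref.rstrip("^")
    steps = len(ref) - len(name)  # one step up per trailing caret
    snapshot = branches[name]
    for _ in range(steps):
        snapshot = parents[snapshot]
    return snapshot


for ref in ("drunk", "drunk^", "drunk^^", "drunk^^^"):
    print(ref, "->", resolve(ref, branches, parents))
```

Here `drunk^^^` lands on the snapshot you started from before last night's session, one step before the three snapshots in question.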
-You'd really like those three lousy snapshots to be two clean snapshots. One that changes an existing function, and one that adds a new file. To accomplish this revision of history you copy 'drunk' to 'working' and delete the file that is new in the series. Now 'working' represents the correct modifications to the existing function. You create a new snapshot from 'working' and write the message to be appropriate to the changes. For the parent you specify the SHA1 of the 'drunk^^^' snapshot, essentially creating a new branch off of the same snapshot as last night. Now you can copy 'drunk' to 'working' and roll a snapshot with the new file addition. As the parent you specify that snapshot you created just before this one.
-The history of the 'drunk' branch now represents a nicer version of what you did last night. The other snapshots that you've replaced are no longer needed so you can delete them or just leave them around for posterity. No branch names are currently pointing at them so it will be hard to find them later on, but if you don't delete them, they'll stick around.
-As much as you try to keep your new modifications related to a single feature or logical chunk, you sometimes get sidetracked and start hacking on something totally unrelated. Only half-way into this do you realize that your working directory now contains what should really be separated as two discrete snapshots.
-To help you with this annoying situation, the concept of a staging directory is useful. This area acts as an intermediate step between your working directory and a final snapshot. Each time you finish a snapshot, you also copy that to a <code>staging</code> directory. Now, every time you finish an edit to an existing file, create a new file, or remove a file, you can decide whether that change should be part of your next snapshot. If it belongs, you mimic the change inside <code>staging</code>. If it doesn't, you can leave it in <code>working</code> and make it part of a later snapshot. From now on, snapshots are created directly from the staging directory.
-This separation of coding and preparing the stage makes it easy to specify what is and is not included in the next snapshot. You no longer have to worry too much about making an accidental, unrelated change in your working directory.
-You have to be a bit careful, though. Consider a file named <code>README</code>. You make an edit to this file and then mimic that in <code>staging</code>. You go on about your business, editing other files. After a bit, you make another change to <code>README</code>. Now you have made two changes to that file, but only one is in the staging area! Were you to create a snapshot now, your second change would be absent.
-The lesson is this: every new edit must be added to the staging area if it is to be part of the next snapshot.
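The <code>README</code> pitfall is easy to reproduce. This Python sketch assumes a project directory holding <code>working</code> and <code>staging</code> side by side, and a hypothetical `stage` helper that copies one file across.

```python
import shutil
import tempfile
from pathlib import Path


def stage(project: Path, filename: str) -> None:
    """Mimic the current working copy of one file inside staging."""
    shutil.copy2(project / "working" / filename, project / "staging" / filename)


project = Path(tempfile.mkdtemp())
(project / "working").mkdir()
(project / "staging").mkdir()
readme = project / "working" / "README"

readme.write_text("first edit\n")
stage(project, "README")                      # first edit is staged
readme.write_text("first edit\nsecond edit\n")  # second edit is NOT staged

staged_before = (project / "staging" / "README").read_text()
stage(project, "README")                      # stage again to pick it up
staged_after = (project / "staging" / "README").read_text()
print(repr(staged_before), repr(staged_after))
```

A snapshot taken between the second edit and the second call to `stage` would contain only the first edit.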
-With a working directory, a staging area, and loads of snapshots lying around, it starts to get confusing as to what the specific code changes are between these directories. A snapshot message only gives you a summary of what changed, not exactly what lines were changed between two files.
-Using a diffing algorithm, you can implement a small program that shows you the differences in two codebases. As you develop and copy things from your working directory to the staging area, you'll want to easily see what is different between the two, so that you can determine what else needs to be staged. It's also important to see how the staging area is different from the last snapshot, since these changes are what will become part of the next snapshot you produce.
-There are many other diffs you might want to see. The differences between a specific snapshot and its parent would show you the "changeset" that was introduced by that snapshot. The diff between two branches would be helpful for making sure your development doesn't wander too far away from the mainline.
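As a sketch of such a small diffing program, Python's standard `difflib` module can compare two directories file by file. The helper names `read_files` and `diff_dirs` are hypothetical, and the example assumes every file is plain text.

```python
import difflib
import tempfile
from pathlib import Path


def read_files(root: Path) -> dict:
    """Map each relative file path under root to its list of lines."""
    return {
        str(p.relative_to(root)): p.read_text().splitlines(keepends=True)
        for p in sorted(root.rglob("*")) if p.is_file()
    }


def diff_dirs(old: Path, new: Path) -> list:
    """Return unified diff lines for every file that differs between two directories."""
    before, after = read_files(old), read_files(new)
    lines = []
    for name in sorted(set(before) | set(after)):
        lines.extend(difflib.unified_diff(
            before.get(name, []), after.get(name, []),
            fromfile=f"{old.name}/{name}", tofile=f"{new.name}/{name}",
        ))
    return lines


# Usage: what is in working that has not been staged yet?
base = Path(tempfile.mkdtemp())
staging, working = base / "staging", base / "working"
staging.mkdir()
working.mkdir()
(staging / "README").write_text("Hello\n")
(working / "README").write_text("Hello\nWorld\n")
(working / "notes.txt").write_text("todo\n")
changes = diff_dirs(staging, working)
print("".join(changes))
```

The same function works for any pair of directories, so comparing staging against the last snapshot, or a snapshot against its parent, is just a matter of passing different paths.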
-After a few more trips to Namibia, Istanbul, and Galapagos, Zoe starts to complain that her hard drive is filling up with hundreds of nearly identical copies of the software. You too have been feeling like all the file duplication is wasteful. After a bit of thinking, you come up with something very clever.
-You remember that the SHA1 hash produces a short string that is unique for a given file contents. Starting with the very first snapshot in the project history, you start a conversion process. First, you create a directory named <code>objects</code> outside of the code history. Next, you find the most deeply nested directory in the snapshot. Additionally, you open up a temporary file for writing. For each file in this directory you perform three steps. Step 1: Calculate the SHA1 of the contents. Step 2: Add an entry into the temp file that contains the word 'blob' (binary large object), the SHA1 from the first step, and the filename. Step 3: Copy the file to the objects directory and rename it to the SHA1 from step 1. Once finished with all the files, find the SHA1 of the temp file contents and use that to name the temp file, also placing it in the objects directory.
-If at any time the objects directory already contains a file with a given name, then you have already stored that file's contents and there is no need to do so again.
-Now, move up one directory and start over. Only this time, when you get to the entry for the directory that you just processed, enter the word 'tree', the SHA1 of the temp file from last time, and the directory's name into the new temp file. In this fashion you can build up a tree of directory object files that contain the SHA1s and names of the files and directory objects that they contain.
-Once this has been accomplished for every directory and file in the snapshot, you have a single root directory object file and its corresponding SHA1. Since nothing contains the root directory, you must record the root tree's SHA1 somewhere. An ideal place to store it is in the snapshot message file. This way, the uniqueness of the SHA1 of the message also depends on the entire contents of the snapshot, and you can guarantee with absolute certainty that two identical snapshot message SHA1s contain the same files!
-It's also convenient to create an object from the snapshot message in the same way that you do for blobs and trees. Since you're maintaining a list of branch and tag names that point to message SHA1s you don't have to worry about losing track of which snapshots are important to you.
-With all of this information stored in the objects directory, you can safely delete the snapshot directory that you used as the source of this operation. If you want to reconstitute the snapshot at a later date it's simply a matter of following the SHA1 of the root tree stored in the message file and extracting each tree and blob into their corresponding directory and file.
-For a single snapshot, this transformation process doesn't get you much. You've basically just converted one filesystem into another and created a lot of work in the process. The real benefits of this system arise from reuse of trees and blobs across snapshots. Imagine two sequential snapshots in which only a single file in the root directory has changed. If the snapshots both contain 10 directories and 100 files, the transformation process will create 10 trees and 100 blobs from the first snapshot but only one new blob and one new tree from the second snapshot!
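The conversion and the reuse it buys can be sketched in Python. The entry format (`blob` or `tree`, then the SHA1, then the name) follows the steps above, while the function names, the use of recursion in place of a temp file, and the tiny example project are assumptions.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def store(objects: Path, data: bytes) -> str:
    """Write data to objects/<sha1> unless it is already there; return the SHA1."""
    sha = hashlib.sha1(data).hexdigest()
    target = objects / sha
    if not target.exists():
        target.write_bytes(data)
    return sha


def store_tree(objects: Path, directory: Path) -> str:
    """Store a directory bottom-up and return the SHA1 of its tree object."""
    entries = []
    for child in sorted(directory.iterdir()):
        if child.is_dir():
            entries.append(f"tree {store_tree(objects, child)} {child.name}")
        else:
            entries.append(f"blob {store(objects, child.read_bytes())} {child.name}")
    return store(objects, "\n".join(entries).encode("utf-8"))


base = Path(tempfile.mkdtemp())
objects = base / "objects"
objects.mkdir()

# First snapshot: three files in two directories.
first = base / "snapshot-0"
(first / "lib").mkdir(parents=True)
(first / "README").write_text("version one\n")
(first / "lib" / "a.txt").write_text("alpha\n")
(first / "lib" / "b.txt").write_text("beta\n")
root_one = store_tree(objects, first)
count_one = len(list(objects.iterdir()))

# Second snapshot: only README in the root directory changes.
second = base / "snapshot-1"
shutil.copytree(first, second)
(second / "README").write_text("version two\n")
root_two = store_tree(objects, second)
count_two = len(list(objects.iterdir()))
print(count_one, count_two)
```

The first snapshot produces three blobs and two trees. The second adds just one blob and one root tree, because the `lib` tree and its two blobs are already in the objects directory under the same names.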
-By converting every snapshot directory in the old system to object files in the new system, you can drastically reduce the number of files that are stored on disk. Now, instead of storing perhaps 50 identical copies of a rarely changed file, you only need to keep one.
-Eliminating blob and tree duplication significantly reduces the total storage size of your project history, but that's not the only thing you can do to save space. Source code is just text. Text can be very efficiently compressed using something like the LZW or DEFLATE compression algorithms. If you compress every blob before computing its SHA1 and saving it to disk you can reduce the total storage size of the project history by another very admirable quantity.
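To get a feel for the savings, here is a Python sketch using the standard `zlib` module, which implements DEFLATE. It follows the order described above (compress first, then hash the compressed bytes); the repetitive sample source text is made up, and real files will compress less dramatically.

```python
import hashlib
import zlib

# A made-up, very repetitive source file standing in for a real blob.
source = ("def add(a, b):\n    return a + b\n" * 50).encode("utf-8")

compressed = zlib.compress(source)              # DEFLATE via zlib
name = hashlib.sha1(compressed).hexdigest()     # name of the stored object
restored = zlib.decompress(compressed)          # what you get back on checkout

print(len(source), "bytes before,", len(compressed), "bytes after")
```

Because decompression restores the original bytes exactly, nothing about reconstituting a snapshot changes apart from one extra step when reading a blob.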
-The VCS you have constructed is now a reasonable facsimile of Git. The main difference is that Git gives you very nice command line tools to handle such things as creating new snapshots and switching to old ones (Git uses the term "commit" instead of "snapshot"), tracing history, keeping branch tips up-to-date, fetching changes from other people, merging and diffing branches, and hundreds of other common (and not-so-common) tasks.
-As you continue to learn Git, keep this parable in mind. Git is really very simple underneath, and it is this simplicity that makes it so flexible and powerful. One last thing before you run off to learn all the Git commands: remember that it is almost impossible to lose work that has been committed. Even when you delete a branch, all that's really happened is that the pointer to that commit has been removed. All of the snapshots are still in the objects directory; you just need to dig up the commit SHA. In these cases, look up <code>git reflog</code>. It contains a history of what each branch pointed to and in times of crisis, it will save the day.
-Here are some resources that you should follow as your next step. Now, go, and become a Git master!
-"RDoc":http://rdoc.rubyforge.org is an abomination. It's ugly to read in plain text, requires the use of the inane :nodoc: tag to prevent private method documentation from showing up in final rendering, and does nothing to encourage complete or unambiguous documentation of classes, methods, or parameters. "YARD":http://yardoc.org is much better but goes too far in the other direction (and still doesn't look good in plain text). Providing an explicit way to specify parameters and types is great, but having to remember a bunch of strict tag names in order to be compliant is not a good way to encourage coders to write documentation. And again we see a @private tag that's necessary to hide docs from the final render.
-Three years ago, after suffering with these existing documentation formats for far too long, I started using my own documentation format. It looked a bit like RDoc but had a set of conventions for specifying parameters, return values, and the expected types. It used plain language and full sentences so that a human could read and understand it without having to parse machine-oriented tags or crufty markup. I called this format TomDoc, because if Linus can name stuff after himself, then why can't I?
-After years in the making, TomDoc is finally a well specified documentation format. You can find the full spec at "http://tomdoc.org":http://tomdoc.org.
-At first glance you'll notice a few things. First, and most important, is that the documentation looks nice in plain text. When I'm working on a project, I need to be able to scan and read method documentation quickly. Littering the docs with tags and markup (especially HTML markup) is not acceptable. Code documentation should be optimized for human consumption. Second, all parameters and return values, and their expected types are specified. Types are generally denoted by class name. Because Ruby is so flexible, you are not constrained by a rigid type declaration syntax and are free to explain precisely how the expected types may vary under different circumstances. Finally, the basic layout is designed to be easy to remember. Once you commit a few simple conventions to memory, writing documentation becomes second nature, with all of the tricky decision making already done for you.
-Today's Ruby libraries suffer deeply from haphazard versioning schemes. Even RubyGems itself does not follow a sane or predictable versioning pattern. This lack of discipline stems from the absence of well defined Public APIs. TomDoc attempts to solve this problem by making it simple to define an unambiguous Public API for your library. Instead of assuming that all classes and methods are intended for public consumption, TomDoc makes the Public API opt-in. To denote that something is public, all you have to do is preface the main description with "Public:". By forcing you to explicitly state that a class or method is intended for public consumption, a deliberate and thoughtful Public API is automatically constructed that can inform disciplined version changes according to the tenets of "Semantic Versioning":http://semver.org. In addition, the prominent display of "Public" in a method description ensures that developers are made aware of the sensitive nature of the method and do not carelessly change the signature of something in the Public API.
-Once a Public API has been established, some very exciting things become possible. We're currently working on a processing tool that will render TomDoc into various forms (terminal, HTML, etc). If you run this tool on a library, you'll get a printout of the Public API documentation. You can publish this online so that others have easy access to it. When you roll a new version of the library, you can run the tool again, giving it a prior version as a base, and have it automatically display only the methods that have changed. This diff will be extremely useful for users while they upgrade to the new version (or so they can evaluate whether an upgrade is warranted)!
-While I've been using various nascent forms of TomDoc for several years, we're just now starting to adopt it for everything we do at GitHub. Now that I've formalized the spec it will be easy for the entire team to write compliant TomDoc. The goal is to have every class, method, and accessor of every GitHub library documented. In the future, once we have proper tooling, we'd even like to create a unit test that will fail if anything is missing documentation.
-TomDoc is still a rough specification so I'm initially releasing it as 0.9.0. Over the coming months I'll make any necessary changes to address user concerns and release a 1.0.0 version once things have stabilized. If you'd like to suggest changes, please open an issue on the "TomDoc GitHub repository":http://github.com/mojombo/tomdoc.
-I hear a lot of talk these days about TDD and BDD and Extreme Programming and SCRUM and stand up meetings and all kinds of methodologies and techniques for developing better software, but it's all irrelevant unless the software we're building meets the needs of those that are using it. Let me put that another way. A perfect implementation of the wrong specification is worthless. By the same principle a beautifully crafted library with no documentation is also damn near worthless. If your software solves the wrong problem or nobody can figure out how to use it, there's something very bad going on.
-Fine. So how do we solve this problem? It's easier than you think, and it's important enough to warrant its very own paragraph.
-First. As in, before you write any code or tests or behaviors or stories or ANYTHING. I know, I know, we're programmers, dammit, not tech writers! But that's where you're wrong. Writing a Readme is absolutely essential to writing good software. Until you've written about your software, you have no idea what you'll be coding. Between The Great Backlash Against Waterfall Design and The Supreme Acceptance of Agile Development, something was lost. Don't get me wrong, waterfall design takes things way too far. Huge systems specified in minute detail end up being the WRONG systems specified in minute detail. We were right to strike it down. But what took its place is too far in the other direction. Now we have projects with short, badly written, or entirely missing documentation. Some projects don't even have a Readme!
-This is not acceptable. There must be some middle ground between reams of technical specifications and no specifications at all. And in fact there is. That middle ground is the humble Readme.
-It's important to distinguish Readme Driven Development from Documentation Driven Development. RDD could be considered a subset or limited version of DDD. By restricting your design documentation to a single file that is intended to be read as an introduction to your software, RDD keeps you safe from DDD-turned-waterfall syndrome by punishing you for lengthy or overprecise specification. At the same time, it rewards you for keeping libraries small and modularized. These simple reinforcements go a long way towards driving your project in the right direction without a lot of process to ensure you do the right thing.
-* Most importantly, you're giving yourself a chance to think through the project without the overhead of having to change code every time you change your mind about how something should be organized or what should be included in the Public API. Remember that feeling when you first started writing automated code tests and realized that you caught all kinds of errors that would have otherwise snuck into your codebase? That's the exact same feeling you'll have if you write the Readme for your project before you write the actual code.
-* As a byproduct of writing a Readme in order to know what you need to implement, you'll have a very nice piece of documentation sitting in front of you. You'll also find that it's much easier to write this document at the beginning of the project when your excitement and motivation are at their highest. Retroactively writing a Readme is an absolute drag, and you're sure to miss all kinds of important details when you do so.
-* If you're working with a team of developers you get even more mileage out of your Readme. If everyone else on the team has access to this information before you've completed the project, then they can confidently start work on other projects that will interface with your code. Without any sort of defined interface, you have to code in serial or face reimplementing large portions of code.
-* It's a lot simpler to have a discussion based on something written down. It's easy to talk endlessly and in circles about a problem if nothing is ever put to text. The simple act of writing down a proposed solution means everyone has a concrete idea that can be argued about and iterated upon.
-Consider the process of writing the Readme for your project as the true act of creation. This is where all your brilliant ideas should be expressed. This document should stand on its own as a testament to your creativity and expressiveness. The Readme should be the single most important document in your codebase; writing it first is the proper thing to do.
-Two days ago I had the pleasure of speaking at Startup School, a yearly conference on entrepreneurism put on by the great folks at Y Combinator. Never before have I seen such a high concentration of smart, ambitious people in one place.
-<center><a href="http://www.justin.tv/c3oorg/b/272031754"><img src="http://img.skitch.com/20101018-p9ux9isde8m64ht8wertuytxfd.jpg" /></a></center>
-Since I only had about 25 minutes for the talk and 5 minutes for questions, I wanted to expand upon and clarify some of the ideas I introduced during the talk and then make myself available for additional questions. So today (Monday, 18 October 2010) I'll be answering any questions you have via Hacker News:
-<center><b><a href="http://news.ycombinator.com/item?id=1804443">Ask me a question on HN!</a></b></center>
-The very first commit to GitHub was made exactly three years ago tomorrow. In that time our team of thirteen has signed up over 420,000 developers and now hosts 1.3 million Git repositories, making us the largest code host on the planet. And we've done all of this without ever taking a dime of funding from outside the company. In fact, even within the company we only invested a few thousand dollars out of our own pockets during the first months to cover legal fees.
-During the presentation I talk about a choice between optimizing for happiness and optimizing for money. When I say "optimizing for money" I mean following the traditional venture capital route of raising a ton of money to stash in your bank account and going for a huge exit. The unfortunate reality of this approach is that for aspiring entrepreneurs that are not well connected to the VC world, it can take an extraordinary amount of time and effort to raise that money. Even if you are able to raise capital, you are suddenly responsible to your investors and will need to align your interests with theirs.
-In a world dominated by news about Facebook, Apple, Google, YouTube, Zappos, and other companies heavily funded by venture capital, it's easy to forget that you can still build a highly profitable business with significant impact on a global market without having to first spend three months on Sand Hill Road asking for permission to build your product.
-The infrastructure components necessary to run an internet business are finally cheap enough that you can get started without a huge up-front investment. In the months that you would traditionally spend in glass-walled conference rooms you can now build a sophisticated prototype of your product and start getting users signed up and engaging you with useful feedback.
-This is what I mean by optimizing for happiness: I'm a hacker; I'm happy when I'm building things of value, not when I'm writing a business plan filled with make believe numbers.
-When Chris and I started GitHub, I was working full time at Powerset and Chris was doing consulting work and plugging away on a product of his own. GitHub became the leisure activity that I worked on when I got home from the office. I could craft it however I pleased, and there was nobody telling me what to do. This feeling of control and ownership over something you are building is intoxicating.
-Within three months we had a simple product and moved into private beta. In six months we launched to the public and started charging for private plans. We've been profitable every month since public launch except for one (in which we hired two new employees at once). We did this by making a paycheck via other means until GitHub was generating enough revenue to support us full time at about 2/3 of what we were accustomed to making. We then raised our salaries over the next months when we hit specific revenue goals that allowed us to remain profitable. This happened about one year after inception.
-A side effect of bootstrapping a sustainable company is what I like to call <b>infinite runway.</b> This is another element of optimizing for happiness. With venture backed endeavors you generally find that during the first several years the numbers in your bank account are perpetually decreasing, giving your company an expiration date. Your VCs have encouraged you to grow fast and spend hard, which makes perfect sense for them, but not necessarily for you. Not if you're trying to optimize for happiness.
-VCs want to see quick success or quick failure. They are optimizing for money. There's nothing wrong with that as long as you want the same things they do. But if you're like me, then you care more about building a kickass product than you do about having a ten figure exit. If that's true, then maybe you should be optimizing for happiness. One way to do this is by bootstrapping a sustainable business with infinite runway. When there are fewer potentially catastrophic events on the horizon, you'll find yourself smiling a lot more often.
-The ironic thing about bootstrapping and venture capital is that once you demonstrate some success, investors will come to YOU. When this happens you will be in a much better place to make a more reasoned choice about taking on additional capital and all the complexities that come with it. Talking to VCs with some leverage in your back pocket is an entirely different game from throwing yourself in front of a conference table full of general partners and trying to persuade them that you're worth their time and money. Power is happiness.
-There are other really great things you can do when you optimize for happiness. You can throw away things like financial projections, hard deadlines, ineffective executives that make investors feel safe, and everything that hinders your employees from building amazing products.
-At GitHub we don't have meetings. We don't have set work hours or even work days. We don't keep track of vacation or sick days. We don't have managers or an org chart. We don't have a dress code. We don't have expense account audits or an HR department.
-We pay our employees well and give them the tools they need to do their jobs as efficiently as possible. We let them decide what they want to work on and what features are best for the customers. We pay for them to attend any conference at which they've gotten a speaking slot. If it's in a foreign country, we pay for another employee to accompany them because traveling alone sucks. We show them the profit and loss statements every month. We expect them to be responsible.
-We make decisions based on the merits of the arguments, not on who is making them. We strive every day to be better than we were the day before.
-We do all this because we're optimizing for happiness, and because there's nobody to tell us that we can't.
-<b>NOTE: This post was written in late December of 2008, more than two years ago. It has stayed in my drafts folder since then, waiting for the last 2% to be written. Why I never published it is beyond my reckoning, but it serves as a great reminder of how I perceived the world back then. In the time since I wrote this we've grown from four people to twenty-six, settled into an office, installed a kegerator, and still never taken outside funding. In some ways, things have changed a great deal, but in the most important ways, things are still exactly the same. Realizing this puts a big smile on my face.</b>
-The end of the year is a great time to sit down with a glass of your favorite beverage, dim the lights, snuggle up next to the fire and think about what you've learned over the past twelve months.
-For me, 2008 was the year that I helped design, develop, and launch GitHub. Creating a new startup is an intense learning experience. Through screwups and triumphs, I have learned some valuable lessons this year. Here are a few of them.
-When Chris and I started working on GitHub in late 2007, Git was largely unknown as a version control system. Sure, Linux kernel hackers had been using it since day one, but outside of that small microcosm, it was rare to come across a developer that was using it on a day-to-day basis. I was first introduced to Git by Dave Fayram, a good friend and former coworker during my days at Powerset. Dave is the quintessential early adopter and, as far as I can calculate, patient zero for the spread of Git adoption in the Ruby community and beyond.
-Back then, the Git landscape was pretty barren. Git had only recently become usable by normal people with the 1.5 release. As for Git hosting, there was really only "repo.or.cz":http://repo.or.cz/, which felt to me very limited, clumsy, and poorly designed. There were no commercial Git hosting options whatsoever. Despite this, people were starting to talk about Git at the Ruby meetups. About how awesome it was. But something was amiss. Git was supposed to be this amazing way to work on code in a distributed way, but what was the mechanism to securely share private code? Your only option was to set up user accounts on Unix machines and use that as an ad-hoc solution. Not ideal.
-And so GitHub was born. But it was born into a world where there was no existing market for paid Git hosting. We would be _creating_ the market. I vividly remember telling people, "I don't expect GitHub to succeed right away. Git adoption will take a while, but we'll be ready when it happens." Neither Chris nor I were in any particular hurry for this to happen. I was working full time at Powerset, and he was making good money as a Rails consultant. By choosing to build early on top of a nascent technology, we were able to construct a startup with basically no overhead, no competition, and in our free time.
-Here's a seemingly paradoxical piece of advice for you: Listen to your customers, but don't let them tell you what to do. Let me explain. Consider a feature request such as "GitHub should let me FTP up a documentation site for my project." What this customer is really trying to say is "I want a simple way to publish content related to my project," but they're used to what's already out there, and so they pose the request in terms that are familiar to them. We could have implemented some horrible FTP based solution as requested, but we looked deeper into the underlying question and now we allow you to publish content by simply pushing a Git repository to your account. This meets requirements of both functionality _and_ elegance.
-Another company that understands this concept at a fundamental level is Apple. I'm sure plenty of people asked Apple to make a phone but Steve Jobs and his posse looked beneath the request and figured out what people really wanted: a nice looking, simple to use, and easy to sync mobile device that kicked some serious ass. And that's the secret. Don't give your customers what they ask for; give them what they _want_.
-I went to college at a little school in California called Harvey Mudd. Yeah, I know you haven't heard of it, but if you remember those US News & World Report "Best Colleges" books that you obsessed over in high school (ok, maybe you didn't, but I did), Harvey Mudd was generally ranked as the engineering school with the greatest number of hours of homework per night. Yes, more than MIT, and yes, more than Caltech. It turned out to be true, as far as I can tell. I have fond memories of freaking out about ridiculously complex spring/mass/damper systems and figuring the magnetic flux of a wire wrapped around a cylinder in a double helix. We studied hard--very hard. But we played hard too. It was the only thing that could possibly keep us sane.
-Working on a startup is like that. It feels a bit like college. You're working on insanely hard projects, but you're doing it with your best friends in the world and you're having a great time (usually). In both environments, you have to goof off a lot in order to balance things out. Burnout is a real and dangerous phenomenon. Fostering a playful and creative environment is critical to maintaining both your personal health, and the health (and idea output) of the company.
-I've found Twitter to be an extremely valuable resource for instant feedback. If the site is slow for some reason, Twitter will tell me so. If the site is unreachable for people in a certain country (I'm looking at you, China), I'll find out via Twitter. If that new feature we just released is really awesome, I'll get a nice ego boost by scanning the Twitter search.
-People have a tendency to turn to Twitter to bitch about all the little bugs they see on your website, usually appended with the very tiresome "FAIL". These are irksome to read, but added together are worth noticing. Oftentimes these innocent tweets will inform a decision about whether an esoteric bug is worth adding to the short list. We also created a GitHub account on Twitter that our support guy uses to respond to negative tweets. Offering this level of customer service almost always turns a disgruntled customer into a happy one.
-If you have an iPhone, I heartily recommend the "Summizer":http://fanzter.com/products/download/summizer app from Fanzter, Inc. It makes searching, viewing, and responding to tweets a cinch.
-At the first RailsConf I had the pleasure of hearing Martin Fowler deliver an amazing keynote. He made some apt metaphors regarding agile development that I will now paraphrase and mangle.
-Imagine you're tasked with building a computer controlled gun that can accurately hit a target about 50 meters distant. That is the only requirement. One way to do this is to build a complex machine that measures every possible variable (wind, elevation, temperature, etc.) before the shot and then takes aim and shoots. Another approach is to build a simple machine that fires rapidly and can detect where each shot hits. It then uses this information to adjust the aim of the next shot, quickly homing in on the target a little at a time.
-The difference between these two approaches comes down to realizing that bullets are cheap. By the time the former group has perfected their wind detection instrument, you'll have finished your simple weapon and already hit the target.
-In the world of web development, the target is your ideal offering, the bullets are your site deploys, and your customers provide the feedback mechanism. The first year of a web offering is a magical one. Your customers are most likely early adopters and love to see new features roll out every few weeks. If this results in a little bit of downtime, they'll easily forgive you, as long as those features are sweet. In the early days of GitHub, we'd deploy up to ten times in one afternoon, always inching closer to that target.
-Make good use of that first year, because once the big important customers start rolling in, you have to be a lot more careful about hitting one of them with a stray bullet. Later in the game, downtime and botched deploys are money lost and you have to rely more on building instruments to predict where you should aim.
-All four full-time GitHub employees work in the San Francisco area, and yet we have no office. But we're not totally virtual either. In fact, a couple times a week you'll find us at a cafe in North Beach, huddled around a square table that was made by nailing 2x4s to an ancient fold-out bulletin board. It's no Google campus, but the rent is a hell of a lot cheaper and the drinks are just as good!
-This is not to say that we haven't looked at a few places to call home. Hell, we almost leased an old bar. But in the end there's no hurry to settle down. We're going to wait until we find the perfect office. Until then, we can invest the savings back into the company, or into our pockets. For now, I like my couch and the cafe just fine.
-Of course, none of this would be possible without 37signals' "Campfire":http://www.campfirenow.com/ web-based chat and the very-difficult-to-find-but-totally-amazing "Propane":http://productblog.37signals.com/products/2008/10/propane-takes-c.html OSX desktop app container that doubles the awesome. Both highly recommended.
-Beyond the three cofounders of GitHub, we've hired one full time developer (Scott Chacon) and one part time support specialist (Tekkub).
-We hired Tekkub because he was one of the earliest GitHub users and actively maintains more than 75 projects (WoW addons mostly) on GitHub and was very active in sending us feedback in the early days. He would even help people out in the IRC channel, simply because he enjoyed doing so.
-I met Scott at one of the San Francisco Ruby meetups where he was presenting on one of his myriad Git-centric projects. Scott had been working with Git long before anyone else in the room. He was also working on a pure Ruby implementation of Git at the same time I was working on my fork/exec based Git bindings. It was clear to me then that depending on how things went down, he could become either a powerful ally or a dangerous foe. Luckily, we all went drinking afterwards and we became friends. Not long after, Scott started consulting for us and wrote the entire backend for what you now know as "Gist":http://gist.github.com/. We knew then that we would do whatever it took to hire Scott full time. There would be no need for an interview or references. We already knew everything we needed to know in order to make him an offer without the slightest reservation.
-The lesson here is that it's far easier and less risky to hire based on relevant past performance than it is to hire based on projected future performance. There's a corollary that also comes into play: if you're looking to work for a startup (or anyone for that matter), contribute to the community that surrounds it. Use your time and your code to prove that you're the best one for the job.
-There's nothing I hate more than micromanagers. When I was doing graphic design consulting 5 years ago I had a client that was very near the Platonic form of a micromanager. He insisted that I travel to his office where I would sit in the back room at an old Mac and design labels and catalogs and touch up photographs of swimwear models (that part was not so bad!). While I did these tasks he would hover over me and bark instructions. "Too red! Can you make that text smaller? Get rid of those blemishes right there!" It drove me absolutely batty.
-This client could have just as easily given me the task at the beginning of the day, gone and run the business, and come back in 6 hours whereupon I would have created better designs twice as fast as if he were treating me like a robot that converted his speech into Photoshop manipulations. By treating me this way, he was marginalizing my design skills and wasting both money and talent.
-Micromanagement is symptomatic of a lack of trust. The remedy for this ailment is to hire experts and then trust their judgment. In a startup, you can drastically reduce momentum by applying micromanagement, or you can boost momentum by giving trust. It's pretty amazing what can happen when a group of talented people who trust each other get together and decide to make something awesome.
-A lot has been written recently about how the venture capital world is changing. I don't pretend to be an expert on the subject, but I've learned enough to say that a web startup like ours doesn't need any outside money to succeed. I know this because we haven't taken a single dime from investors. We bootstrapped the company on a few thousand dollars and became profitable the day we opened to the public and started charging for subscriptions.
-In the end, every startup is different, and the only person that can decide if outside money makes sense is you. There are a million things that could drive you to seek and accept investment, but you should make sure that doing so is in your best interest, because it's quite possible that you don't _need_ to do so. One of the reasons I left my last job was so that I could say "the buck stops here." If we'd taken money, I would no longer be able to say that.
-In order for GitHub to talk to Git repositories, I created the first ever Ruby Git bindings. Eventually, this library became quite complete and we were faced with a choice: Do we open source it or keep it to ourselves? Both approaches have benefits and drawbacks. Keeping it private means that the hurdle for competing Ruby-based Git hosting sites would be higher, giving us an advantage. But open sourcing it would mean that
-<b>NOTE: This is where the post ended and remained frozen in carbonite until today. I intend to write a follow up post on our open source philosophy at GitHub in the near future. I'm sure the suspense is killing you!</b>
-About a month ago I decided that it was foolish to let the words I had written rot on my hard drive and so I did the only thing I knew how to do: overreact. So I cut the original nine-hundred words of my bio down to fourteen words and resubmitted it to Daniel. Those are the words you see in the post now.
-When Chris and I first started working on GitHub in late 2007, we split the work into two parts. Chris worked on the Rails app and I worked on Grit, the first ever Git bindings for Ruby. After six months of development, Grit had become complete enough to power GitHub during our public launch of the site and we were faced with an interesting question:
-Keeping it private would provide a higher hurdle for competing Ruby-based Git hosting sites, giving us an advantage. Open sourcing it would mean thousands of people worldwide could use it to build interesting Git tools, creating an even more vibrant Git ecosystem.
-After a small amount of debate we decided to open source Grit. I don't recall the specifics of the conversation but that decision nearly four years ago has led to what I think is one of our most important core values: open source (almost) everything.
-If you do it right, open sourcing code is **great advertising** for you and your company. At GitHub we like to talk publicly about libraries and systems we've written that are still closed but destined to become open source. This technique has several advantages. It helps determine what to open source and how much care we should put into a launch. We recently open sourced Hubot, our chat bot, to widespread delight. Within two days it had 500 watchers on GitHub and 409 upvotes on Hacker News. This translates into goodwill for GitHub and more superfans than ever before.
-If your code is popular enough to attract outside contributions, you will have created a **force multiplier** that helps you get more work done faster and cheaper. More users means more use cases being explored which means more robust code. Our very own [resque](https://github.com/defunkt/resque) has been improved by 115 different individuals outside the company, with hundreds more providing 3rd-party plugins that extend resque's functionality. Every bug fix and feature that you merge is time saved and customer frustration avoided.
-Smart people like to hang out with other smart people. Smart developers like to hang out with smart code. When you open source useful code, you **attract talent**. Every time a talented developer cracks open the code to one of your projects, you win. I've had many great conversations at tech conferences about my open source code. Some of these encounters have led to ideas that directly resulted in better solutions to problems I was having with my projects. In an industry with such a huge range of creativity and productivity between developers, the right eyeballs on your code can make a big difference.
-If you're hiring, **the best technical interview possible** is the one you don't have to do because the candidate is already kicking ass on one of your open source projects. Once technical excellence has been established in this way, all that remains is to verify cultural fit and convince that person to come work for you. If they're passionate about the open source code they've been writing, and you're the kind of company that cares about well-crafted code (which clearly you are), that should be simple! We hired [Vicent Martí](https://github.com/tanoku) after we saw him doing stellar work on [libgit2](https://github.com/libgit2/libgit2), a project we're spearheading at GitHub to extract core Git functionality into a standalone C library. No technical interview was necessary; Vicent had already proven his skills via open source.
-Once you've hired all those great people through their contributions, dedication to open source code is an amazingly effective way to **retain that talent**. Let's face it, great developers can take their pick of jobs right now. These same developers know the value of coding in the open and will want to build up a portfolio of projects they can show off to their friends and potential future employers. That's right, a paradox! In order to keep a killer developer happy, you have to help them become more attractive to other employers. But that's ok, because that's exactly the kind of developer you want to have working for you. So relax and let them work on open source or they'll go somewhere else where they can.
-When I start a new project, I assume it will eventually be open sourced (even if it's unlikely). This mindset leads to **effortless modularization**. If you think about how other people outside your company might use your code, you become much less likely to bake in proprietary configuration details or tightly coupled interfaces. This, in turn, leads to cleaner, more maintainable code. Even internal code should pretend to be open source code.
-Have you ever written an amazing library or tool at one job and then left to join another company only to rewrite that code or remain miserable in its absence? I have, and it sucks. By getting code out in the public we can drastically **reduce duplication of effort**. Less duplication means more work towards things that matter.
-Lastly, **it's the right thing to do**. It's almost impossible to do anything these days without directly or indirectly executing huge amounts of open source code. If you use the internet, you're using open source. That code represents millions of man-hours of time that has been spent and then given away so that everyone may benefit. We all enjoy the benefits of open source software, and I believe we are all morally obligated to give back to that community. If software is an ocean, then open source is the rising tide that raises all ships.
-Notice that everything we keep closed has specific business value that could be compromised by giving it away to our competitors. Everything we open is a general purpose tool that can be used by all kinds of people and companies to build all kinds of things.
-I prefer the MIT license and almost everything we open source at GitHub carries this license. I love this license for several reasons:
-* It's short. Anyone can read this license and understand exactly what it means without wasting a bunch of money consulting high-octane lawyers.
-* Enough protection is offered to be relatively sure you won't sue me if something goes wrong when you use my code.
-* Everyone understands the legal implications of the MIT license. Weird licenses like the WTFPL and the Beer license pretend to be the "ultimate in free licenses" but utterly fail at this goal. These fringe licenses are too vague and unenforceable to be acceptable for use in some companies. On the other side, the GPL is too restrictive and dogmatic to be usable in many cases. I want everyone to benefit from my code. Everyone. That's what Open should mean, and that's what Free should mean.
-Easy, just flip that switch on your GitHub repository from private to public and tell the world about your software via your blog, Twitter, Hacker News, and over beers at your local pub. Then sit back, relax, and enjoy being part of something big.