Zach Burlingame
Programming, Computers, and Other Notes on Technology

Things to Consider When Publishing Mercurial Repositories

Shortcut: If you just want to know HOW I publish my repos locally and not all this babble on WHY, then jump to the post Installing Mercurial and Hosting Repositories with CentOS.

I’ve been using Mercurial as my VCS of choice now for about six months and quite frankly I love it. There were a number of pain points for me with Subversion (although I still use it almost every day) that Hg addressed. One of the items that regularly comes up when switching to a DVCS like Hg is “Where is the central repository?” By the pure nature of a DVCS, there is no requirement to have one. It is often useful however, to have a master or authoritative repository that other developers or clones (more on this later) can use to sync with.

Once you decide to have master a repo, the next question is “How do we access it?” With SVN this was typically accomplished through the use of HTTP with mod_dav or via svn+ssh. Mercurial comes with an utility called hgserve that will publish a repository for others to access, but it doesn’t provide any authentication methods so it’s really only viable on internal trusted networks. Fortunately, Mercurial also supports numerous other methods of publishing a repository. Additionally, there are a number of great repository hosting companies out there, including my personal favorites: BitBucket and Kiln.  I use BitBucket’s free service to host the source code used for posts on this blog and I use Kiln’s (which is so much more than just a source code hosting platform) free service for managing some personal projects.

With all these great free online services, why do I want another repository?

  1. Build Server – I run a TeamCity build server for a number of my personal projects and I would rather it work with a local repository than pull it’s data down from the Internet for every build.
  2. Backup – While the online services certainly have a more robust configuration overall than my little personal server, my data is but drop in their huge bucket. It means a LOT to me though, so having some up-to-date master copy that I control makes me sleep a bit better at night. Furthermore web companies go out of business overnight sometimes, and I don’t want to get caught with my data at the mercy of some company’s financials.
  3. Security – Not all of the things I want under version control do I want out on the Internet. Whether it be some personal project or versioning of my configuration data, it’s nice to have a local sandbox to keep these things in.
  4. Fun – I AM a developer after all – this stuff is fun!

There are 1,271,486 other articles out there on how to publish a Mercurial repository, why another?

I have several specific requirements in how I want to deploy my setup and there was no single source that seemed to address them all. This is more for my own documentation purposes than anything else, but I figured hey, there may be 0.7 other people out there who might benefit from it as well! These requirements are:

  1. User Authentication – I definitely want to require user authentication to get access to the repositories. This knocks out the built-in hgserve. (Yes, it’s running on my internal network, but I’m paranoid like that.)
  2. Multiple Committers – I don’t want to use the same user account for everyone who accesses the repository (including the build server) and it’s plausible that I may give friends access to specific projects that come up from time to time.
  3. No server user shell accounts just for access to repositories – I don’t want to have to create a full-blown shell account on a system just to give a user access to a repository. This knocks out SSH-only solutions.
  4. No stored passwords in hg clients for users with shell accounts on the server – For those of us (like me) who DO have shell accounts on the server, I don’t want to be caching a stored password on my client machines. I would much rather use something like public-key authentication in a case like this. This knocks out hgweb-only solutions.
  5. Support for multiple-repositories – I want a solution that supports publishing and managing access for multiple repositories with minimal effort.

Sidebar

Of note on point three – using an SSH-based solution – is that this is creates a special hell for you when you go to setup a TeamCity Continuous Integration server (or any Windows Service based agent that is accessing your repository in an authenticated manner). You either have to provide a username and password to TeamCity when setting up the VCS Root, which will also correspond to a user login on your server (no, thank you!) or setup public-key authentication. The problem arises when you go to setup public-key auth: How do you get TortoiseHg’s plink the key? It turns out you can but it involves jumping through some mighty hoops. First you have to generate your key, securely transfer it to your build server, convert it to PuTTY’s PPK format, place it and a mercurial.ini file in the correct SYSTEM account’s profile directory deep in the bowels of the Windows directory based on the bitness of your TeamCity install, use PsExec to run plink as the system account to accept the host key and not run into any other problems along the way. Ew.

The Hybrid-Solution

I settled on a hybrid of Apache hgweb and SSH to meet all of my requirements. For those users who already have shell accounts, I can use public-key authentication to both securely access the repositories and avoid having stored passwords on the client machines. For those users who don’t have shell accounts (like the service running the TeamCity build server) I can use HTTPS to provide authenticated access in either a read-only or read-write manner. I’ll provide steps on how to do this in the next post: Installing Mercurial and Hosting Repositories with CentOS.

Tags:

Leave a Reply

Your email address will not be published. Required fields are marked *

*