A Post-Commit Hook to Integrate Subversion with Hudson

Hudson is a continuous-integration build server which is easy to install and works well. An earlier article compared Hudson with Cruise Control. This article goes on to discuss how Hudson operates with source code stored in a Subversion repository.

Hudson provides out-of-the-box integration with CVS and Subversion, and supports various other source code repositories too. This integration works on a ‘polling’ basis: you configure your build job to check the relevant svn URL every X minutes. This works well in most circumstances.

However, you may find you have a need for ‘push’ operation, i.e. when the Subversion server directly notifies the Hudson server about every relevant commit, so that the Hudson server can decide what has to be built. To do this, there are two steps needed:

  1. configure a post-commit hook in your Subversion repository;
  2. configure your projects in Hudson with suitable names and with remote builds enabled

The post-commit hook

First you should understand the basics of using hooks in Subversion. You will need three new files in your Subversion repository:

  1. in the conf folder, a configuration file called svn-hudson.conf
  2. in the hooks folder, a simple shell script called post-commit
  3. also in the hooks folder, the svn-hudson.py program

You also need a new folder called logs alongside conf, hooks etc.

How it works

When you commit a change, Subversion finds and executes the post-commit hook, which calls our svn-hudson.py script. The script reads the svn-hudson.conf configuration and then uses the svnlook command to find the path of every file that has changed. You will have configured which top-level directories are of interest, and the list of changed paths is reduced to a list of affected top-level directories. For each one, the script makes an HTTP GET request to your Hudson server, using a URL derived from the path in question, and this causes Hudson to start the corresponding build.

The Python script keeps a note of the requests it has made and doesn’t make the same request more than once, even if there are many files changed in that part of the directory tree.
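
The de-duplication step can be sketched as follows. This is not the attached svn-hudson.py, only an illustration of the idea: the function names are hypothetical, by_top_level is a stand-in for the real mapping rules described later, and the parsing assumes that each line of `svnlook changed` output carries its status flags in the first four columns.

```python
def changed_paths(svnlook_output):
    """Extract the paths from the text printed by `svnlook changed`.

    Assumption: status flags occupy the first four columns of each line.
    """
    return [line[4:] for line in svnlook_output.splitlines() if len(line) > 4]


def unique_urls(paths, project_for, urlprefix, urlsuffix):
    """Return one trigger URL per affected project, in first-seen order."""
    seen = set()
    urls = []
    for path in paths:
        project = project_for(path)  # None means "not of interest"
        if project is None or project in seen:
            continue
        seen.add(project)
        urls.append(urlprefix + project + urlsuffix)
    return urls


# Hypothetical stand-in for the mapping rules: only these top-level
# directories are of interest.
INTERESTING = {"trunk", "branches"}


def by_top_level(path):
    top = path.split("/", 1)[0]
    return "main~" + top if top in INTERESTING else None


sample = (
    "U   trunk/common/pom.xml\n"
    "A   trunk/common/src/Foo.java\n"
    "D   tags/old/README\n"
)
urls = unique_urls(
    changed_paths(sample),
    by_top_level,
    "http://ciserver:8080/job/",
    "/build?token=abc",
)
print(urls)
```

Two files changed under trunk, but only one URL is produced for it, and the change under tags is ignored because that directory is not of interest.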

The trigger scripts

hooks/post-commit

The first script we need is hooks/post-commit, which must be executable (chmod a+x hooks/post-commit). This script is easy: based on the example template provided with Subversion, we simply add a few lines at the end.

#!/bin/sh
# POST-COMMIT HOOK

# The post-commit hook is invoked after a commit.  Subversion runs
# this hook by invoking a program (script, executable, binary, etc.)
# named 'post-commit' (for which this file is a template) with the
# following ordered arguments:
#
#   [1] REPOS-PATH   (the path to this repository)
#   [2] REV          (the number of the revision just committed)
#
# The default working directory for the invocation is undefined, so
# the program should set one explicitly if it cares.
#
# Because the commit has already completed and cannot be undone,
# the exit code of the hook program is ignored.  The hook program
# can use the 'svnlook' utility to help it examine the
# newly-committed tree.
#
# On a Unix system, the normal procedure is to have 'post-commit'
# invoke other programs to do the real work, though it may do the
# work itself too.
#
# Note that 'post-commit' must be executable by the user(s) who will
# invoke it (typically the user httpd runs as), and that user must
# have filesystem-level permission to access the repository.
#
# On a Windows system, you should name the hook program
# 'post-commit.bat' or 'post-commit.exe',
# but the basic idea is the same.
#
# The hook program typically does not inherit the environment of
# its parent process.  For example, a common problem is for the
# PATH environment variable to not be set to its usual value, so
# that subprograms fail to launch unless invoked via absolute path.
# If you're having unexpected problems with a hook program, the
# culprit may be unusual (or missing) environment variables.
#
# Here is an example hook script, for a Unix /bin/sh interpreter.
# For more examples and pre-written hooks, see those in
# /usr/share/subversion/hook-scripts, and in the repository at
# http://svn.collab.net/repos/svn/trunk/tools/hook-scripts/ and
# http://svn.collab.net/repos/svn/trunk/contrib/hook-scripts/

# Quote the variables so that paths containing spaces still work.
DIR=`dirname "$0"`
REPOS="$1"
REV="$2"

python "$DIR/svn-hudson.py" -p "$REPOS"

The post-commit hook simply calls the Python interpreter to pass control to the main svn-hudson.py script. It would be straightforward to rewrite it for Windows, if that’s your server’s OS.

hooks/svn-hudson.py

The hard work is done using the Python script hooks/svn-hudson.py, which is available as an attachment to this post (see below).

conf/svn-hudson.conf

The last file is the configuration file that maps your Subversion folders to the corresponding Hudson build jobs. For each commit, the list of affected paths is scanned via a list of regular expressions. Those that match are used to produce a corresponding Hudson project name. By surrounding this name with a certain prefix and suffix, it becomes a URL that is understood by Hudson as a trigger to cause the project to build.

So the configuration file below consists of the URL prefix and suffix in the first part, and then a list of regular expressions. The urlprefix is simply the URL of your server with /job/ attached. The urlsuffix is always /build?token=abc but with the abc replaced by your own choice of token, as described later.

# General: this section lists general settings
[general]
# URL prefix and suffix used to contact the Hudson server
urlprefix = http://ciserver:8080/job/
urlsuffix = /build?token=abc

# Mappings: This section provides mappings from path names to Hudson
# project names. This is done via regular expressions: the pattern on
# the left hand side may map to the project name on the right hand
# side, provided the pattern actually matches.
#
# As required, you can surround general pattern groups with parentheses
# and provide placeholders using backslash and a digit.
#
# Any unreplaced characters in the source path are copied to the project
# name; therefore, most patterns will end with ".*" to eat up the unwanted
# trailing characters.
#
# The mappings are searched in the order they are listed. Therefore,
# wherever there is an overlap, it is important to put the more-specific
# patterns earlier than the more general patterns, otherwise the more-
# specific patterns will possibly never be reached.
[mappings]
^trunk/.*                  = main~trunk
^branches/(.+?)/.*         = main~branches~\1
^releases/(.+?)/(.+?)/.*   = main~releases~\1~\2
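
A file in this format can be read with Python's standard configparser module. The following is a minimal sketch under that assumption; the attached svn-hudson.py may read its configuration differently, and load_config is a hypothetical name.

```python
import configparser

# Same content as the configuration file above, with the comments left out.
SAMPLE = r"""
[general]
urlprefix = http://ciserver:8080/job/
urlsuffix = /build?token=abc

[mappings]
^trunk/.*                  = main~trunk
^branches/(.+?)/.*         = main~branches~\1
^releases/(.+?)/(.+?)/.*   = main~releases~\1~\2
"""


def load_config(text):
    """Return (general settings dict, ordered list of (pattern, project))."""
    # Only "=" separates key and value, so a ":" in a pattern is harmless.
    parser = configparser.RawConfigParser(delimiters=("=",))
    # By default option names are lower-cased; patterns must keep their case.
    parser.optionxform = str
    parser.read_string(text)
    general = dict(parser.items("general"))
    mappings = list(parser.items("mappings"))  # file order is preserved
    return general, mappings


general, mappings = load_config(SAMPLE)
print(general["urlprefix"])
for pattern, project in mappings:
    print(pattern, "->", project)
```

RawConfigParser is used so that values are taken literally, and the mappings are kept as a list because the order of the rules matters.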

The mappings deserve further explanation. Each path in a commit is checked against the regular expressions in the order they are listed. As soon as a match is found, the value after the equals sign is used to form the Hudson project name, and all subsequent patterns are ignored for that path.

Let’s take the first mapping:

^trunk/.*                  = main~trunk

You need to understand a little about regular expressions here. The ^ symbol is the normal regular expression start-of-line anchor; the .* expression matches zero or more characters greedily (i.e. the parser tries to grab as many characters as are available to it, which in practice means everything to the right as far as the end of line).

So ^trunk/.* matches paths such as trunk/common/src/main/java/org/example/MyClass.java, in fact anything beginning with trunk/. The mapping itself is simple: if a commit contains any changed files in the trunk folder, it causes the project called main~trunk to be triggered, which is the project name given on the right-hand side of the equals sign.

Note that a Hudson project name cannot contain the / character. So, by convention, we are using the ~ character as a pseudo-separator instead. You can invent any convention you like as long as it makes valid Hudson project names. Also, we chose main as a prefix - this is optional but it’s a good idea to have a prefix that corresponds to your repository name in some way; once you have more than one repository, this becomes really useful.

The second mapping shows how a part of the search pattern can appear in the project name.

^branches/(.+?)/.*         = main~branches~\1

In this case, the group (.+?) corresponds to a series of one or more characters between the slash marks. The parentheses are important: they signify that this sequence of characters is to be “remembered” - they are available in the \1 expression on the right-hand side.

For example, a path such as branches/b123/common/src/main/java/org/example/MyClass.java triggers a Hudson project called main~branches~b123, if one exists.

The third mapping is similar, except it has two replacement patterns \1 and \2 instead of one. This gives us four parts to the project name, so the value of the ~ character as a pseudo-separator is obvious. The Hudson project names need to be easily readable and developers need to understand how the Hudson projects map to source code branches within Subversion’s space.
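
The three mappings can be tried out with Python's re module. This is a sketch of the first-match rule as described above, not the code of svn-hudson.py itself; project_for is a hypothetical name. It uses re.sub, so any characters the pattern does not consume would be copied into the project name, which is why each pattern ends with ".*".

```python
import re

# The rules from the [mappings] section, in the order they are listed.
MAPPINGS = [
    (r"^trunk/.*", r"main~trunk"),
    (r"^branches/(.+?)/.*", r"main~branches~\1"),
    (r"^releases/(.+?)/(.+?)/.*", r"main~releases~\1~\2"),
]


def project_for(path, mappings=MAPPINGS):
    """Return the project name from the first matching rule, else None."""
    for pattern, template in mappings:
        if re.match(pattern, path):
            # \1 and \2 in the template are filled from the pattern's groups.
            return re.sub(pattern, template, path, count=1)
    return None


for path in (
    "trunk/common/src/main/java/org/example/MyClass.java",
    "branches/b123/common/src/main/java/org/example/MyClass.java",
    "releases/1.0/rc2/pom.xml",
    "tags/old/README",
):
    print(path, "->", project_for(path))
```

The releases path is an invented example; it yields main~releases~1.0~rc2, while the tags path matches no rule and so triggers nothing.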

Permitting Remotely-Triggered Builds

Within Hudson, you need to enable your projects so that they can be triggered via the Python script. The project’s configuration page has a box you need to tick: “Trigger builds remotely (eg from scripts)”. Underneath is a box in which you must put an authentication token of your choosing. We use simple strings such as “abc”; anything will do, although punctuation and spaces are awkward in a URL and are best avoided. Edit the urlsuffix in conf/svn-hudson.conf so that it contains the same token.
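
Once remote triggering is enabled, starting a build is just an HTTP GET on the project's URL. The sketch below shows one way to do that with Python's standard urllib; build_url and trigger are hypothetical names and the attached script may go about it differently. The server name and token are the example values from the configuration file.

```python
from urllib.error import HTTPError, URLError
from urllib.parse import quote
from urllib.request import urlopen


def build_url(urlprefix, project, urlsuffix):
    """Surround the project name with the configured prefix and suffix."""
    # Escape anything unusual in the name, but leave "~" readable.
    return urlprefix + quote(project, safe="~") + urlsuffix


def trigger(url, timeout=10):
    """Send the GET request; return (ok, detail) instead of raising."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return True, "HTTP %d" % response.status
    except HTTPError as err:   # the server answered, e.g. with a 404
        return False, "HTTP %d" % err.code
    except URLError as err:    # the server could not be reached at all
        return False, "unreachable: %s" % err.reason


url = build_url("http://ciserver:8080/job/", "main~trunk", "/build?token=abc")
print(url)
# trigger(url) would send the request to the Hudson server.
```

Returning a status rather than raising suits a post-commit hook, where a failed notification should be logged but cannot undo the commit.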

Testing

You will want to gain confidence that you have installed and configured things correctly. You can do this from the command line using additional options provided for this purpose. Look again at the last line of the post-commit script and compare with this:

python hooks/svn-hudson.py -p . -r 12345 -d

You must log in and cd to the Subversion repository folder first (i.e. not your local working copy).

-p PATH Use repository at PATH to check changes. This is mandatory.
-f PATH Use PATH as configuration file (default is repository path + /conf/svn-hudson.conf)
-r REV Optional Subversion revision REV for commit information (as per svnlook)
-t TXN Optional Subversion transaction TXN for commit information (as per svnlook)
-v Informational messages
-d Debug and informational messages

Particularly with the -d debug option enabled, you can check that the pattern matching is what you expected and that the correct Hudson project is launched in each case. Try this via the -r option so that known file sets can be tested against the matching rules. If you do this on your development server, you may need to manually cancel each build you started as you go - to stop this happening, just temporarily set the urlprefix or urlsuffix to something invalid whilst you are testing, then set it back when you’re ready.

Alternatively, if you want to test things apart from your development systems, it is easy with Subversion to create a temporary repository on your own machine that imitates your normal server, and then fire commits at it to observe the behaviour. This allows testing of unusual cases that rarely arise on your normal development system.

The diagnostic output should look something like this.

$ python hooks/svn-hudson.py -p . -d
2010-06-16 09:58:29,387 - root - DEBUG - urlprefix = http://ciserver:8080/job/
2010-06-16 09:58:29,388 - root - DEBUG - urlsuffix = /build?token=xyz
2010-06-16 09:58:29,388 - root - DEBUG - repository myrepos
2010-06-16 09:58:29,436 - root - DEBUG -   trunk/common/importdata/invoices/ rule ^trunk/.* (n 1)
2010-06-16 09:58:29,437 - root - DEBUG -     becomes myrepos~trunk
2010-06-16 09:58:29,456 - root - INFO -   http://ciserver:8080/job/myrepos~trunk/build?token=xyz

The fourth debug line shows how the path in question matched a particular mapping; the following line shows how the path was transmuted into a Hudson project name. The final line shows the URL called by the Python script to tell Hudson to trigger a build. Note the token=xyz: this same token needs to be set in the project’s configuration page in Hudson: remember to tick the box “Trigger builds remotely (eg from scripts)” and put your authentication token in the box; here, xyz was used.

If there is an error message because the URL fails to reach the Hudson server or gets a 404 rejection, the urlprefix or urlsuffix is probably misconfigured. A 404 can also mean that no Hudson project with the generated name exists.

If none of the rules match, then check that your mapping section contains the correct mappings. Also check that the commit contained a path that should indeed match.

Download

 