The Single Assignment Pattern

‘final’ - underused or overrated?

The famous Gang of Four made people sit up and take notice that we developers often solve similar problems in similar ways, and that these similarities can be effectively expressed as design patterns. Because Java is object-oriented, it is well suited to patterns, and design patterns are now the industry norm. However, there is an important pattern that seems to have been overlooked - probably because it’s so obvious. The Single Assignment pattern can be a major benefit to Java developers.

Java and OO are young programming technologies compared to functional programming. Perhaps it’s useful therefore to learn a thing or two from functional techniques. Indeed, Java has done this:

  • the final keyword allows declared items to be assigned only once
  • immutable objects can be built up from objects containing only private final fields *

To start with, let’s look at the first of these. The term single assignment is used to describe a programming language or representation in which one cannot bind a value to a name if a value has already been bound to that name. This is exactly what the final keyword does when applied to a Java local or member variable declaration. Single assignment prevents some types of side effects, with the aim of reducing software bugs and simplifying debugging.
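
As a minimal sketch of what that means in practice (the class SingleAssignmentBasics and its members are hypothetical names I've made up for illustration), here is final applied to a member variable, to a local, and to a local that is declared first and assigned later. The commented-out lines are the ones the compiler would reject:

```java
class SingleAssignmentBasics
{
    // Member variable: must be assigned exactly once, here or in the constructor.
    private final int start;

    SingleAssignmentBasics( int start )
    {
        this.start = start;       // the single permitted assignment
        // this.start = 0;        // would not compile: start has already been assigned
    }

    int next( int offset )
    {
        final int result = start + offset;   // local bound once
        // result = result + 1;   // would not compile: result is final
        return result;
    }

    static String label( boolean verbose )
    {
        // A final local may be declared without a value and assigned later,
        // provided every path through the code assigns it exactly once.
        final String text;
        if ( verbose )
        {
            text = "verbose output";
        }
        else
        {
            text = "terse";
        }
        return text;
    }
}
```

Note that final constrains the name, not the object it refers to: a final reference can still have methods called on it that change the object's state.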

OK, here’s a simple example. Suppose I have a method that builds up a String containing the alphabet.

public String getLetters()
{
    StringBuilder b = new StringBuilder();
    for (char c = 'a'; c <= 'z'; c++)
    {
        b.append( c );
    }
    return b.toString();
}

That seems fair enough.

Suppose then the spec changes - I get asked to add digits too. Clumsily, I copy and paste and immediately get the following code that won’t compile:

public String getLettersAndNumbers()
{
    StringBuilder b = new StringBuilder();
    for (char c = 'a'; c <= 'z'; c++)
    {
        b.append( c );
    }
    StringBuilder b = new StringBuilder();    // <<< COMPILER RAISES AN ERROR
    for (char c = '0'; c <= '9'; c++)
    {
        b.append( c );
    }
    return b.toString();
}

The variable b is declared twice, so the compiler rejects it. Easy to fix - but just suppose that in my contrived example, I’m having a bad day and yet again I clumsily introduce a new error:

public String getLettersAndNumbers()
{
    StringBuilder b = new StringBuilder();
    for (char c = 'a'; c <= 'z'; c++)
    {
        b.append( c );
    }
    b = new StringBuilder();        // <<< MISTAKE
    for (char c = '0'; c <= '9'; c++)
    {
        b.append( c );
    }
    return b.toString();
}

Now, this compiles OK and I might overlook the fact that it just doesn’t work. It only returns the ten digits; the alphabet is lost when the second StringBuilder is created. It was a silly mistake to assign b twice and maybe you’re thinking no-one is that stupid, but there are many subtler cases in which the mistake is not so easy to spot. It does happen - I’ve seen it and done it myself.
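
To give a flavour of a subtler case, here is a sketch of my own making (the class SubtleRebind and the withDigits flag are hypothetical, not taken from any real code base). The rebind hides inside a conditional, so a quick check with withDigits set to false looks perfectly fine:

```java
class SubtleRebind
{
    // Hypothetical variant of the alphabet builder with an optional digit suffix.
    static String letters( boolean withDigits )
    {
        StringBuilder b = new StringBuilder();
        for (char c = 'a'; c <= 'z'; c++)
        {
            b.append( c );
        }
        if (withDigits)
        {
            b = new StringBuilder();    // accidental rebind, reached only when withDigits is true
            for (char c = '0'; c <= '9'; c++)
            {
                b.append( c );
            }
        }
        return b.toString();
    }
}
```

Calling letters( false ) returns the alphabet as intended, whereas letters( true ) returns only the digits. Had b been declared final, the rebind would not have compiled in the first place.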

The Single Assignment pattern helps prevent this kind of mistake. The final keyword is added to the declaration of b. Now the erroneous second assignment is rejected by the compiler, and once it is removed we have this:

public String getLettersAndNumbers()
{
    final StringBuilder b = new StringBuilder();
    for (char c = 'a'; c <= 'z'; c++)
    {
        b.append( c );
    }
    // b = new StringBuilder();        <<< NO LONGER PERMITTED
    for (char c = '0'; c <= '9'; c++)
    {
        b.append( c );
    }
    return b.toString();
}

OK so that now compiles and works as expected. If we’d had the mindset to use the final keyword (Single Assignment pattern) wherever possible, we’d have avoided this mistake at the start. If we try to use the final keyword for declarations as often as we can, we get some happy consequences:

  • We avoid the kind of silly mistake illustrated above.
  • We find that variable declarations tend to appear just where they're needed, which is known to improve code quality:
    • we get away from the 'C' style of declaring a bunch of variables at the top of a block, just in case we might need them;
    • we eliminate unused declarations;
    • we reduce the scope;
    • objects declared in nested blocks go out of scope when the block ends, so they may become eligible for garbage collection sooner, which can lower the overall memory footprint;
    • depending on any expensive initialisation, later declarations may even perform much better than earlier ones in some cases - notably when they are in a conditional branch that is not reached by the execution path.
  • We find that declarations now fall into two categories:
    1. Single Assignment object declarations: objects that are created once and then operated on;
    2. calculation variables: counters, accumulators and the like.
    This dichotomy is useful because it signals the purpose of each declaration more strongly when someone else reads your code.
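
As a rough sketch of these points taken together (DeclarationStyle and summarise are hypothetical names of my own), the method below declares each variable where it is first needed, keeps the one calculation variable non-final, and only creates the StringBuilder on the branch that actually uses it:

```java
class DeclarationStyle
{
    // Hypothetical method, for illustration only.
    static String summarise( int[] values, boolean detailed )
    {
        int total = 0;                   // calculation variable: an accumulator, reassigned in the loop
        for (final int v : values)       // each iteration gets a fresh single-assignment binding
        {
            total += v;
        }
        if (!detailed)
        {
            return "total=" + total;
        }
        // Declared late, so the builder is never created on the non-detailed path.
        final StringBuilder report = new StringBuilder();   // single-assignment object
        report.append( "total=" ).append( total );
        report.append( ", count=" ).append( values.length );
        return report.toString();
    }
}
```

A reader can see at a glance that total is expected to change while report refers to the same object throughout.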

* Footnote: immutable objects can be built from classes containing only private final fields, but only provided that all non-primitive fields are themselves immutable. Where this condition is not met, a weaker form of immutability is still possible if it is established that the fields cannot be modified in any way, internally or externally, and that references to them are never leaked.
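
To illustrate the weaker form, here is a sketch under my own assumptions (Playlist and its fields are hypothetical names). The List field is not itself immutable, so the class takes a private copy and only ever exposes an unmodifiable view of it:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class Playlist
{
    private final String name;            // String is itself immutable
    private final List<String> tracks;    // List is mutable, so it must not leak

    Playlist( String name, List<String> tracks )
    {
        this.name = name;
        // Defensive copy: later changes to the caller's list cannot reach this object.
        // The unmodifiable wrapper stops callers of getTracks() changing the copy.
        this.tracks = Collections.unmodifiableList( new ArrayList<>( tracks ) );
    }

    String getName()
    {
        return name;
    }

    List<String> getTracks()
    {
        return tracks;    // safe to hand out: an unmodifiable view of a private copy
    }
}
```

Without the copy and the wrapper, the fields would still be private and final, yet the object's state could be changed from outside through the shared list.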

 