Move over Java, Scala has arrived!

26 Oct 2010

java / scala / object oriented / functional programming / computer languages

Scala is a general purpose programming language designed to express common programming patterns in a concise, elegant, and type-safe way. It smoothly integrates features of object-oriented and functional languages. It is also fully interoperable with Java. Code sizes are typically reduced by a factor of two to three when compared to an equivalent Java application.

Scala - The New Kid on the Block

Actually not so very new: Scala was initially developed during 2001 - 2003, and the second version appeared in early 2006.

Only recently, however, has Scala gained clear interest.

The Big Ask is whether the software world is getting ready to adopt functional programming (FP). Once interest in FP has arisen, Scala is a natural candidate, being easily deployed alongside and atop existing Java tools and systems.

Scala is a more advanced language than Java but has been cleverly designed to provide an evolutionary path, rather than starting from square one. Many of the differences are basically trivial, but the overall syntax is more orthogonal and so easier to learn, in spite of the extra effort needed to switch over from Java. Some of the conceptual differences are not at all trivial but bring much benefit in terms of greater expressiveness, robustness and scalability - so they are also worth the learning effort.

By way of a summary, here are some key things existing Java people will need to learn in order to adapt to Scala. This is not intended as a tutorial but as a broad overview of similarities and differences as a preparation for embarking on other tutorials, for which there are links at the end. Hopefully this will whet your appetite, like it did mine!

Scala A B C

One JVM

Scala compiles to the same bytecode as Java (although there is also a .Net version of Scala). Compiled Scala can call compiled Java and vice versa with little fuss.

Less Syntax

Scala makes lots of syntax optional. For example, it is normal to omit the end-of-line semicolon. Also, for methods that take one argument, the ‘.’ conjunction and the parentheses can be omitted. For example, x.add(1) can be simply written x add 1. This is a considerable benefit because it allows domain-specific languages to be built within Scala, merely by carefully designing the APIs.
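As a minimal sketch of the optional dot and parentheses, using a made-up Counter class (the class and its add method are assumptions for illustration, not part of any library):

```scala
// Hypothetical class, invented for this illustration.
class Counter(val value: Int) {
  def add(n: Int): Counter = new Counter(value + n)
}

object LessSyntaxDemo {
  val x = new Counter(1)
  val dotted = x.add(1) // conventional call
  val infix = x add 1   // the same call, with the dot and parentheses omitted
}
```

Both calls compile to the same method invocation; the infix form is purely a reading aid.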

Variables and Values

Scala promotes the use of immutability to improve code reliability. One way this appears is through the equal measure given to vals and vars. All ‘variables’ are either a var, which can be re-assigned, or a val, which can only be assigned once. The type information comes afterwards, so Java’s int x = 1; becomes var x: Int = 1 in Scala; or simply var x = 1 because the type can often be inferred automatically. The equivalent final int x = 1; becomes val x: Int = 1 in Scala, or simply val x = 1. As with Java’s final, if the val refers to a mutable object, it remains mutable even though the val cannot be re-assigned.
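The difference between var and val can be sketched as follows; the names in ValVarDemo are invented for the example:

```scala
import scala.collection.mutable.ListBuffer

object ValVarDemo {
  var counter = 1     // type Int is inferred; a var may be re-assigned
  counter = counter + 1

  val limit: Int = 10 // explicit type; a val cannot be re-assigned
  // limit = 11       // would not compile: reassignment to val

  // The val itself is fixed, but the object it refers to is still mutable.
  val names = ListBuffer("a")
  names += "b"
}
```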

Symbols

Scala allows symbols: these are essentially interned strings and are useful for representing identities consistently. A single tick is used (without any closing tick); for example a symbol literal 'id is a shorthand for the expression scala.Symbol("id").

Packages

Scala allows two forms of the package declaration: the first is like Java’s, the second is an alternative form with curly braces surrounding the following code. The curly-brace form allows nesting of packages within parent packages. Furthermore, Scala introduces package objects to define types, variables, and methods that are visible at the level of the corresponding package.

Constructors

Unlike Java, Scala has two kinds of constructor. Each class must have a primary constructor, whose body is the body of the class itself. Auxiliary constructors can be used to provide different ways to construct instances, but each has to begin by calling a preceding constructor in the same class (normally the primary constructor). This differs from Java, which allows any number of constructors and they don’t need to call each other, although they can. If there is a parent class, only the primary constructor can refer to the parent’s constructor.
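Here is a small sketch of the two kinds of constructor. The Person class is hypothetical:

```scala
// Hypothetical class for illustration only.
class Person(val name: String, val age: Int) {
  // The class body is the body of the primary constructor,
  // so this check runs whenever an instance is created.
  require(age >= 0, "age must not be negative")

  // Auxiliary constructor: its first action must be a call
  // to a constructor defined earlier in the same class.
  def this(name: String) = this(name, 0)
}
```

Either new Person("Bob", 3) or new Person("Ann") can then be used, and both routes end up in the primary constructor.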

Scala’s reduced flexibility confers some advantages and it is argued that the edge cases where this raises difficulties are actually cases where the class design needs some re-work anyway. For example, a Java class that needs two primary constructors has two concerns and should be re-formed as two separate classes.

Visibility

In Scala, everything is public unless marked protected or private. There is no public keyword and Java’s ‘package-private’ is only available indirectly. The rules for protected and private members are slightly more restrictive than Java, and types are treated quite differently. In particular, protected Scala members are not visible to unrelated classes in the same package. Private Scala members are visible only inside the class or object that defines them and its companion; unlike Java, an enclosing class cannot see the private members of its nested classes. Protected Scala types are visible only within the same package and subpackages. Private Scala types are visible only within the same package. Finally, note that both protected and private modifiers can be qualified with the name of a particular package, or class, or this, in order to control the scope more finely.

Static Becomes Companions

The Java keyword static is one of the unfortunate bits of baggage from C. Just what does it mean? …well, several different barely-related things, alas. So it’s a Good Thing that Scala has dropped it. There is no static keyword. Instead, “everything is an object” in Scala - and methods operate on objects.

Instead of static members, Scala has a keyword object that can be used in the place where class would have been used. Scala objects define singletons, i.e. classes for which there is only one instance. We need to remember that the terms object and instance are not interchangeable in Scala (unlike Java) because Scala objects define singletons.

This leads to a particular style of coding: if a class (or a type referring to a class) and an object are declared in the same file, in the same package, and with the same name, they are called a companion class (or companion type) and a companion object, respectively. As with all Scala objects, companion objects are singletons. Object methods are accessed exactly like Java static methods using objectName.method(...); fields of objects are likewise.

Just as static members are commonly used in Java, companions are commonly used in Scala. Java’s static final constants become vals in Scala objects.

The two most interesting methods frequently defined in a companion object are apply and unapply. Both are special because the compiler calls them without the method name being written. apply is invoked when an object is followed directly by an argument list, so it can be defined within a companion object as a factory method for creating instances of its companion class; calling code writes MyClass(x) rather than MyClass.apply(x). This is particularly useful if a factory method selects which concrete subclass to create based on the parameters used. unapply is invoked during pattern matching: it takes an instance and returns the corresponding values, usually as a tuple wrapped in an Option - in effect it unpacks an object.
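A sketch of a companion object with a constant, a factory apply and an extractor unapply. The Celsius class is invented for the example:

```scala
// Hypothetical class with a private constructor, so that
// instances can only be created through the companion's factory.
class Celsius private (val degrees: Double)

object Celsius {
  // Plays the role of a Java static final constant.
  val AbsoluteZero = -273.15

  // Factory: Celsius(21.5) is rewritten by the compiler to Celsius.apply(21.5).
  def apply(degrees: Double): Celsius = {
    require(degrees >= AbsoluteZero, "below absolute zero")
    new Celsius(degrees)
  }

  // Extractor: used by pattern matching to unpack an instance.
  def unapply(c: Celsius): Option[Double] = Some(c.degrees)
}

object CompanionDemo {
  val t = Celsius(21.5)
  val described = t match {
    case Celsius(d) => "temperature " + d
  }
}
```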

Java has static blocks for arbitrary initialisation code. These are not needed either: class code belongs to a particular method/constructor, or by default to the primary constructor. If this is in a Scala class, it executes when instances are constructed; if it is in a Scala object, it executes when the singleton is first used.

Null Pointers

Coding that involves null pointers is discouraged in Scala. The literal null is still available, however, and can be assigned to any reference type with the same effect as in Java.

In preference to using nulls, however, the standard Scala Option and its subclasses Some and None provide a way to write methods that return values that might imply non-existence of something. Testing for None produces clearer code than a null pointer check and ensures that the reference in question is never actually null. This is a form of defensive programming encouraged by Tony Hoare, inventor of null pointers (Tony Hoare, Null References: The Billion Dollar Mistake).
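As a rough sketch of Option in use, here is a lookup in a Map, whose get method returns an Option. The data and the describe method are made up:

```scala
object OptionDemo {
  val capitals = Map("France" -> "Paris")

  // Map.get returns Some(value) or None, never null.
  def describe(country: String): String = capitals.get(country) match {
    case Some(city) => "capital is " + city
    case None       => "unknown"
  }

  // getOrElse supplies a fallback without an explicit test.
  val fallback = capitals.get("Atlantis").getOrElse("unknown")
}
```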

Methods or Fields: Uniform Access

A public field may be accessed just like in Java. For example, var count = 0 is the Scala version of public int count = 0;. But such public fields are discouraged in Java; getter and setter methods are considered better style. However, this distinction is irrelevant in Scala. The public field can be replaced by a private field with public getter and setter methods, but in Scala, the methods have the same name* as the original field so client code does not have to be changed in order to use them. This is Scala’s principle of uniform access: it doesn’t matter whether fields are public or encapsulated behind a getter/setter pair. Clients don’t know or care whether they are manipulating fields or calling methods to manipulate the fields.

* actually, the setter has an extended version of the same name; for example the field `count` is represented by a getter called `count` and a setter called `count_=`.

Because Scala allows uniform access, it requires that every member has a unique name. So, in the example above, the count field must be given a different name if the new methods that replace it are to take the name count. (Remember that Java’s field names must be unique and its method names must be unique but members as a whole need not be unique.)
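A sketch of uniform access, assuming a hypothetical Visits class whose count started life as a plain public var and was later encapsulated:

```scala
// Hypothetical class: 'count' was once 'var count = 0'.
class Visits {
  private var current = 0 // the field now needs a different name

  def count: Int = current // getter, same name as the old field

  def count_=(n: Int): Unit = { // setter, named count_=
    require(n >= 0, "count must not be negative")
    current = n
  }
}

object UniformAccessDemo {
  val v = new Visits
  v.count = 5           // calls count_=(5), yet reads like a field assignment
  v.count = v.count + 1 // getter and setter together
}
```

Client code such as v.count = 5 is identical before and after the change, which is the whole point.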

Scala’s universality of fields and methods totally transforms ‘pojo’ data objects, giving a considerable simplification and reduction of code lines so that the overall purpose is more clear.

Generics

Scala’s parameterized classes are broadly similar to Java’s, except they eschew Java’s angle-bracket syntax, preferring square brackets (e.g. List[String]) instead. Java expresses variance at the point of use with its verbose wildcards, ? extends X and ? super X. Scala normally declares variance once, where the class is defined: a type parameter written [+T] is covariant and [-T] is contravariant. Upper and lower bounds on a type parameter are written [T <: X] and [T >: X]. Java’s generic classes are themselves always nonvariant; only its arrays are covariant.
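To make the annotations concrete, here is a sketch with two invented types, Box and Printer. A [+A] parameter lets a Box[String] stand in for a Box[Any]; a [-A] parameter works the other way round. The longest method additionally assumes the <: notation for an upper bound:

```scala
// Hypothetical types for illustration.
class Box[+A](val item: A) // covariant in A

trait Printer[-A] {        // contravariant in A
  def print(a: A): String
}

object VarianceDemo {
  val stringBox: Box[String] = new Box("hi")
  val anyBox: Box[Any] = stringBox // allowed because Box is covariant

  val anyPrinter: Printer[Any] = new Printer[Any] {
    def print(a: Any): String = "<" + a.toString + ">"
  }
  val stringPrinter: Printer[String] = anyPrinter // allowed by contravariance

  // Upper bound: T must be a subtype of CharSequence.
  def longest[T <: CharSequence](a: T, b: T): T =
    if (a.length() >= b.length()) a else b
}
```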

Unlike Scala classes, Scala objects (described above) are instantiated by the runtime and cannot have any parameters.

Arrays

Scala supports mutable arrays in a similar way to Java, albeit with some consistency improvements. Also, the syntax has changed to use parentheses for indexing, because square brackets are used for generic types. So Java’s final String[] a = new String[5]; would be val a = new Array[String](5) instead; access to the zeroth element would use a(0). Actually, Array is a standard API class, not a language syntax element, but the compiler uses the JVM support for arrays for performance to match Java’s.
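A short sketch of array creation and indexing; the values are arbitrary:

```scala
object ArrayDemo {
  val a = new Array[String](5) // five elements, all initially null
  a(0) = "first"               // parentheses, not square brackets

  val primes = Array(2, 3, 5)  // element type Int is inferred
  val total = primes.sum
}
```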

Tuples

Tuples are lists of values of diverse types… so think method parameter lists. Indeed the syntax is the same: a tuple is expressed with parentheses surrounding a comma-separated list of values (or types, when the type of the tuple is expressed). So, (1, "A") is a simple literal tuple with two values, and (Int, String) is its type. Tuples are easy to use and Java sorely lacks them.
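A brief sketch of tuples being built, taken apart and returned from a method. The minMax method is invented for the example:

```scala
object TupleDemo {
  val pair: (Int, String) = (1, "A")
  val first = pair._1         // elements are numbered from _1
  val (number, letter) = pair // or unpack everything in one go

  // A method returning two results at once.
  def minMax(xs: Seq[Int]): (Int, Int) = (xs.min, xs.max)
  val (lowest, highest) = minMax(Seq(3, 1, 2))
}
```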

Case Classes

Scala allows classes to be declared as case classes, which are intended to be used in pattern matching and message passing situations. For such classes, the compiler automatically converts the constructor arguments into immutable fields (vals). Also, the compiler automatically adds equals, hashCode, toString and copy methods to the class, which use the fields specified as constructor arguments. For classes that wrap groups of values, this is a big convenience. The automatic copy method uses named parameters with default values from the original instance, allowing different values to be supplied for any of the fields.

Instant immutable POJOs!
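For a feel of how little code is needed, here is a hypothetical Point case class and the methods the compiler provides for it:

```scala
// One line declares the class, its two immutable fields,
// and equals, hashCode, toString and copy.
case class Point(x: Int, y: Int)

object CaseClassDemo {
  val p = Point(1, 2)            // no 'new' needed
  val moved = p.copy(y = 5)      // copy with one field changed
  val same = p == Point(1, 2)    // structural equality
  val text = p.toString          // "Point(1,2)"
}
```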

Functions and Methods

Scala methods are always introduced with the def keyword. Unlike Java, methods can contain nested methods - handy because the inner method has the formal parameters of the outer method in scope.

Because Scala methods can return tuples (i.e. several results at once), it is easier to write pure functions that don’t have side effects - and this is encouraged. Conversely, no-result methods return Unit instead of void (a keyword not available in Scala) and normally have side effects, making them procedures rather than functions: this useful distinction is present in many languages but not in the C/C++/Java family.

Parameter lists can take a variable number of arguments. Only the syntax differs - so the Java params (String... x) becomes (x: String*). Inside the method, Java presents the arguments as an array, whereas Scala presents them as a sequence (Seq[String]).

Functions can be assigned and passed as parameters to other functions, which is ideal for visitor patterns and other such usage. It avoids a lot of the heavy syntax in Java of declaring interfaces or anonymous classes with one method, merely to pass that as a function. A method can also have more than one parameter list: this is useful when groups of arguments become known at different times in the call graph, because the method can be supplied with its first group and the resulting function passed on.
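The points in this section can be sketched together as follows. All the names in FunctionDemo are invented:

```scala
object FunctionDemo {
  // Nested method: 'loop' can see the outer parameter n.
  def factorial(n: Int): BigInt = {
    def loop(i: Int, acc: BigInt): BigInt =
      if (i > n) acc else loop(i + 1, acc * i)
    loop(1, BigInt(1))
  }

  // Variable number of arguments.
  def total(xs: Int*): Int = xs.sum

  // A function held in a val and passed to another method.
  val double: Int => Int = x => x * 2
  def applyTwice(f: Int => Int, x: Int): Int = f(f(x))

  // Two parameter lists: supply the first now, the second later.
  def scale(factor: Int)(x: Int): Int = factor * x
  val triple: Int => Int = scale(3)
}
```

Here applyTwice(double, 3) doubles 3 twice, and triple is scale with its first parameter list already filled in.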

Inheritance

Scala shares Java’s single-inheritance paradigm for classes. Scala adds traits as a way of providing ‘mix-in’ behaviour; traits provide interfaces.

Unlike Java, there is no explicit interface in Scala. Traits provide interfaces and yet they can optionally have method bodies and/or fields, so they look rather more like Java’s abstract classes than its interfaces. If a trait’s methods are all abstract and it has no fields, such a trait is exactly the same as a Java interface. Otherwise, its methods and fields are mixed into each class that uses the trait. Any class can mix in as many traits as it wants. So Scala provides both the clarity of single inheritance and the code reduction possible with multiple inheritance. The order in which traits are mixed into a class is significant: a clear rule defines the precedence that eliminates any ambiguity.
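A sketch of mix-ins and of why their order matters. The traits and classes are made up; the trait named last is applied first, and its super call reaches the trait to its left:

```scala
// Hypothetical traits for illustration.
trait Greeter {
  def name: String                        // abstract, like an interface method
  def greet: String = "Hello, " + name    // concrete, unlike a Java interface
}

trait Shouter extends Greeter {
  override def greet: String = super.greet.toUpperCase
}

trait Polite extends Greeter {
  override def greet: String = super.greet + ", pleased to meet you"
}

// Same traits, different order, different result.
class LoudThenPolite(val name: String) extends Shouter with Polite
class PoliteThenLoud(val name: String) extends Polite with Shouter
```

In LoudThenPolite the greeting is shouted and then the polite suffix is appended in lower case; in PoliteThenLoud the whole sentence, suffix included, is shouted.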

Scala’s ancestor of all classes is Any, which includes all objects and values (i.e. Java primitives). AnyRef, a subclass of Any, is equivalent to Java’s Object.

Whereas Java has the optional @Override annotation for overridden members, Scala requires the use of its override modifier whenever a member overrides something. This is a Good Thing because it allows the compiler to be more thorough. Both languages have a final modifier that shares the same semantics. In Scala, abstract methods simply have no body; they don’t need an abstract modifier. However, any class containing such methods is abstract and must be marked as such. The effect of these differences is for Scala to make the relationships between base and derived classes clearer than they are in Java.

Scala allows fields to be abstract, as well as methods, whereas Java only allows methods to be abstract. Furthermore, the type of an abstract field can itself be abstract, allowing base classes to be highly general. Indeed, this provides a useful alternative to using generic classes (which are called parameterized classes in Scala).
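The following sketch, built from invented classes, shows an abstract field, an abstract method, the override modifier and an abstract type member:

```scala
// Hypothetical classes for illustration.
abstract class Shape {
  val name: String  // abstract field: no initial value
  def area: Double  // abstract method: no body
  def describe: String = name + " of area " + area
}

class Square(side: Double) extends Shape {
  val name = "square"
  def area: Double = side * side
  // 'override' is compulsory because describe is concrete in Shape.
  override def describe: String = "a " + super.describe
}

// An abstract type member in place of a type parameter.
abstract class Container {
  type Item
  def items: List[Item]
  def first: Item = items.head
}

class Names extends Container {
  type Item = String
  val items = List("ann", "bob")
}
```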

No Ternaries

Java’s ternary operator, e.g. int x = expression ? a : b, is replaced with if, which returns a value, e.g. val x = if (expression) a else b. This fits nicely into the functional style, in which the normal behaviour of functions is to compute and return values. Naturally, the if construct is itself an expression that yields a value.
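A tiny sketch, with a made-up sign method, of if used as a value:

```scala
object IfDemo {
  // The chained if/else is a single expression; its value is the result.
  def sign(n: Int): String =
    if (n < 0) "negative" else if (n == 0) "zero" else "positive"
}
```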

Switch Becomes Match With Patterns

Java’s switch is exactly like that in C from 1972: a jump table designed for compilers not humans. Terms must be compile-time constants such as integer literals. Each case falls through to the next unless a break statement is present.

Scala sweeps this aside with the simpler, more useful match. Like if, match is functional because it returns the selected value. Each case is an expression that either uses simple constants (including strings), or a range of powerful pattern-matching declarations such as regular expressions.

No fall-through happens so no break is needed. The default case is represented by the standard pattern _ (just an underscore, the symbol used in Scala in imports etc for matching everything).
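A sketch of match with constants, a regular expression, type tests with guards, tuples and the default case. The method names and the date pattern are assumptions for the example:

```scala
object MatchDemo {
  // A Regex can be used directly as a pattern; groups become variables.
  val IsoDate = """(\d{4})-(\d{2})-(\d{2})""".r

  def describe(s: String): String = s match {
    case ""                  => "empty"
    case "hello"             => "a greeting"
    case IsoDate(year, _, _) => "a date in " + year
    case _                   => "something else"
  }

  def classify(x: Any): String = x match {
    case 0               => "zero"
    case n: Int if n < 0 => "a negative number"
    case (a, b)          => "a pair of " + a + " and " + b
    case _               => "something else"
  }
}
```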

Easy Enumerations

Scala provides enumerations easily but doesn’t need language support for them because its type system is more capable than Java’s; so there is no enum keyword. Instead, objects are defined that extend scala.Enumeration and include the required enumerations as vals.
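As a minimal sketch, here is a hypothetical Colour enumeration built on scala.Enumeration:

```scala
// Each val is assigned a fresh Value, numbered from zero.
object Colour extends Enumeration {
  val Red, Green, Blue = Value
}

object EnumDemo {
  val count = Colour.values.size       // number of members
  val green = Colour.withName("Green") // look a member up by name
}
```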

No Break, Continue or Return

Java’s break, continue and return are rarely needed in Scala. The language has no break or continue statements, and although it does have a return keyword, idiomatic code seldom uses it. Apart from break in switch statements, neither break nor continue is strictly necessary in Java either; in Scala they are replaced using if or boolean flags. This is a more functional style of programming which you should find easy once you get the knack. It’s a Good Thing: after all, the goto statement is disallowed, being widely recognised as bad practice. But break, continue and return are merely subtle forms of goto.

Functions return values by ending with an expression that provides the result. When the return keyword is not used, a function has a single exit point, which encourages a simpler style of programming than when there are many return statements.
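A sketch of loops that stop early without break, continue or return. The methods are invented; the first two lean on the standard collection methods find and takeWhile, the third uses a flag in the loop condition:

```scala
object NoBreakDemo {
  // Instead of looping and breaking on the first hit.
  def firstNegative(xs: List[Int]): Option[Int] = xs.find(_ < 0)

  // Instead of summing in a loop and breaking at a negative.
  def sumUntilNegative(xs: List[Int]): Int = xs.takeWhile(_ >= 0).sum

  // An imperative loop where the condition carries the "break".
  def indexOfFirstNegative(xs: Array[Int]): Int = {
    var i = 0
    var found = -1
    while (found < 0 && i < xs.length) {
      if (xs(i) < 0) found = i
      i += 1
    }
    found // the final expression is the result
  }
}
```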

Operator Overloading

Broadly speaking, Scala function names can contain arbitrary symbols, albeit with a few restrictions. Along with the reduced syntax (see above), clean operator overloading is easily achieved.
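For instance, a hypothetical two-dimensional vector can define +, * and unary minus as ordinary methods:

```scala
// Hypothetical class; the operators are just methods with symbolic names.
case class Vec2(x: Int, y: Int) {
  def +(other: Vec2): Vec2 = Vec2(x + other.x, y + other.y)
  def *(k: Int): Vec2 = Vec2(x * k, y * k)
  def unary_- : Vec2 = Vec2(-x, -y) // prefix minus
}

object OperatorDemo {
  val sum = Vec2(1, 2) + Vec2(3, 4) // same as Vec2(1, 2).+(Vec2(3, 4))
  val scaled = Vec2(1, 2) * 2
  val negated = -Vec2(1, 2)
}
```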

Explicit Overriding

In Java, overriding happens when a non-private field or method in a superclass is replaced by a non-private field or method with the same name in a subclass. Scala allows similar overriding, but differs because it has an override keyword and requires that it be used when intended (compiler errors result if used when unintended or not used when intended). This makes your code clearer and more robust.

Implicit Conversions

Scala allows conversion functions to be declared that take values of one type and return values of a different type. An implicit keyword is inserted before the function def so that the compiler knows this is a function it can apply automatically when an expression would otherwise fail to type-check. This clever feature makes it easy to write wrapper classes that add methods and then have these wrappers used automatically. This gives the effect of dynamic class extension, but it’s done in a way that is type-safe and handled by the compiler.

For example, if you have an object of type Foo, and your function expects an object of type Bar, then if you have a function which does a Foo to Bar conversion, it gets injected in by the compiler whenever a Bar is expected but a Foo is actually provided instead.
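The Foo to Bar case can be sketched like this. The contents of Foo and Bar, and the names fooToBar and show, are assumptions; the import merely acknowledges that the feature is being used deliberately:

```scala
import scala.language.implicitConversions

// Hypothetical classes standing in for the Foo and Bar of the text.
class Foo(val n: Int)
class Bar(val text: String)

object ImplicitDemo {
  // The conversion the compiler is allowed to insert.
  implicit def fooToBar(foo: Foo): Bar = new Bar("foo #" + foo.n)

  def show(bar: Bar): String = bar.text

  // A Foo is supplied where a Bar is expected,
  // so the compiler rewrites this to show(fooToBar(new Foo(7))).
  val shown = show(new Foo(7))
}
```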

Equality

Java and Scala implement equality testing differently. They both have a method called equals and the operator == (which is implemented as a method in Scala). However, == is different between the two languages: Scala follows Ruby’s practice of value testing instead of Java’s practice of reference testing. In Scala, == simply invokes equals. The != operator is simply the negation of ==.

Scala provides new operators eq and ne for testing references. So eq is the same as Java’s == (and ne the same as Java’s !=) applied to references, whereas == is the same as Java’s equals.
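A short sketch of the difference, using two distinct String instances with the same contents:

```scala
object EqualityDemo {
  val a = new String("scala")
  val b = new String("scala") // equal contents, different object

  val sameValue = a == b        // true: compares values via equals
  val sameReference = a eq b    // false: compares references
  val differentReference = a ne b // true: negation of eq
}
```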

Main Entry

Just as in Java, Scala’s main entry point is a method called main taking an array of strings, i.e. def main(args: Array[String]). It has to be in an object, not a class, and this usually cannot be a companion object.
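A minimal sketch of an entry point; the Hello object and its greeting helper are invented:

```scala
object Hello {
  // Kept separate from main so that it can be called directly.
  def greeting(args: Array[String]): String = "Hello, " + args.mkString(" ")

  // The entry point: a method named main in an object.
  def main(args: Array[String]): Unit = println(greeting(args))
}
```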

Checked Exceptions

Scala does not have checked exceptions, unlike Java. Even Java’s checked exceptions are treated as unchecked by Scala. There is also no throws clause on method declarations. However, there is a @throws annotation that is useful for Java interoperability.
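To illustrate, here is a sketch in which a method throws Java’s checked IOException without declaring it in any throws clause; the @throws annotation is only there for the benefit of Java callers. The method names are made up:

```scala
import java.io.IOException

object ExceptionDemo {
  // No throws clause; the annotation records the exception for Java code.
  @throws(classOf[IOException])
  def risky(fail: Boolean): String = {
    if (fail) throw new IOException("boom")
    "ok"
  }

  // Callers may catch the exception but are not forced to.
  def safe(fail: Boolean): String =
    try risky(fail)
    catch {
      case e: IOException => "recovered from " + e.getMessage
    }
}
```

Note that catch uses the same case syntax as match, and that try is itself an expression with a value.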

Built-in XML

Java has library support for XML. Unfortunately, there are many such libraries and none is comprehensive, although they have been very widely used and are therefore clearly successful. There’s W3C DOM, JDOM, XmlBeans, XStream, XBI, JSP and JSTL to name but a few.

By contrast, XML is built into Scala. Rather like PHP and its peers, Scala allows the source code to flip between Scala’s syntax and templated XML. The standard library provides an API for searching for nodes à la XPath. Serialization and deserialization are easily achieved.

In Scala, anywhere a value can appear, XML text can appear, surrounded of course by the enclosing tags, e.g. val fruit = <item>apple</item> defines a value fruit which is a reference to a scala.xml.Node instance containing the item element and a text node within it. Scala expressions can appear within XML literals using curly braces, e.g. <list>{fruit}</list> will replace the {fruit} with the <item>apple</item> we declared earlier.

.NET As Well

This article has focussed on Scala and Java. We’ve talked about Scala’s clean integration with the Java platform, which allows Scala and Java programs to be integrated seamlessly on a single JVM. Importantly, the same is also true for Scala with C# programs on the .NET platform. Scala therefore provides an exciting way to write portable cross-platform code, reducing the development effort needed when this is a requirement or providing future-proofing when this requirement isn’t yet evident. Either way, it’s a major benefit of using Scala.

Pedigree

As a final note, people in the team behind Scala had earlier been heavily involved in Java’s development, especially in compiler writing. You may dislike some of the decisions described above, but they all have a purpose and it’s worth finding out what that is. Scala has the feel of a well researched and well thought through system, comprising the language, quality APIs and good tools. It’s a good time to get on board.