Scala is a general purpose programming language designed to express common programming patterns in a concise, elegant, and type-safe way. It smoothly integrates features of object-oriented and functional languages. It is also fully interoperable with Java. Code sizes are typically reduced by a factor of two to three when compared to an equivalent Java application.
Scala - The New Kid on the Block
Actually not so very new: Scala was initially developed during 2001 - 2003, and the second version appeared in early 2006.
Only recently, however, has Scala gained clear interest.
The Big Ask is whether the software world is getting ready to adopt functional programming (FP). Once interest in FP has arisen, Scala is a natural candidate, being easily deployed alongside and atop existing Java tools and systems.
Scala is a more advanced language than Java but has been cleverly designed to provide an evolutionary path, rather than starting from square one. Many of the differences are basically trivial, but the overall syntax is more orthogonal and so easier to learn, in spite of the extra effort needed to switch over from Java. Some of the conceptual differences are not at all trivial but bring much benefit in terms of greater expressiveness, robustness and scalability - so they are also worth the learning effort.
By way of a summary, here are some key things existing Java people will need to learn in order to adapt to Scala. This is not intended as a tutorial but as a broad overview of similarities and differences as a preparation for embarking on other tutorials, for which there are links at the end. Hopefully this will whet your appetite, like it did mine!
Scala A B C
One JVM
Scala compiles to the same bytecode as Java (although there is also a .Net version of Scala). Compiled Scala can call compiled Java and vice versa with little fuss.
Less Syntax
Scala makes lots of syntax optional. For example, it is normal to miss out the end-of-line semicolon.
Also, for methods that take one argument, the ‘.’ conjunction and the parentheses can be omitted. For example, x.add(1) can be simply written x add 1. This is a considerable benefit because it allows domain-specific language extensions to Scala, merely by carefully designing the APIs.
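As a minimal sketch of the infix form, the following uses a made-up Counter class (an assumption for illustration, not a library type) whose one-argument add method can be called either way:

```scala
object InfixDemo {
  // Hypothetical class, defined here only to illustrate the syntax.
  class Counter(val value: Int) {
    def add(n: Int): Counter = new Counter(value + n)
  }

  val x = new Counter(1)
  val dotted = x.add(1) // conventional call with dot and parentheses
  val infix = x add 1   // the same call written in infix form
}
```

Both calls compile to the same method invocation; the infix form only changes how the source reads.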
Variables and Values
Scala promotes the use of immutability to improve code reliability. One way this appears is through the equal measure given to vals and vars. All ‘variables’ are either a var, which can be re-assigned, or a val, which can only be assigned once. The type information comes afterwards, so Java’s int x = 1; becomes var x: Int = 1 in Scala; or simply var x = 1 because the type can often be inferred automatically. The equivalent final int x = 1; becomes val x: Int = 1 in Scala, or simply val x = 1. As with Java’s final, if the val refers to a mutable object, the object remains mutable even though the val cannot be re-assigned.
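A short sketch of these rules follows; the names counter, limit and names are illustrative only:

```scala
import scala.collection.mutable.ArrayBuffer

object ValVarDemo {
  var counter: Int = 1  // re-assignable, with an explicit type
  var inferred = 1      // the type Int is inferred
  val limit: Int = 10   // can only be assigned once

  counter = counter + 1 // fine: counter is a var
  // limit = 11         // would not compile: reassignment to val

  // The val fixes the reference, not the contents of the object it refers to.
  val names = ArrayBuffer("a")
  names += "b"
}
```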
Symbols
Scala allows symbols: these are essentially interned strings and are useful for representing identities consistently. A single tick is used (without any closing tick); for example a symbol literal 'id is a shorthand for the expression scala.Symbol("id").
Packages
Scala allows two forms of the package declaration: the first is like Java’s, the second is an alternative form with curly braces surrounding the following code. The curly-brace form allows nesting of packages within parent packages. Furthermore, Scala introduces package objects to define types, variables, and methods that are visible at the level of the corresponding package.
Constructors
Unlike Java, Scala has two kinds of constructor. Each class has a primary constructor, whose body is the rest of the class. Auxiliary constructors can be used to provide different ways to construct instances, but each has to call a preceding constructor in the same class (normally the primary constructor). This differs from Java, which allows any number of constructors and they don’t need to call each other, although they can. If there is a parent class, only the primary constructor can refer to the parent’s constructor.
Scala’s reduced flexibility confers some advantages and it is argued that the edge cases where this raises difficulties are actually cases where the class design needs some re-work anyway. For example, a Java class that needs two primary constructors has two concerns and should be re-formed as two separate classes.
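The two kinds of constructor can be sketched as follows. Point and Point3D are hypothetical classes invented for this example:

```scala
object ConstructorDemo {
  // Primary constructor: the parameter list after the class name.
  // Statements in the class body run as part of it.
  class Point(val x: Int, val y: Int) {
    require(x >= 0 && y >= 0, "coordinates must be non-negative")

    // Auxiliary constructor: it must start by calling a preceding constructor.
    def this(both: Int) = this(both, both)
  }

  // Only the primary constructor of the subclass passes arguments to the parent.
  class Point3D(px: Int, py: Int, val z: Int) extends Point(px, py)

  val diagonal = new Point(3)
  val solid = new Point3D(1, 2, 3)
}
```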
Visibility
In Scala, everything is public unless marked protected or private. There is no public keyword and Java’s ‘package-private’ is only available indirectly. The rules for protected and private members are slightly more restrictive than Java, and types are treated quite differently. In particular, protected Scala members are not visible to unrelated classes in the same package. Private Scala members are visible only inside the defining class and its companion object; unlike Java, an enclosing class cannot see the private members of its nested classes. Protected Scala types are visible only within the same package and subpackages. Private Scala types are visible only within the same package. Finally, note that both protected and private modifiers can be qualified with the name of a particular package, or class, or this, in order to control the scope more finely.
Static Becomes Companions
The Java keyword static is one of the unfortunate bits of baggage from C. Just what does it mean? …well, several different barely-related things, alas. So it’s a Good Thing that Scala has dropped it. There is no static keyword. Instead, “everything is an object” in Scala - and methods operate on objects.
Instead of static members, Scala has a keyword object that can be used in the place where class would have been used. Scala objects define singletons, i.e. classes for which there is only one instance. We need to remember that the terms object and instance are not interchangeable in Scala (unlike Java) because Scala objects define singletons.
This leads to a particular style of coding: if a class (or a type referring to a class) and an object are declared in the same file, in the same package, and with the same name, they are called a companion class (or companion type) and a companion object, respectively. As with all Scala objects, companion objects are singletons. Object methods are accessed exactly like Java static methods using objectName.method(...); fields of objects are likewise.
Just as static members are commonly used in Java, companions are commonly used in Scala. Java’s static final constants become vals in Scala objects.
The two most interesting methods frequently defined in a companion object are apply and unapply. These are special methods that the compiler invokes without their names being written: apply is called when the object is applied to arguments as if it were a function, and unapply is called when the object’s name is used as a pattern. So apply can be defined within a companion object as a factory method for creating instances of its companion class. The calling code does not need to invoke it directly. This is particularly useful if a factory method selects which concrete subclass to create based on the parameters used. unapply takes an instance and returns its constituent values, typically as a tuple wrapped in an Option - in effect it unpacks an object.
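The companion pattern might look like the sketch below. Temperature and its members are hypothetical names chosen for illustration:

```scala
object CompanionDemo {
  // The constructor is private, so instances are created through the companion.
  class Temperature private (val degrees: Double)

  // Companion object: same name, same scope as the class.
  object Temperature {
    val AbsoluteZero = -273.15 // plays the role of a Java static final constant

    // Factory method: Temperature(x) is rewritten to Temperature.apply(x).
    def apply(degrees: Double): Temperature =
      new Temperature(math.max(degrees, AbsoluteZero))

    // Extractor: used when Temperature(...) appears as a pattern.
    def unapply(t: Temperature): Option[Double] = Some(t.degrees)
  }

  val warm = Temperature(21.5)
  val clamped = Temperature(-500.0)

  def degreesOf(t: Temperature): Double = t match {
    case Temperature(d) => d
  }
}
```

The companion can call the private constructor because companions share access to each other’s private members.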
Java has static blocks for arbitrary initialisation code. These are not needed either: code in a class or object body that is not inside a method belongs to the primary constructor. If this is in a Scala class, it executes whenever an instance is constructed; if it is in an object, it executes once, when the object is first used.
Null Pointers
Coding that involves null pointers is discouraged in Scala. The value null still exists (it is the only value of the type Null) and can be assigned to any reference type with the same effect as in Java.
In preference to using nulls, however, the standard Scala Option and its subtypes Some and None provide a way to write methods that return values that might imply non-existence of something. Testing for None produces clearer code than a null pointer check and, provided a null is never wrapped in a Some, means that the reference in question is never actually null. This is a form of defensive programming encouraged by Tony Hoare, inventor of null pointers (Tony Hoare, Null References: The Billion Dollar Mistake).
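A small sketch of Option in use follows. The capitals map and the method names are assumptions made for the example; Map.get returning an Option is standard library behaviour:

```scala
object OptionDemo {
  val capitals = Map("France" -> "Paris", "Japan" -> "Tokyo")

  // The return type tells the caller that there may be no answer.
  def capitalOf(country: String): Option[String] = capitals.get(country)

  def describe(country: String): String = capitalOf(country) match {
    case Some(city) => s"The capital of $country is $city"
    case None       => s"No capital known for $country"
  }

  // A default can be supplied without any explicit test.
  val fallback = capitalOf("Atlantis").getOrElse("unknown")
}
```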
Methods or Fields: Uniform Access
A public field may be accessed just like in Java. For example, var count = 0 is the Scala version of Java’s public int count = 0; declaration. But such public fields are discouraged in Java; getter and setter methods are considered better style. However, this distinction is irrelevant in Scala. The public field can be replaced by a private field with public getter and setter methods, but in Scala, the methods have the same name* as the original field so client code does not have to be changed so as to use them. This is Scala’s principle of uniform access: it doesn’t matter whether fields are public or encapsulated behind a getter/setter pair. Clients don’t know or care whether they are manipulating fields or calling methods to manipulate the fields.
* actually, the setter has an extended version of the same name; for example the field `count` is represented by a getter called `count` and a setter called `count_=`.
Because Scala allows uniform access, it requires that every member has a unique name. So, in the example above, the count field must be given a different name if the new methods that replace it are to take the name count. (Remember that in Java a field and a method may share a name, because fields and methods are kept in separate namespaces.)
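The switch from a public field to a getter/setter pair can be sketched like this. CounterV1 and CounterV2 are hypothetical before-and-after versions of one class, and _count is just one conventional choice for the renamed private field:

```scala
object UniformAccessDemo {
  // Before: a plain public field.
  class CounterV1 {
    var count = 0
  }

  // After: a private field behind methods that keep the public name.
  class CounterV2 {
    private var _count = 0
    def count: Int = _count
    def count_=(n: Int): Unit = {
      require(n >= 0, "count must not be negative")
      _count = n
    }
  }

  val before = new CounterV1
  before.count = 5

  val after = new CounterV2
  after.count = 5 // same client syntax; this now calls after.count_=(5)
}
```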
Scala’s uniform treatment of fields and methods totally transforms ‘pojo’ data objects, giving a considerable simplification and reduction of code lines so that the overall purpose is clearer.
Generics
Scala’s parameterized classes are broadly similar to Java’s, except they eschew Java’s angle-bracket syntax, preferring square brackets (e.g. List[String]) instead. The two languages handle variance differently. Java expresses it at each point of use with wildcards: ? extends X for covariant use and ? super X for contravariant use. Scala lets a class declare its variance once, in its own definition: [+A] makes a type parameter covariant and [-A] makes it contravariant. Upper and lower type bounds are written separately, as [A <: X] and [A >: X]. Without wildcards, Java’s generic types are nonvariant; only its arrays are covariant.
Unlike Scala classes, Scala objects (described above) are instantiated by the runtime and cannot have any parameters.
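As a hedged illustration of variance annotations and type bounds, the sketch below uses invented Animal, Cat, Box and Printer types:

```scala
object VarianceDemo {
  class Animal { def name: String = "animal" }
  class Cat extends Animal { override def name: String = "cat" }

  // Covariant in A: a Box[Cat] can be used where a Box[Animal] is expected.
  class Box[+A](val content: A)

  // Contravariant in A: a Printer[Animal] can be used where a Printer[Cat] is expected.
  class Printer[-A](val render: A => String)

  // Upper bound: A must be Animal or one of its subtypes.
  def nameOf[A <: Animal](box: Box[A]): String = box.content.name

  val animalBox: Box[Animal] = new Box[Cat](new Cat)
  val catPrinter: Printer[Cat] = new Printer[Animal](a => "a " + a.name)
}
```

Removing the + or - from either class makes the corresponding assignment fail to compile, which is the nonvariant behaviour.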
Arrays
Scala supports mutable arrays in a similar way to Java, albeit with some consistency improvements. Also, the syntax has changed to use parentheses for indexing, because square brackets are used for generic types. So Java’s final String[] a = new String[5]; would be val a = new Array[String](5) instead; access to the zeroth element would use a(0). Actually, Array is a standard API class, not a language syntax element, but the compiler uses the JVM support for arrays so that performance matches Java’s.
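The array syntax described above, as a brief sketch:

```scala
object ArrayDemo {
  val a = new Array[String](5) // five elements, each initially null
  a(0) = "first"               // shorthand for a.update(0, "first")
  val zeroth = a(0)            // shorthand for a.apply(0)

  val filled = Array(1, 2, 3)  // factory method on the Array companion
}
```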
Tuples
Tuples are lists of values of diverse types… so think method parameter lists. Indeed the syntax is the same: a tuple is expressed with parentheses surrounding a comma-separated list of values (or types, when the type of the tuple is expressed). So, (1, "A") is a simple literal tuple with two values, and (Int, String) is its type. Tuples are easy to use and Java sorely lacks them.
Case Classes
Scala allows classes to be declared as case classes, which are intended to be used in pattern matching and message passing situations. For such classes, the compiler automatically converts the constructor arguments into immutable fields (vals). Also, the compiler automatically adds equals, hashCode, toString and copy methods to the class, which use the fields specified as constructor arguments. For classes that wrap groups of values, this is a big convenience. The automatic copy method uses named parameters with default values from the original instance, allowing different values to be supplied for any of the fields.
Instant immutable POJOs!
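For instance, with a hypothetical Person case class, the generated members behave like this:

```scala
object CaseClassDemo {
  case class Person(name: String, age: Int)

  val alice = Person("Alice", 30)             // no 'new' needed: a factory is generated too
  val older = alice.copy(age = 31)            // name is carried over from alice
  val sameValue = alice == Person("Alice", 30) // generated equals compares the fields
  val text = alice.toString                   // "Person(Alice,30)"

  def greet(p: Person): String = p match {
    case Person(name, _) => "Hello, " + name  // fields extracted by pattern matching
  }
}
```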
Functions and Methods
Scala methods are always introduced with the def keyword. Unlike Java, methods can contain nested methods - handy because the inner method has the formal parameters of the outer method in scope.
Because Scala methods can return tuples (i.e. several results at once), it is easier to write pure functions that don’t have side effects - and this is encouraged. Conversely, no-result methods return Unit instead of void (a keyword not available in Scala) and normally have side effects, making them procedures rather than functions: this useful distinction is present in many languages but not in the C/C++/Java family.
Parameter lists can take a variable number of arguments. Only the syntax differs - so the Java params (String... x) becomes (x: String*). Where Java carries the variable-length list in an array, Scala presents it to the method body as a sequence (Seq[String]).
Functions can be assigned and passed as parameters to other functions, which is ideal for visitor patterns and other such usage. It avoids a lot of the heavy syntax in Java of declaring interfaces or anonymous classes with one method, merely to pass that as a function. More than one parameter list is possible: this is useful when passing functions as objects that have groups of the parameters known at different times in the call graph.
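The features in this section can be sketched together as follows; all the method names are invented for the example:

```scala
object FunctionDemo {
  // Nested method: loop can see the outer parameter n.
  def factorial(n: Int): BigInt = {
    def loop(i: Int, acc: BigInt): BigInt =
      if (i > n) acc else loop(i + 1, acc * i)
    loop(1, BigInt(1))
  }

  // Variable-length parameter list.
  def joinAll(parts: String*): String = parts.mkString("-")

  // A function held in a value and passed as an argument.
  val double: Int => Int = _ * 2
  def applyTwice(f: Int => Int, x: Int): Int = f(f(x))

  // Two parameter lists: the first can be supplied before the second is known.
  def scale(factor: Int)(x: Int): Int = factor * x
  val triple: Int => Int = scale(3)
}
```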
Inheritance
Scala shares Java’s single-inheritance paradigm for classes. Scala adds traits as a way of providing ‘mix-in’ behaviour.
Unlike Java, there is no explicit interface in Scala. Traits provide interfaces and yet they can optionally have method bodies and/or fields, so they look rather more like Java’s abstract classes than its interfaces. If a trait’s methods are all abstract and it has no fields, such a trait is exactly the same as a Java interface. Otherwise, its methods and fields are mixed into each class that uses the trait. Any class can mix in as many traits as it wants. So Scala provides both the clarity of single inheritance and the code reduction possible with multiple inheritance. The order in which traits are mixed into a class is significant: a clear rule defines the precedence that eliminates any ambiguity.
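The effect of mix-in order can be seen in this sketch, where Greeter, Loud and Polite are hypothetical traits. In a chain of super calls, the trait mixed in last is applied last:

```scala
object TraitDemo {
  trait Greeter {
    def name: String                    // abstract: no body
    def greet: String = s"Hello, $name" // concrete: mixed into implementing classes
  }
  trait Loud extends Greeter {
    override def greet: String = super.greet.toUpperCase
  }
  trait Polite extends Greeter {
    override def greet: String = super.greet + ", pleased to meet you"
  }

  // Same traits, different order, different result.
  class LoudThenPolite(val name: String) extends Greeter with Loud with Polite
  class PoliteThenLoud(val name: String) extends Greeter with Polite with Loud
}
```

In LoudThenPolite the text is upper-cased before the polite suffix is added; in PoliteThenLoud the suffix is added first and then the whole string is upper-cased.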
Scala’s ancestor of all classes is Any, which includes all objects and values (i.e. Java primitives). AnyRef, a subclass of Any, is equivalent to Java’s Object.
Whereas Java has the optional @Override annotation for overridden members, Scala requires the use of its override modifier whenever a member overrides a concrete member. This is a Good Thing because it allows the compiler to be more thorough. Both languages have a final modifier that shares the same semantics. In Scala, abstract methods simply have no body; they don’t need an abstract modifier. However, any class containing such methods is abstract and must be marked as such. The effect of these differences is for Scala to make the relationships between base and derived classes clearer than they are in Java.
Scala allows fields to be abstract, as well as methods, whereas Java only allows methods to be abstract. Furthermore, the type of an abstract field can itself be abstract, allowing base classes to be highly general. Indeed, this provides a useful alternative to using generic classes (which are called parameterized classes in Scala).
No Ternaries
Java’s ternary operator, e.g. int x = expression ? a : b, is replaced with if, which returns a value, e.g. val x = if (expression) a else b. This fits nicely into the functional style, in which the normal behaviour of functions is to compute and return values. Naturally, the if construct is itself an expression that yields a value.
Switch Becomes Match With Patterns
Java’s switch is exactly like that in C from 1972: a jump table designed for compilers not humans. Terms must be compile-time constants such as integer literals. Each case falls through to the next unless a break statement is present.
Scala sweeps this aside with the simpler, more useful match. Like if, match is functional because it returns the selected value. Each case is an expression that either uses simple constants (including strings), or a range of powerful pattern-matching declarations such as regular expressions. No fall-through happens so no break is needed. The default case is represented by the standard pattern _ (just an underscore, the symbol used in Scala in imports etc for matching everything).
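A short sketch of match covering constants, a typed pattern with a guard, a tuple pattern, a regular expression and the default case. The method names and the date format are assumptions for illustration:

```scala
object MatchDemo {
  def describe(x: Any): String = x match {
    case 0               => "zero"
    case "hello"         => "a greeting"
    case n: Int if n < 0 => "a negative number"
    case (a, b)          => s"a pair of $a and $b"
    case _               => "something else"
  }

  // A regular expression used as a pattern: each group binds to a name.
  val Date = """(\d{4})-(\d{2})-(\d{2})""".r

  def yearOf(s: String): Option[String] = s match {
    case Date(year, _, _) => Some(year)
    case _                => None
  }
}
```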
Easy Enumerations
Scala provides enumerations easily but doesn’t need language support for them because its type system is more capable than Java’s; so there is no enum keyword. Instead, objects are defined that extend scala.Enumeration and include the required enumerations as vals.
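An enumeration in this style might be sketched as below; Colour is an invented example:

```scala
object EnumDemo {
  object Colour extends Enumeration {
    val Red, Green, Blue = Value // each val receives the next id, starting at 0
  }

  val names = Colour.values.toList.map(_.toString)
  val green = Colour.withName("Green")
}
```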
No Break, Continue or Return
Java’s break, continue and return are not needed in Scala. Apart from break in switch statements, neither break nor continue is strictly necessary in Java either; in Scala they are replaced using if or boolean flags. This is a more functional style of programming which you should find easy once you get the knack. It’s a Good Thing: after all, the goto statement is disallowed, being widely recognised as bad practice. But break, continue and return are merely subtle forms of goto.
Functions return values by ending with an expression that provides the result. Scala does have a return keyword, but idiomatic code does not use it; each function then has a single exit point, which encourages a simpler style of programming than when there are many return statements.
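Besides if and flags, the standard collection methods are one common way to do without break and continue. The sketch below uses invented method names and is only one possible approach:

```scala
object NoBreakDemo {
  // Java would loop and break at the first match; find stops there by itself.
  def firstNegative(xs: List[Int]): Option[Int] = xs.find(_ < 0)

  // Java would continue past unwanted items; filter leaves them out.
  def sumOfEvens(xs: List[Int]): Int = xs.filter(_ % 2 == 0).sum

  // No return keyword: the value of the if expression is the result.
  def sign(n: Int): String =
    if (n < 0) "negative" else if (n == 0) "zero" else "positive"
}
```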
Operator Overloading
Broadly speaking, Scala function names can contain arbitrary symbols, albeit with a few restrictions. Along with the reduced syntax (see above), clean operator overloading is easily achieved.
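For example, a hypothetical Vec class can define +, * and unary minus as ordinary methods:

```scala
object OperatorDemo {
  case class Vec(x: Int, y: Int) {
    def +(other: Vec): Vec = Vec(x + other.x, y + other.y)
    def *(k: Int): Vec = Vec(x * k, y * k)
    def unary_- : Vec = Vec(-x, -y) // prefix minus, as in -v
  }

  val sum = Vec(1, 2) + Vec(3, 4) // the same as Vec(1, 2).+(Vec(3, 4))
  val scaled = Vec(1, 2) * 3
  val negated = -Vec(1, 2)
}
```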
Explicit Overriding
In Java, overriding happens when a non-private method in a superclass is replaced by a method with the same signature in a subclass. Scala allows similar overriding, and extends it to fields, but differs because it has an override keyword and requires that it be used when intended (compiler errors result if it is used when unintended or not used when intended). This makes your code clearer and more robust.
Implicit Conversions
Scala allows conversion functions to be declared that take values of one type and return values of a different type. An implicit keyword is inserted before the function def so that the compiler knows this is a function it can apply automatically when an expression’s type does not match the type that is required. This clever feature makes it easy to write wrapper classes that add methods and then have these wrappers used automatically. This gives the effect of dynamic class extension, but it’s done in a way that is type-safe and handled by the compiler.
For example, if you have an object of type Foo, and your function expects an object of type Bar, then if you have an implicit function which does a Foo to Bar conversion, it gets inserted by the compiler whenever a Bar is expected but a Foo is actually provided instead.
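The Foo to Bar case can be sketched as follows. The class bodies, fooToBar and the RichFoo wrapper are assumptions made for the example; the import merely acknowledges use of the feature so that the compiler does not warn about it:

```scala
import scala.language.implicitConversions

object ImplicitDemo {
  class Foo(val n: Int)
  class Bar(val text: String)

  // The conversion the compiler may insert wherever a Bar is required.
  implicit def fooToBar(foo: Foo): Bar = new Bar("foo#" + foo.n)

  def show(bar: Bar): String = bar.text

  val shown = show(new Foo(7)) // compiled as show(fooToBar(new Foo(7)))

  // The same mechanism appears to add a method to Foo through a wrapper.
  class RichFoo(foo: Foo) { def doubled: Int = foo.n * 2 }
  implicit def fooToRich(foo: Foo): RichFoo = new RichFoo(foo)

  val twice = new Foo(7).doubled // compiled as fooToRich(new Foo(7)).doubled
}
```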
Equality
Java and Scala implement equality testing differently. They both have a method called equals and the operator == (which is implemented as a method in Scala). However, == is different between the two languages: Scala follows Ruby’s practice of value testing instead of Java’s practice of reference testing. In Scala, == simply invokes equals (after checking for null). The != operator is simply the negation of ==.
Scala provides new operators eq and ne for testing references. So eq is the same as Java’s == on references, ne is the same as Java’s !=, and Scala’s == is the same as Java’s equals.
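A brief sketch of the difference, using two distinct String instances with equal contents:

```scala
object EqualityDemo {
  val a = new String("scala")
  val b = new String("scala")

  val sameValue = a == b      // true: == delegates to equals
  val sameReference = a eq b  // false: two separate instances
  val otherReference = a ne b // true: the negation of eq
}
```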
Main Entry
Just as in Java, Scala’s main entry point is a method called main taking an array of strings, i.e. def main(args: Array[String]).
It has to be in an object, not a class, and this usually cannot be a companion object.
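A minimal entry point might look like this; Hello and greeting are illustrative names:

```scala
object Hello {
  def greeting(args: Array[String]): String = {
    val who = if (args.nonEmpty) args(0) else "world"
    "Hello, " + who
  }

  // The entry point: a main method defined in an object, not in a class.
  def main(args: Array[String]): Unit = println(greeting(args))
}
```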
Checked Exceptions
Scala does not have checked exceptions, unlike Java. Even Java’s checked exceptions are treated as unchecked by Scala. There is also no throws clause on method declarations. However, there is a @throws annotation that is useful for Java interoperability.
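The sketch below shows the @throws annotation on a made-up risky method. Note that safe is free to catch the IOException, but nothing would force it to:

```scala
import java.io.IOException

object ExceptionDemo {
  // The annotation records the exception in the bytecode for Java callers.
  @throws(classOf[IOException])
  def risky(fail: Boolean): String =
    if (fail) throw new IOException("disk unavailable") else "ok"

  // Scala callers may handle the exception, but need not declare or catch it.
  def safe(fail: Boolean): String =
    try {
      risky(fail)
    } catch {
      case e: IOException => "recovered: " + e.getMessage
    }
}
```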
Built-in XML
Java has library support for XML. Unfortunately, there are many such libraries and none is comprehensive, although they have been very widely used and are clearly successful. There’s W3C DOM, JDOM, XmlBeans, XStream, XBI, JSP and JSTL to name but a few.
By contrast, XML is built into Scala. Rather like PHP and its peers, Scala allows the source code to flip between Scala’s syntax and templated XML. The standard library provides an API for searching for nodes à la XPath. Serialization and deserialization are easily achieved.
In Scala, anywhere a value can appear, XML text can appear, surrounded of course by the enclosing tags, e.g. val fruit = <item>apple</item> defines a value fruit which is a reference to a scala.xml.Node instance containing the item element and a text node within it. Scala expressions can appear within XML literals using curly braces, e.g. <list>{fruit}</list> will replace the {fruit} with the <item>apple</item> we declared earlier.
.NET As Well
This article has focussed on Scala and Java. We’ve talked about Scala’s clean integration with the Java platform, which allows Scala and Java programs to be integrated seamlessly on a single JVM. Importantly, the same is also true for Scala with C# programs on the .NET platform. Scala therefore provides an exciting way to write portable cross-platform code, reducing the development effort needed when this is a requirement or providing future-proofing when this requirement isn’t yet evident. Either way, it’s a major benefit of using Scala.
Pedigree
As a final note, people in the team behind Scala had earlier been heavily involved in Java’s development, especially in compiler writing. You may dislike some of the decisions described above, but they all have a purpose and it’s worth finding out what that is. Scala has the feel of a well researched and well thought through system, comprising the language, quality APIs and good tools. It’s a good time to get on board.
Further reading
- Learning Scala
- Programming Scala online book
- Scala for Java Refugees tutorial
- Scala (Wikipedia)