
Scala need-to-knows

Posted by – July 13, 2012

You know Java. You’ve heard Scala is some awesome lean-and-mean, pure-OO, functional version of the language, but are daunted by its perceived complexity and all those underscores. The good news is that it’s really not that hard, at least, not when you understand these morsels of knowledge:

use the REPL

Once you’ve added the SCALA_HOME directory to your path, you can load the REPL by typing “scala” at the command line. Use it!

the apply method

foo(x)
// is short for
foo.apply(x)

This is one bit of syntactic sugar that you just have to know.

It’s useful because now the syntax for applying a function is the same as the familiar syntax for calling a method, and the distinction between methods and functions begins to melt away.
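Here is a small sketch of the sugar in action. The `Greeter` object is made up purely for illustration; the point is that anything with an `apply` method can be called as if it were a function, and that indexing into a collection is the same trick:

```scala
// Hypothetical object, defined only to illustrate apply
object Greeter {
  def apply(name: String): String = "Hello, " + name
}

val a = Greeter("Bob")        // sugar for the line below
val b = Greeter.apply("Bob")  // a and b are both "Hello, Bob"

// Collections use the same sugar: indexing is just apply
val xs = List(10, 20, 30)
val second = xs(1)            // same as xs.apply(1), i.e. 20
```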

functions are objects

Functions in Scala are objects just like Strings or Ints; their type depends on their “arity”, in other words how many arguments they take. For example, a function that takes an Int and returns a String is of type `Function1[Int, String]`, which can be written as `Int => String`. A function that takes an Int and a String and returns a Char is of type `Function2[Int, String, Char]`, aka `(Int, String) => Char`.

There’s often confusion between what is a “function”, “method”, “function object” or “function literal”. The authors of Programming in Scala use “function” and “method” interchangeably; however, the community consensus is that “method” should be used for `def`s, and “function” reserved for function objects as described above.

Methods can be used as functions: if you write a method name where a Function object is required, the compiler will create a Function object for you.
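A rough sketch of both ideas, using made-up names (`shout`, `length`): a function is a value with an `apply` method, and a method name can be dropped in wherever a function is expected.

```scala
// A method (a def); the name is invented for this example
def shout(s: String): String = s.toUpperCase + "!"

// map expects a String => String, so the compiler wraps the method for us
val loud = List("a", "b").map(shout)        // List("A!", "B!")

// A function object is an ordinary value...
val length: String => Int = s => s.length
// ...whose type is really Function1[String, Int]
val sameThing: Function1[String, Int] = length

val n = length("four")                      // same as length.apply("four"), i.e. 4
```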

expressions everywhere

In functional languages, almost everything is an expression, i.e. it evaluates to something. This is incredibly useful. Important examples are if-expressions, try-catch-finally, and for-expressions:

val x = if (true) 42 else 101  // x is now 42

val y = try io.Source.fromFile("somefile.txt") catch { case _: Exception => sys.error("ZOMG") }

val z = for (i <- 1 to 3) yield i * 2 // z is Vector(2, 4, 6)

singletons / companion objects

In Java, everything must be declared within a class or enum. In Scala, we have classes, but also the `object` keyword, which is used for declaring singleton objects. Use objects for declaring what would have been static methods in Java, static data, enumerations, implicits, factory methods, your main method, anything that you might want to import into another scope, basically.
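As a sketch (the `Circle` class is hypothetical), here is a class with a companion object of the same name holding the things that would have been static in Java:

```scala
// Hypothetical class with a companion object of the same name
class Circle(val radius: Double) {
  def area: Double = Circle.Pi * radius * radius
}

object Circle {
  val Pi = 3.14159                                        // "static" data
  def apply(radius: Double): Circle = new Circle(radius)  // factory method
  def unit: Circle = Circle(1.0)                          // "static" method
}

val c = Circle(2.0)   // calls Circle.apply, so no `new` is needed
val u = Circle.unit
```

If you try this in the REPL, enter the class and the object together (for example with :paste) so that they are treated as companions.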

vals and vars

`val` and `var` are used to declare values / variables (in a method), or properties (in a class / object). `val` values are immutable; `var` variables are not. Always use `val` unless you have a good reason not to; mutability is A Bad Thing.

object Foo {
  val size = 42  // constant property, with getter of same name
  var age = 33   // variable property, with getter and setter of same name
  def foo = {
    val n = 100  // final local variable
    var i = 0    // non-final local variable
  }
}

def vs val vs lazy val

Each of these defines a value.

`def` will re-evaluate every time it’s called, and can take parameters, so it’s like a method in Java when defined on an object, but it can also be defined locally within methods.

`val` is eagerly evaluated only once.

`lazy val` is lazily evaluated (i.e. not before it’s actually called) once, and is useful, say, when it makes an expensive calculation that won’t always be used. Also often useful in recursion. Think of it as a no-arg method that caches itself.
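One way to see the difference is to count evaluations. This is only a sketch; the mutable `evaluations` counter and the other names are there purely so we can observe what runs when:

```scala
var evaluations = 0   // mutable counter, used only to observe evaluation

def asDef: Int = { evaluations += 1; 42 }        // body runs on every call
val asVal: Int = { evaluations += 1; 42 }        // body runs once, right here
lazy val asLazy: Int = { evaluations += 1; 42 }  // body runs once, on first use

val afterDefinitions = evaluations   // 1: only the val has run so far
val twoCalls = asDef + asDef         // evaluates the def twice
val afterDefCalls = evaluations      // 3
val twoUses = asLazy + asLazy        // evaluates the lazy val only once
val afterLazyUses = evaluations      // 4
```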

the meaning of _

_ is Scala’s wildcard character, similar to * in Java. It has a number of meanings, detailed here, most of which are quite advanced and you don’t need to know them… just yet anyway. :)
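For a taste, here are three of the meanings you are likely to meet first. This is a sketch rather than a complete list, and `describe` is a made-up method:

```scala
import scala.math._                      // import everything, like * in a Java import

val root = sqrt(16.0)                    // sqrt comes from the wildcard import

val doubled = List(1, 2, 3).map(_ * 2)   // placeholder for the function's argument

def describe(x: Any): String = x match {
  case 0 => "zero"
  case _ => "something else"             // pattern that matches anything
}
```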

Predef and pre-imported stuff

By default the following are imported:

java.lang._
scala._
scala.Predef._

So if you are wondering how your code already knows what a “Seq” is, or where “->” or “println” comes from, it’ll be in one of these, which you can browse in Scaladoc.


for-loops

Scala doesn’t have for-loops corresponding to

for (int i = 0; i < 10; i++) { ... }

But this is to be celebrated, because a) they’re a pain to write, and b) in Java you should be using an “enhanced-for” (i.e. for each) loop most of the time anyway. Scala goes further and nudges you away from using loops at all, because they’re inherently imperative.

// Using while-loop
var i = 0
while (i < 10) {
  ...
  i += 1
}
// Using Scala for-loop
for (i <- 0 until 10) {
  ...
}
// Using collection functions
0 until 10 foreach { i => ... }

The while-loop might be faster than the for-loop or foreach, as of Scala 2.9, but optimizations are apparently forthcoming. But if you use functional patterns, you shouldn’t need these often, except maybe for I/O.

functional style

All you need to do to write in a functional style is to always use immutable variables and data structures. That’s it.

Don’t mutate an object by changing a property: return a new copy, just like you did with String in Java. Paradoxically, this is sometimes more efficient than updating an existing structure, since immutability means that most of the existing structure can be re-used.

Instead of looping, perform a map or fold operation on a collection, or use recursion if need be. Now you’re 90% of the way there!
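A minimal sketch of these ideas, with a made-up `Person` case class: `copy` gives you a modified copy instead of a mutation, prepending to a List re-uses the existing list, and map / fold replace the loop.

```scala
// Hypothetical immutable class; copy is generated for every case class
case class Person(name: String, age: Int)

val bob = Person("Bob", 33)
val olderBob = bob.copy(age = 34)     // a new object; bob itself is unchanged

val ages = List(33, 34, 40)
val longer = 0 :: ages                // new list whose tail IS the old list
val shared = longer.tail eq ages      // true: nothing was copied

// Instead of looping with a mutable accumulator
val nextYear = ages.map(_ + 1)        // List(34, 35, 41)
val total = ages.foldLeft(0)(_ + _)   // 107
```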

operator-style method calls

// "Infix" notation
obj.method(argument)
// can also be written
obj method argument

This also works if your method doesn’t take an argument (“postfix notation”), but is discouraged since it can interfere with semicolon inference:

obj.method
// can also be written
obj method

Why use these? It can make your code read more clearly, particularly when your methods have operator character names. For instance, 1 + 2 is clearer than its equivalent (1).+(2).

right-associative methods

Methods that end with a : character are right-associative when you use them without the dot. e.g.

1 :: List(1,2,3)
// is the same as
List(1,2,3).::(1)

Here, :: is a method on the List, with 1 as the argument.
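The same rule explains the `+:` and `:+` methods on sequences. A short sketch: the colon always sits on the side of the collection.

```scala
val xs = List(2, 3)

val a = 1 :: xs              // really a method call on xs
val b = xs.::(1)             // List(1, 2, 3), the same as a

// +: ends in a colon, so the collection goes on the right
val c = 0 +: Vector(1, 2)    // Vector(0, 1, 2)
// :+ does not, so it is an ordinary left-associative call
val d = Vector(1, 2) :+ 3    // Vector(1, 2, 3)
```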

optional parentheses on method calls

If a method takes no arguments, then the empty parentheses can be omitted. This reduces clutter.

There is a convention whereby parentheses are included when a method has side-effects (i.e. the method does something, rather than return a value), for instance the next() method on an iterator.

val list = List(1,2,3)
val a = list.toString   //good
val b = list.toString() //bad
val it = list.iterator  //good
val c = it.next         //bad
val d = it.next()       //good: has side effects (changing the state of the iterator)

operator characters

Scala allows you to use symbols for identifiers (i.e. your variable / class names). Some characters are classed as alphanumeric (e.g. a1£$), some as operators (<+&*). Some you can’t use at all ([`;.). See here for the guide. The catch is, you can’t mix operators and alphanumerics in a given identifier (unless separated by an underscore).

rich types

You’ll see things like “1 to 10”, and after a while you’ll figure out that “to” is a method being called on “1” with a “10” argument. So you check out the Scaladocs, and find that “Int” has no “to” method. What’s going on? Well, it tells you in little letters at the top that there is an implicit conversion to RichInt, so you’ll need to look there instead. Here are some such implicit conversions that are useful to know about:

Int    -> RichInt
Char   -> RichChar
// etc for all primitives
Array  -> ArrayOps
String -> StringOps

Also, you won’t find `String` in the documentation, except for a type definition that tells you it’s really `java.lang.String`. So you’ll need to keep a link to the Java docs handy if you need to look up its methods (if you’re on Windows, this download is super-useful).
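A few calls that go through the conversions listed above, as a quick sketch; none of these methods are defined on the underlying Java types themselves:

```scala
val r = 1 to 3                         // RichInt.to builds a Range
val m = 3 max 7                        // RichInt.max, i.e. 7
val up = 'a'.toUpper                   // RichChar.toUpper, i.e. 'A'
val rev = "abc".reverse                // StringOps.reverse, i.e. "cba"
val s = Array(3, 1, 2).sorted.toList   // ArrayOps.sorted, i.e. List(1, 2, 3)
```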

what are implicits?

Implicit values allow you to take values from the present scope as input into a function without having to specify them explicitly (i.e. type them out) in the function call. There can only be one of any given type in scope at a time (else, you get a compile error telling you of the ambiguity when you try to use it). The compiler looks for implicit values that are a) imported (in the default imports or by you), b) in the class’s companion object, or c) in a bunch of other places (which you can google).

An example is the `sorted` method on a `List`, which takes an implicit `Ordering[A]` argument. Say you want to sort `List(1,2,3)`. Its `sorted` method requires an `Ordering[Int]` object to be in scope.
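To make that concrete, here is a sketch showing `sorted` with and without the argument written out, followed by a method of our own that takes an implicit parameter. `Greeting`, `greet` and `defaultGreeting` are invented for the example:

```scala
val xs = List(3, 1, 2)

val a = xs.sorted                         // the compiler finds an Ordering[Int] for us
val b = xs.sorted(Ordering.Int)           // the same call with the argument written out
val c = xs.sorted(Ordering.Int.reverse)   // or pass a different Ordering explicitly

// A method of our own with an implicit parameter (all names here are made up)
case class Greeting(text: String)
def greet(name: String)(implicit g: Greeting): String = g.text + ", " + name

implicit val defaultGreeting: Greeting = Greeting("Hello")
val hello = greet("Bob")                      // uses defaultGreeting: "Hello, Bob"
val howdy = greet("Bob")(Greeting("Howdy"))   // explicit argument: "Howdy, Bob"
```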

Implicit conversions are applied by the compiler when it finds an object of type A in a context where it requires one of type B, i.e. as an argument, or as an object where the method being called is only available on B. If there is an implicit function A => B in scope, this will automatically be applied. This can be very useful, for simulating extension methods and in DSLs, but generally should be avoided in everyday code since it can make code confusing to read and maintain.
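A minimal sketch of the extension-method trick, with made-up names (`RichWord`, `stringToRichWord`, `shout`). The `scala.language` import is an assumption about your compiler version: it only exists from Scala 2.10 onwards, where it silences a feature warning, so leave it out on 2.9.

```scala
import scala.language.implicitConversions  // Scala 2.10 and later only

// Hypothetical wrapper that adds a method to String
class RichWord(s: String) {
  def shout: String = s.toUpperCase + "!"
}

// The implicit function String => RichWord
implicit def stringToRichWord(s: String): RichWord = new RichWord(s)

val a = "hello".shout                     // String has no shout, so the conversion kicks in
val b = stringToRichWord("hello").shout   // what the compiler actually generates
val w: RichWord = "hi"                    // also applied where a RichWord is required
```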

methods of unit type

`Unit` is Scala’s equivalent of the `void` return type. You generally won’t use these when programming in a functional style, except perhaps in a `main` method. You can write them in two ways:

def doSomething: Unit = println("this method returns nothing")
// or, using "procedure syntax" (deprecated in Scala 2.13 and dropped in Scala 3)
def doSomething {
  println("this method returns nothing")
}

function syntactic sugar

This is perhaps one of the most confusing things for the Scala novice, coupled with the fact that there are multiple ways to declare functions.

Let’s take a simple example of a function that adds 1 to an Int.

def addOne(x: Int) = x + 1

But wait! Didn’t we just say that this is a method, not a function? Yes, but the compiler will automatically convert this method to a function object if we use its name in such a context (e.g. as an argument). We can also manually turn this into a function object using an underscore, which represents the missing arguments:

val f = addOne _ 

Now let’s try defining the same thing the long way:

val addOne = new Function1[Int, Int] { def apply(x: Int) = x + 1 }

That’s quite verbose, so here’s where the `=>` keyword comes in; the following is exactly equivalent to the new object declaration above:

val addOne = (x: Int) => x + 1

But we can do better! The repeated `x` is unnecessary; we can use an underscore:

val addOne = (_: Int) + 1

If the compiler can infer the type, we don’t need the type annotation. This could be in a situation where we’re entering an anonymous function as an argument, or if we’re simply supplying the information on the left hand side:

val addOne: Int => Int = x => x + 1
// or
val addOne: Int => Int = _ + 1

Which of these to use? Generally I use `def`, since the syntax is more familiar, and they can be type-parameterized (think generics) which makes them more flexible.
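To convince yourself that the forms above really are interchangeable, and to see what type-parameterizing a `def` looks like, here is a sketch. The names `addOneM`, `f1` to `f4` and `twice` are invented for the example:

```scala
def addOneM(x: Int) = x + 1                  // the method version
val f1 = addOneM _                           // method turned into a function object
val f2 = new Function1[Int, Int] { def apply(x: Int) = x + 1 }
val f3 = (x: Int) => x + 1
val f4: Int => Int = _ + 1

// All four behave identically
val results = List(f1, f2, f3, f4).map(f => f(41))   // List(42, 42, 42, 42)

// A def can take a type parameter (hypothetical helper)
def twice[A](f: A => A)(x: A): A = f(f(x))

val three = twice[Int](addOneM)(1)           // 3
val excited = twice[String](_ + "!")("hi")   // "hi!!"
```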

“as varargs” : _*

When using a method that takes varargs, i.e. a sequence of arguments of unspecified length, but you have a collection object that you want to enter, you can put : _* after it, to “type” it as individual elements:

Vector(1,2,3) // Vector's apply method takes varargs

val xs = List(1,2,3)

Vector(xs) // Vector(List(1,2,3)) - a Vector with a single List element: NOT what we wanted!

Vector(xs: _*) // = Vector(1,2,3)
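The same works for varargs methods you write yourself. A small sketch, where `total` is a made-up method:

```scala
// Hypothetical varargs method: inside the body, nums is a Seq[Int]
def total(nums: Int*): Int = nums.sum

val a = total(1, 2, 3)    // 6
val xs = List(1, 2, 3)
val b = total(xs: _*)     // 6: the List is passed as the individual arguments
```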


imports

Imports are much more flexible than in Java. You can have an import anywhere, not just at the top of a file, and it’s confined to that scope. You can even import the methods on a variable!

val xs = List(1,2,3)
import xs._
reverse // = List(3,2,1)

scaladocs at the ready

First, find the docs on your hard drive (somewhere along the lines of “scala2.9/doc/scala-devel-docs/api/index.html”). Bookmark this page and leave it open in your browser.

To find a class, object, trait or package, type its name into the box in the top-left.

Often classes / traits have a companion object containing utility methods: to switch, click on the circle next to the name in the main window that looks like it’s peeling off, or on the little blue “o” next to the class name in the search column.

All methods from supertypes are shown by default, which is generally useful, but experiment with clicking the “ordering by inheritance” button.

To search for a method by name where you don’t know the class, click on the appropriate letter of the alphabet under the search box.

Big improvements to Scaladoc are apparently coming in Scala 2.10!