Scala Operator Overloading Assignment

So, I've been teaching myself Scala recently, and it's a very interesting language.

One of the nice things I like about it is its support for creating DSLs, or domain-specific languages. A domain-specific language - or at least my understanding of it - is a language that is written specifically for one problem domain. One example would be SQL: great for querying relational databases, useless for creating first-person shooters.

Of course Scala itself is not a DSL, it's a general purpose language. However it does offer several features that allow you to simulate a DSL, in particular operator overloading, and implicit conversions. In this post I'm going to focus on the first of these...

Operator Overloading

So what's operator overloading?

Well, operators are typically things such as "+", "-", and "*". You know, those things you use to do arithmetic on numbers, or occasionally for manipulating Strings. Well, operator overloading - just like method overloading - allows you to redefine their behaviour for a particular type, and give them meaning for your own custom classes.

Hang on a minute! I'm sure someone once told me operator overloading was evil?

Indeed, this is quite a controversial topic. It's considered far too open for abuse by some, and was so maligned in C++ that the creators of Java deliberately disallowed it (excepting "+" for String concatenation).

I'm of a slightly different opinion: used responsibly, it can be very useful. For example, lots of different objects support a concept of addition, so why not just use an addition operator?

Let's say you were developing a complex number class, and you want to support addition. Wouldn't it be nicer to write...

    val c = a + b

...rather than...

    val c = a.add(b)

The first example is much more natural don't you think?

So Scala allows you to overload operators then?

Well, not really. In fact, technically not at all.

So all this is just a tease? This is the most stupid blog post I've ever read. Scala's rubbish. I'm going back to Algol 68.

Wait a second, I've not finished. You see Scala doesn't support operator overloading, because it doesn't have operators!

Scala doesn't have operators? You've gone mad, I write stuff like "sum = 2 + 3" all the time, and what about all those funny list operations? "::", and "/:". They look like operators to me!

Well they're not. The thing is, Scala has a rather relaxed attitude to what you can name a method.

When you write...

    2 + 3

...you're actually calling a method called "+" on a type "Int" with a value of 2. You could even rewrite it as...

    2.+(3)

...if you really really wanted to.
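To make that concrete, here's a small sketch showing the infix form and the explicit method-call form side by side. The object name InfixDemo and its fields are made up for illustration. I've wrapped the 2 in parentheses in the explicit call, as a precaution against older compilers reading "2." as a floating-point literal.

```scala
object InfixDemo {
  // Ordinary infix notation.
  val infix: Int = 2 + 3
  // The same call spelled out as a method invocation on the Int value 2.
  val explicit: Int = (2).+(3)

  // "::" ends in a colon, so it is invoked on its right-hand operand.
  val cons: List[Int] = 1 :: 2 :: Nil
  val consExplicit: List[Int] = Nil.::(2).::(1)

  def main(args: Array[String]): Unit = {
    println(infix == explicit)
    println(cons == consExplicit)
  }
}
```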

Aha, I got it. So how do you go about overloading an operator then?

Simple, it's exactly the same as writing a normal method. Here's an example.

    class Complex(val real : Double, val imag : Double) {
      // "+" and "-" are ordinary methods that happen to have symbolic names
      def +(that: Complex) =
        new Complex(this.real + that.real, this.imag + that.imag)
      def -(that: Complex) =
        new Complex(this.real - that.real, this.imag - that.imag)
      override def toString = s"$real + ${imag}i"
    }

    object Complex {
      def main(args : Array[String]) : Unit = {
        val a = new Complex(4.0,5.0)
        val b = new Complex(2.0,3.0)
        println(a)      // 4.0 + 5.0i
        println(a + b)  // 6.0 + 8.0i
        println(a - b)  // 2.0 + 2.0i
      }
    }

Ok that's nice, what if I wanted a "not" operator though, i.e. something like a "!"?

That's a unary prefix operator, and yes, Scala can support these, although in a more limited fashion than an infix operator like "+".

Only four operators can be supported in this fashion: "+", "-", "!", and "~". You simply need to call your methods "unary_!" or "unary_~", etc. Here's how you might add a "~" to our Complex class to calculate the magnitude of a complex number:

    class Complex(val real : Double, val imag : Double) {
      // ...
      def unary_~ = Math.sqrt(real * real + imag * imag)
    }

    object Complex {
      def main(args : Array[String]) : Unit = {
        val b = new Complex(2.0,3.0)
        println(~b) // roughly 3.60555
      }
    }

So that's all pretty simple, but please use responsibly. Don't create methods called "+" unless your class really does something that could be interpreted as addition. And never ever redefine the binary shift left operator "<<" as some sort of substitute for println. It's not clever and you'll make the Scala gods angry.

Hope you found that useful. Next up I'll cover implicit conversions, another nice feature of Scala that really allows you to write your code in a more natural way.

Expressions

Expressions are composed of operators and operands. Expression forms are discussed subsequently in decreasing order of precedence.

Expression Typing

The typing of expressions is often relative to some expected type (which might be undefined). When we write "expression $e$ is expected to conform to type $T$", we mean:

  1. the expected type of $e$ is $T$, and
  2. the type of expression $e$ must conform to $T$.

The following skolemization rule is applied universally for every expression: If the type of an expression would be an existential type $T$, then the type of the expression is assumed instead to be a skolemization of $T$.

Skolemization is reversed by type packing. Assume an expression $e$ of type $T$ and let $t_1[\mathit{tps}_1] >: L_1 <: U_1 , \ldots , t_n[\mathit{tps}_n] >: L_n <: U_n$ be all the type variables created by skolemization of some part of $e$ which are free in $T$. Then the packed type of $e$ is `$T$ forSome { type $t_1[\mathit{tps}_1] >: L_1 <: U_1$; $\ldots$; type $t_n[\mathit{tps}_n] >: L_n <: U_n$ }`.

Literals

Typing of literals is as described here; their evaluation is immediate.

The Null Value

The `null` value is of type `scala.Null`, and is thus compatible with every reference type. It denotes a reference value which refers to a special "`null`" object. This object implements methods in class `scala.AnyRef` as follows:

  • `eq($x\,$)` and `==($x\,$)` return `true` iff the argument $x$ is also the "null" object.
  • `ne($x\,$)` and `!=($x\,$)` return `true` iff the argument $x$ is not also the "null" object.
  • `isInstanceOf[$T\,$]` always returns `false`.
  • `asInstanceOf[$T\,$]` returns the default value of type $T$.
  • `##` returns `0`.

A reference to any other member of the "null" object causes a `NullPointerException` to be thrown.
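The following sketch exercises the methods listed above on a `null` reference. The object name `NullDemo` and its field names are illustrative assumptions, not part of the specification.

```scala
object NullDemo {
  val s: String = null

  val sameObject: Boolean = s eq null             // reference equality with null
  val equalsNull: Boolean = s == null             // == is safe to call on null
  val notSame: Boolean = s ne null
  val isString: Boolean = s.isInstanceOf[String]  // null is an instance of no class
  val defaultInt: Int = null.asInstanceOf[Int]    // default value of type Int
  val hash: Int = s.##

  // Any other member access on null throws a NullPointerException.
  val lengthThrows: Boolean =
    try { s.length; false }
    catch { case _: NullPointerException => true }

  def main(args: Array[String]): Unit =
    println(List(sameObject, equalsNull, notSame, isString, defaultInt, hash, lengthThrows))
}
```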

Designators

A designator refers to a named term. It can be a simple name or a selection.

A simple name $x$ refers to a value as specified here. If $x$ is bound by a definition or declaration in an enclosing class or object $C$, it is taken to be equivalent to the selection `$C$.this.$x$` where $C$ is taken to refer to the class containing $x$ even if the type name $C$ is shadowed at the occurrence of $x$.

If $r$ is a stable identifier of type $T$, the selection $r.x$ refers statically to a term member $m$ of $r$ that is identified in $T$ by the name $x$.

For other expressions $e$, $e.x$ is typed as if it was `{ val $y$ = $e$; $y$.$x$ }`, for some fresh name $y$.

The expected type of a designator's prefix is always undefined. The type of a designator is the type $T$ of the entity it refers to, with the following exception: The type of a path $p$ which occurs in a context where a stable type is required is the singleton type `$p$.type`.

The contexts where a stable type is required are those that satisfy one of the following conditions:

  1. The path $p$ occurs as the prefix of a selection and it does not designate a constant, or
  2. The expected type $\mathit{pt}$ is a stable type, or
  3. The expected type $\mathit{pt}$ is an abstract type with a stable type as lower bound, and the type $T$ of the entity referred to by $p$ does not conform to $\mathit{pt}$, or
  4. The path $p$ designates a module.

The selection $e.x$ is evaluated by first evaluating the qualifier expression $e$, which yields an object $r$, say. The selection's result is then the member of $r$ that is either defined by $m$ or defined by a definition overriding $m$.

This and Super

The expression `this` can appear in the statement part of a template or compound type. It stands for the object being defined by the innermost template or compound type enclosing the reference. If this is a compound type, the type of `this` is that compound type. If it is a template of a class or object definition with simple name $C$, the type of `this` is the same as the type of `$C$.this`.

The expression `$C$.this` is legal in the statement part of an enclosing class or object definition with simple name $C$. It stands for the object being defined by the innermost such definition. If the expression's expected type is a stable type, or `$C$.this` occurs as the prefix of a selection, its type is `$C$.this.type`, otherwise it is the self type of class $C$.

A reference `super.$m$` refers statically to a method or type $m$ in the least proper supertype of the innermost template containing the reference. It evaluates to the member $m'$ in the actual supertype of that template which is equal to $m$ or which overrides $m$. The statically referenced member $m$ must be a type or a method.

If it is a method, it must be concrete, or the template containing the reference must have a member $m'$ which overrides $m$ and which is labeled `abstract override`.

A reference `$C$.super.$m$` refers statically to a method or type $m$ in the least proper supertype of the innermost enclosing class or object definition named $C$ which encloses the reference. It evaluates to the member $m'$ in the actual supertype of that class or object which is equal to $m$ or which overrides $m$. The statically referenced member $m$ must be a type or a method. If the statically referenced member $m$ is a method, it must be concrete, or the innermost enclosing class or object definition named $C$ must have a member $m'$ which overrides $m$ and which is labeled `abstract override`.

The `super` prefix may be followed by a trait qualifier `[$T\,$]`, as in `$C$.super[$T\,$].$x$`. This is called a static super reference. In this case, the reference is to the type or method of $x$ in the parent trait of $C$ whose simple name is $T$. That member must be uniquely defined. If it is a method, it must be concrete.

Example

Consider the following class definitions

The linearization of class `C` is `{C, B, Root}` and the linearization of class `D` is `{D, B, A, Root}`. Then we have:

Note that the `superB` function returns different results depending on whether `B` is mixed in with class `Root` or `A`.
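A class hierarchy of the kind this example describes might look like the sketch below. The names `Root`, `A`, `B`, `C`, `D` and the `superX` helper methods are assumptions made for illustration, as is the wrapping object `SuperDemo`. The point is that the target of `super.x` inside trait `B` depends on where `B` sits in the linearization of the instantiated class.

```scala
object SuperDemo {
  class Root { def x = "Root" }
  class A extends Root { override def x = "A"; def superA = super.x }
  trait B extends Root { override def x = "B"; def superB = super.x }

  // Linearization: C, B, Root
  class C extends Root with B {
    override def x = "C"; def superC = super.x
  }
  // Linearization: D, B, A, Root
  class D extends A with B {
    override def x = "D"; def superD = super.x
  }

  def main(args: Array[String]): Unit = {
    println((new C).superB) // B's supertype in C's linearization is Root
    println((new D).superB) // B's supertype in D's linearization is A
  }
}
```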

Function Applications

An application `$f(e_1 , \ldots , e_m)$` applies the function $f$ to the argument expressions $e_1 , \ldots , e_m$. If $f$ has a method type `($p_1$:$T_1 , \ldots , p_n$:$T_n$)$U$`, each argument expression $e_i$ is typed with the corresponding parameter type $T_i$ as expected type. Let $S_i$ be the type of argument $e_i$ $(i = 1 , \ldots , m)$. If $f$ is a polymorphic method, local type inference is used to determine type arguments for $f$. If $f$ has some value type, the application is taken to be equivalent to `$f$.apply($e_1 , \ldots , e_m$)`, i.e. the application of an `apply` method defined by $f$.

The function $f$ must be applicable to its arguments $e_1 , \ldots , e_n$ of types $S_1 , \ldots , S_n$.

If $f$ has a method type $(p_1:T_1 , \ldots , p_n:T_n)U$ we say that an argument expression $e_i$ is a named argument if it has the form $x_i=e'_i$ and $x_i$ is one of the parameter names $p_1 , \ldots , p_n$. The function $f$ is applicable if all of the following conditions hold:

  • For every named argument $x_i=e_i'$ the type $S_i$ is compatible with the parameter type $T_j$ whose name $p_j$ matches $x_i$.
  • For every positional argument $e_i$ the type $S_i$ is compatible with $T_i$.
  • If the expected type is defined, the result type $U$ is compatible to it.

If $f$ is a polymorphic method it is applicable if local type inference can determine type arguments so that the instantiated method is applicable. If $f$ has some value type it is applicable if it has a method member named `apply` which is applicable.

Evaluation of `$f$($e_1 , \ldots , e_n$)` usually entails evaluation of $f$ and $e_1 , \ldots , e_n$ in that order. Each argument expression is converted to the type of its corresponding formal parameter. After that, the application is rewritten to the function's right hand side, with actual arguments substituted for formal parameters. The result of evaluating the rewritten right-hand side is finally converted to the function's declared result type, if one is given.

The case of a formal parameter with a parameterless method type `=>$T$` is treated specially. In this case, the corresponding actual argument expression $e$ is not evaluated before the application. Instead, every use of the formal parameter on the right-hand side of the rewrite rule entails a re-evaluation of $e$. In other words, the evaluation order for `=>`-parameters is call-by-name whereas the evaluation order for normal parameters is call-by-value. Furthermore, it is required that $e$'s packed type conforms to the parameter type $T$. The behavior of by-name parameters is preserved if the application is transformed into a block due to named or default arguments. In this case, the local value for that parameter has the form `val $y_i$ = () => $e$` and the argument passed to the function is `$y_i$()`.
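The difference between call-by-value and call-by-name evaluation can be observed with a side-effecting argument. The sketch below is illustrative only; `ByNameDemo`, `twiceByValue`, `twiceByName` and the counting helpers are hypothetical names.

```scala
object ByNameDemo {
  def twiceByValue(x: Int): Int = x + x     // x is evaluated once, before the call
  def twiceByName(x: => Int): Int = x + x   // x is re-evaluated at each use

  // Each helper returns (result, number of times the argument was evaluated).
  def countByValue(): (Int, Int) = {
    var n = 0
    def next(): Int = { n += 1; n }
    val r = twiceByValue(next())
    (r, n)
  }

  def countByName(): (Int, Int) = {
    var n = 0
    def next(): Int = { n += 1; n }
    val r = twiceByName(next())
    (r, n)
  }

  def main(args: Array[String]): Unit = {
    println(countByValue())
    println(countByName())
  }
}
```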

The last argument in an application may be marked as a sequence argument, e.g. `$e$: _*`. Such an argument must correspond to a repeated parameter of type `$S$*` and it must be the only argument matching this parameter (i.e. the number of formal parameters and actual arguments must be the same). Furthermore, the type of $e$ must conform to `scala.Seq[$T$]`, for some type $T$ which conforms to $S$. In this case, the argument list is transformed by replacing the sequence $e$ with its elements. When the application uses named arguments, the vararg parameter has to be specified exactly once.

A function application usually allocates a new frame on the program's run-time stack. However, if a local function or a final method calls itself as its last action, the call is executed using the stack-frame of the caller.

Example

Assume the following function which computes the sum of a variable number of arguments:

Then

both yield `10` as result. On the other hand,

would not typecheck.
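A function of the kind this example describes could be sketched as follows. The wrapping object `VarargsDemo` and its field names are assumptions for illustration.

```scala
object VarargsDemo {
  // A repeated parameter: callers may pass any number of Int arguments.
  def sum(args: Int*): Int = {
    var result = 0
    for (arg <- args) result += arg
    result
  }

  val direct: Int = sum(1, 2, 3, 4)
  // A sequence argument: the list's elements replace the repeated parameter.
  val spliced: Int = sum(List(1, 2, 3, 4): _*)
  // sum(List(1, 2, 3, 4)) is rejected: a List[Int] is not an Int.

  def main(args: Array[String]): Unit = println((direct, spliced))
}
```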

Named and Default Arguments

If an application uses named arguments $p = e$ or default arguments, the following conditions must hold.

  • For every named argument $p_i = e_i$ which appears left of a positional argument in the argument list $e_1 \ldots e_m$, the argument position $i$ coincides with the position of parameter $p_i$ in the parameter list of the applied function.
  • The names $x_i$ of all named arguments are pairwise distinct and no named argument defines a parameter which is already specified by a positional argument.
  • Every formal parameter $p_j:T_j$ which is not specified by either a positional or a named argument has a default argument.

If the application uses named or default arguments the following transformation is applied to convert it into an application without named or default arguments.

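If the function $f$ has the form `$p.m$[$\mathit{targs}$]` it is transformed into the block

    { val q = $p$
      q.$m$[$\mathit{targs}$]
    }

If the function $f$ is itself an application expression the transformation is applied recursively on $f$. The result of transforming $f$ is a block of the form

    { val q = $p$
      val $x_1$ = expr$_1$
      $\ldots$
      val $x_k$ = expr$_k$
      q.$m$[$\mathit{targs}$]($\mathit{args}_1$)$, \ldots ,$($\mathit{args}_l$)
    }

where every argument in $(\mathit{args}_1) , \ldots , (\mathit{args}_l)$ is a reference to one of the values $x_1 , \ldots , x_k$. To integrate the current application into the block, first a value definition using a fresh name $y_i$ is created for every argument in $e_1 , \ldots , e_m$, which is initialised to $e_i$ for positional arguments and to $e'_i$ for named arguments of the form `$x_i=e'_i$`. Then, for every parameter which is not specified by the argument list, a value definition using a fresh name $z_i$ is created, which is initialized using the method computing the default argument of this parameter.

Let $\mathit{args}$ be a permutation of the generated names $y_i$ and $z_i$ such that the position of each name matches the position of its corresponding parameter in the method type `($p_1:T_1 , \ldots , p_n:T_n$)$U$`. The final result of the transformation is a block of the form

    { val q = $p$
      val $x_1$ = expr$_1$
      $\ldots$
      val $x_k$ = expr$_k$
      val $y_1$ = $e_1$
      $\ldots$
      val $y_m$ = $e_m$
      val $z_1$ = q.$m\$default\$i$[$\mathit{targs}$]($\mathit{args}_1$)$, \ldots ,$($\mathit{args}_l$)
      $\ldots$
      val $z_d$ = q.$m\$default\$j$[$\mathit{targs}$]($\mathit{args}_1$)$, \ldots ,$($\mathit{args}_l$)
      q.$m$[$\mathit{targs}$]($\mathit{args}_1$)$, \ldots ,$($\mathit{args}_l$)($\mathit{args}$)
    }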
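One observable consequence of this transformation is that argument expressions are evaluated in the order in which they are written at the call site, and only afterwards permuted into parameter order. The sketch below illustrates this under that reading; `NamedArgsDemo`, `sub` and `trace` are hypothetical names.

```scala
import scala.collection.mutable.ListBuffer

object NamedArgsDemo {
  def sub(x: Int, y: Int = 1): Int = x - y

  // Returns the call's result together with the order in which
  // the argument expressions were evaluated.
  def run(): (Int, List[String]) = {
    val log = ListBuffer.empty[String]
    def trace(label: String, value: Int): Int = { log += label; value }
    val result = sub(y = trace("y", 2), x = trace("x", 10))
    (result, log.toList)
  }

  // The unspecified parameter y is filled in from its default argument.
  val withDefault: Int = sub(5)

  def main(args: Array[String]): Unit = {
    println(run())
    println(withDefault)
  }
}
```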

Signature Polymorphic Methods

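For invocations of signature polymorphic methods of the target platform `$f$($e_1 , \ldots , e_m$)`, the invoked function has a different method type `($p_1$:$T_1 , \ldots , p_n$:$T_n$)$R$` at each call site. The parameter types `$T_1 , \ldots , T_n$` are the types of the argument expressions `$e_1 , \ldots , e_m$` and `$R$` is the expected type at the call site. If the expected type is undefined then `$R$` is `scala.AnyRef`. The parameter names `$p_1 , \ldots , p_n$` are fresh.

Note

On the Java platform version 7 and later, the methods `invoke` and `invokeExact` in class `java.lang.invoke.MethodHandle` are signature polymorphic.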

Method Values

The expression `$e$ _` is well-formed if $e$ is of method type or if $e$ is a call-by-name parameter. If $e$ is a method with parameters, `$e$ _` represents $e$ converted to a function type by eta expansion. If $e$ is a parameterless method or call-by-name parameter of type `=>$T$`, `$e$ _` represents the function of type `() => $T$`, which evaluates $e$ when it is applied to the empty parameter list `()`.

Example

The method values in the left column are each equivalent to the eta-expanded expressions on the right.

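| placeholder syntax | eta-expansion |
|---|---|
| `math.sin _` | `x => math.sin(x)` |
| `math.pow _` | `(x1, x2) => math.pow(x1, x2)` |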

Note that a space is necessary between a method name and the trailing underscore because otherwise the underscore would be considered part of the name.
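As a small illustration of method values, the sketch below compares the trailing-underscore form with a hand-written eta-expansion. The object and field names are assumptions; `math.sin` and `math.pow` are the standard library methods.

```scala
object MethodValueDemo {
  // Method values written with a trailing underscore.
  val sinValue: Double => Double = math.sin _
  val powValue: (Double, Double) => Double = math.pow _

  // The corresponding eta-expanded anonymous functions.
  val sinExpanded: Double => Double = x => math.sin(x)
  val powExpanded: (Double, Double) => Double = (x1, x2) => math.pow(x1, x2)

  def main(args: Array[String]): Unit = {
    println(sinValue(0.5) == sinExpanded(0.5))
    println(powValue(2.0, 10.0) == powExpanded(2.0, 10.0))
  }
}
```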

Type Applications

A type application `$e$[$T_1 , \ldots , T_n$]` instantiates a polymorphic value $e$ of type `[$a_1$ >: $L_1$ <: $U_1, \ldots , a_n$ >: $L_n$ <: $U_n$]$S$` with argument types `$T_1 , \ldots , T_n$`. Every argument type $T_i$ must obey the corresponding bounds $L_i$ and $U_i$. That is, for each $i = 1 , \ldots , n$, we must have $\sigma L_i <: T_i <: \sigma U_i$, where $\sigma$ is the substitution $[a_1 := T_1 , \ldots , a_n := T_n]$. The type of the application is $\sigma S$.

If the function part $e$ is of some value type, the type application is taken to be equivalent to `$e$.apply[$T_1 , \ldots , T_n$]`, i.e. the application of an `apply` method defined by $e$.

Type applications can be omitted if local type inference can infer best type parameters for a polymorphic function from the types of the actual function arguments and the expected result type.

Tuples

A tuple expression `($e_1 , \ldots , e_n$)` is an alias for the class instance creation `scala.Tuple$n$($e_1 , \ldots , e_n$)`, where $n \geq 2$. The empty tuple `()` is the unique value of type `scala.Unit`.

Instance Creation Expressions

A simple instance creation expression is of the form `new $c$` where $c$ is a constructor invocation. Let $T$ be the type of $c$. Then $T$ must denote (a type instance of) a non-abstract subclass of `scala.AnyRef`. Furthermore, the concrete self type of the expression must conform to the self type of the class denoted by $T$. The concrete self type is normally $T$, except if the expression `new $c$` appears as the right hand side of a value definition

    val $x$: $S$ = new $c$

(where the type annotation `: $S$` may be missing). In the latter case, the concrete self type of the expression is the compound type `$T$ with $x$.type`.

The expression is evaluated by creating a fresh object of type $T$ which is initialized by evaluating $c$. The type of the expression is $T$.

A general instance creation expression is of the form `new $t$` for some class template $t$. Such an expression is equivalent to the block

    { class $a$ extends $t$; new $a$ }

where $a$ is a fresh name of an anonymous class which is inaccessible to user programs.

There is also a shorthand form for creating values of structural types: If `{$D$}` is a class body, then `new {$D$}` is equivalent to the general instance creation expression `new AnyRef{$D$}`.

Example

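Consider the following structural instance creation expression:

    new { def getName() = "aaron" }

This is a shorthand for the general instance creation expression

    new AnyRef{ def getName() = "aaron" }

The latter is in turn a shorthand for the block

    { class anon$X extends AnyRef{ def getName() = "aaron" }; new anon$X }

where `anon$X` is some freshly created name.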

Blocks

A block expression `{$s_1$; $\ldots$; $s_n$; $e\,$}` is constructed from a sequence of block statements $s_1 , \ldots , s_n$ and a final expression $e$. The statement sequence may not contain two definitions or declarations that bind the same name in the same namespace. The final expression can be omitted, in which case the unit value `()` is assumed.

The expected type of the final expression $e$ is the expected type of the block. The expected type of all preceding statements is undefined.

The type of a block `$s_1$; $\ldots$; $s_n$; $e$` is `$T$ forSome {$\,Q\,$}`, where $T$ is the type of $e$ and $Q$ contains existential clauses for every value or type name which is free in $T$ and which is defined locally in one of the statements $s_1 , \ldots , s_n$. We say the existential clause binds the occurrence of the value or type name. Specifically,

  • A locally defined type definition `type$\;t = T$` is bound by the existential clause `type$\;t >: T <: T$`. It is an error if $t$ carries type parameters.
  • A locally defined value definition `val$\;x: T = e$` is bound by the existential clause `val$\;x: T$`.
  • A locally defined class definition `class$\;c$ extends$\;t$` is bound by the existential clause `type$\;c <: T$` where $T$ is the least class type or refinement type which is a proper supertype of the type $c$. It is an error if $c$ carries type parameters.
  • A locally defined object definition `object$\;x\;$extends$\;t$` is bound by the existential clause `val$\;x: T$` where $T$ is the least class type or refinement type which is a proper supertype of the type `$x$.type`.

Evaluation of the block entails evaluation of its statement sequence, followed by an evaluation of the final expression $e$, which defines the result of the block.

Example

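Assuming a class `Ref[T](x: T)`, the block

    { class C extends B {$\ldots$} ; new Ref(new C) }

has the type `Ref[_1] forSome { type _1 <: B }`. The block

    { class C extends B {$\ldots$} ; new C }

simply has type `B`, because with the rules here the existentially quantified type `_1 forSome { type _1 <: B }` can be simplified to `B`.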

Prefix, Infix, and Postfix Operations

Expressions can be constructed from operands and operators.

Prefix Operations

A prefix operation $\mathit{op}\;e$ consists of a prefix operator $\mathit{op}$, which must be one of the identifiers ‘`+`’, ‘`-`’, ‘`!`’ or ‘`~`’. The expression $\mathit{op}\;e$ is equivalent to the postfix method application `e.unary_$\mathit{op}$`.

Prefix operators are different from normal function applications in that their operand expression need not be atomic. For instance, the input sequence `-sin(x)` is read as `-(sin(x))`, whereas the function application `negate sin(x)` would be parsed as the application of the infix operator `sin` to the operands `negate` and `(x)`.

Postfix Operations

A postfix operator can be an arbitrary identifier. The postfix operation $e\;\mathit{op}$ is interpreted as $e.\mathit{op}$.

Infix Operations

An infix operator can be an arbitrary identifier. Infix operators have precedence and associativity defined as follows:

The precedence of an infix operator is determined by the operator's first character. Characters are listed below in increasing order of precedence, with characters on the same line having the same precedence.

    (all letters)
    |
    ^
    &
    = !
    < >
    :
    + -
    * / %
    (all other special characters)

That is, operators starting with a letter have lowest precedence, followed by operators starting with `|`, etc.

There's one exception to this rule, which concerns assignment operators. The precedence of an assignment operator is the same as the one of simple assignment `(=)`. That is, it is lower than the precedence of any other operator.

The associativity of an operator is determined by the operator's last character. Operators ending in a colon `:` are right-associative. All other operators are left-associative.

Precedence and associativity of operators determine the grouping of parts of an expression as follows.

  • If there are several infix operations in an expression, then operators with higher precedence bind more closely than operators with lower precedence.
  • If there are consecutive infix operations $e_0 \; \mathit{op}_1 \; e_1 \; \mathit{op}_2 \ldots \mathit{op}_n \; e_n$ with operators $\mathit{op}_1 , \ldots , \mathit{op}_n$ of the same precedence, then all these operators must have the same associativity. If all operators are left-associative, the sequence is interpreted as $(\ldots(e_0 \; \mathit{op}_1 \; e_1) \; \mathit{op}_2 \ldots) \; \mathit{op}_n \; e_n$. Otherwise, if all operators are right-associative, the sequence is interpreted as $e_0 \; \mathit{op}_1 \; (e_1 \; \mathit{op}_2 \; (\ldots \mathit{op}_n \; e_n)\ldots)$.
  • Postfix operators always have lower precedence than infix operators. E.g. $e_1 \; \mathit{op}_1 \; e_2 \; \mathit{op}_2$ is always equivalent to $(e_1 \; \mathit{op}_1 \; e_2) \; \mathit{op}_2$.

The right-hand operand of a left-associative operator may consist of several arguments enclosed in parentheses, e.g. $e \; \mathit{op} \; (e_1,\ldots,e_n)$. This expression is then interpreted as $e.\mathit{op}(e_1,\ldots,e_n)$.

A left-associative binary operation $e_1 \; \mathit{op} \; e_2$ is interpreted as $e_1.\mathit{op}(e_2)$. If $\mathit{op}$ is right-associative, the same operation is interpreted as `{ val $x$=$e_1$; $e_2$.$\mathit{op}$($x\,$) }`, where $x$ is a fresh name.
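The grouping rules can be made visible with a small class whose operators record how they were applied. This is an illustrative sketch; `PrecedenceDemo`, `Expr` and the chosen operator set are assumptions.

```scala
object PrecedenceDemo {
  final case class Expr(text: String) {
    def +(that: Expr): Expr = Expr(s"($text + ${that.text})")
    def *(that: Expr): Expr = Expr(s"($text * ${that.text})")
    def |(that: Expr): Expr = Expr(s"($text | ${that.text})")
    // Ends in ':', so `l +: r` is invoked on r with l as the argument.
    def +:(left: Expr): Expr = Expr(s"(${left.text} +: $text)")
  }

  val a = Expr("a")
  val b = Expr("b")
  val c = Expr("c")

  val mixed: String = (a + b * c).text       // '*' binds tighter than '+'
  val lowest: String = (a | b + c).text      // '|' binds looser than '+'
  val leftAssoc: String = (a + b + c).text   // left-associative grouping
  val rightAssoc: String = (a +: b +: c).text // right-associative grouping

  def main(args: Array[String]): Unit =
    List(mixed, lowest, leftAssoc, rightAssoc).foreach(println)
}
```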

Assignment Operators

An assignment operator is an operator symbol (syntax category `op` in Identifiers) that ends in an equals character “`=`”, with the exception of operators for which one of the following conditions holds:

  1. the operator also starts with an equals character, or
  2. the operator is one of `(<=)`, `(>=)`, `(!=)`.

Assignment operators are treated specially in that they can be expanded to assignments if no other interpretation is valid.

Let's consider an assignment operator such as `+=` in an infix operation `$l$ += $r$`, where $l$, $r$ are expressions. This operation can be re-interpreted as an operation which corresponds to the assignment

    $l$ = $l$ + $r$

except that the operation's left-hand-side $l$ is evaluated only once.

The re-interpretation occurs if the following two conditions are fulfilled.

  1. The left-hand-side $l$ does not have a member named `+=`, and also cannot be converted by an implicit conversion to a value with a member named `+=`.
  2. The assignment `$l$ = $l$ + $r$` is type-correct. In particular this implies that $l$ refers to a variable or object that can be assigned to, and that is convertible to a value with a member named `+`.
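Both interpretations of an assignment operator can be seen in the following sketch. The class `Counter` and the helper methods are hypothetical; the first call resolves to a user-defined `+=` member, while the second relies on the expansion to a plain assignment because `Vector` has no `:+=` member.

```scala
object AssignOpDemo {
  // Defines its own += member, so `c += 7` is an ordinary method call.
  final class Counter(var i: Int) {
    def +=(j: Int): Counter = { i += j; this }
  }

  def viaMember(): Int = {
    val c = new Counter(6) // a val suffices: nothing is reassigned
    c += 7
    c.i
  }

  // Vector has no `:+=` member, so `xs :+= 3` is expanded to `xs = xs :+ 3`,
  // which requires xs to be a var.
  def viaExpansion(): Vector[Int] = {
    var xs = Vector(1, 2)
    xs :+= 3
    xs
  }

  def main(args: Array[String]): Unit = {
    println(viaMember())
    println(viaExpansion())
  }
}
```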

Typed Expressions

The typed expression $e: T$ has type $T$. The type of expression $e$ is expected to conform to $T$. The result of the expression is the value of $e$ converted to type $T$.

Example

Here are examples of well-typed and ill-typed expressions.

Annotated Expressions

An annotated expression `$e$: @$a_1$ $\ldots$ @$a_n$` attaches annotations $a_1 , \ldots , a_n$ to the expression $e$.

Assignments

The interpretation of an assignment to a simple variable `$x$ = $e$` depends on the definition of $x$. If $x$ denotes a mutable variable, then the assignment changes the current value of $x$ to be the result of evaluating the expression $e$. The type of $e$ is expected to conform to the type of $x$. If $x$ is a parameterless function defined in some template, and the same template contains a setter function `$x$_=` as member, then the assignment `$x$ = $e$` is interpreted as the invocation `$x$_=($e\,$)` of that setter function. Analogously, an assignment `$f.x$ = $e$` to a parameterless function $x$ is interpreted as the invocation `$f.x$_=($e\,$)`.

An assignment `$f$($\mathit{args}\,$) = $e$` with a function application to the left of the ‘`=`’ operator is interpreted as `$f.$update($\mathit{args}$, $e\,$)`, i.e. the invocation of an `update` function defined by $f$.
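The setter and `update` interpretations can be sketched with two small classes. `Thermostat`, `Grid` and their members are hypothetical names chosen for illustration.

```scala
object AssignmentDemo {
  // A getter/setter pair: `t.celsius = v` is interpreted as `t.celsius_=(v)`.
  final class Thermostat {
    private var current = 0.0
    def celsius: Double = current
    def celsius_=(value: Double): Unit = { current = value }
  }

  // apply/update: `g(i, j) = v` is interpreted as `g.update(i, j, v)`.
  final class Grid(size: Int) {
    private val cells = Array.ofDim[Int](size, size)
    def apply(i: Int, j: Int): Int = cells(i)(j)
    def update(i: Int, j: Int, value: Int): Unit = { cells(i)(j) = value }
  }

  def run(): (Double, Int) = {
    val t = new Thermostat
    t.celsius = 21.5
    val g = new Grid(3)
    g(1, 2) = 7
    (t.celsius, g(1, 2))
  }

  def main(args: Array[String]): Unit = println(run())
}
```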

Example

Here are some assignment expressions and their equivalent expansions.

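| assignment | expansion |
|---|---|
| `x.f = e` | `x.f_=(e)` |
| `x.f() = e` | `x.f.update(e)` |
| `x.f(i) = e` | `x.f.update(i, e)` |
| `x.f(i, j) = e` | `x.f.update(i, j, e)` |

Example Imperative Matrix Multiplication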

Here is the usual imperative code for matrix multiplication.

Desugaring the array accesses and assignments yields the following expanded version:
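A sketch of such code, in both its usual and its desugared form, might look like this. The object name `MatMulDemo` and the use of `Array.ofDim` for allocation are assumptions; the second method spells out every array access as `apply` and every element assignment as `update`.

```scala
object MatMulDemo {
  def matmul(xss: Array[Array[Double]], yss: Array[Array[Double]]): Array[Array[Double]] = {
    val zss = Array.ofDim[Double](xss.length, yss(0).length)
    var i = 0
    while (i < xss.length) {
      var j = 0
      while (j < yss(0).length) {
        var acc = 0.0
        var k = 0
        while (k < yss.length) {
          acc = acc + xss(i)(k) * yss(k)(j)
          k += 1
        }
        zss(i)(j) = acc
        j += 1
      }
      i += 1
    }
    zss
  }

  // Same algorithm with array accesses and assignments written out explicitly.
  def matmulDesugared(xss: Array[Array[Double]], yss: Array[Array[Double]]): Array[Array[Double]] = {
    val zss = Array.ofDim[Double](xss.length, yss.apply(0).length)
    var i = 0
    while (i < xss.length) {
      var j = 0
      while (j < yss.apply(0).length) {
        var acc = 0.0
        var k = 0
        while (k < yss.length) {
          acc = acc + xss.apply(i).apply(k) * yss.apply(k).apply(j)
          k += 1
        }
        zss.apply(i).update(j, acc)
        j += 1
      }
      i += 1
    }
    zss
  }

  def main(args: Array[String]): Unit = {
    val a = Array(Array(1.0, 2.0), Array(3.0, 4.0))
    val b = Array(Array(5.0, 6.0), Array(7.0, 8.0))
    println(matmul(a, b).map(_.mkString(" ")).mkString("\n"))
  }
}
```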

Conditional Expressions

The conditional expression `if ($e_1$) $e_2$ else $e_3$` chooses one of the values of $e_2$ and $e_3$, depending on the value of $e_1$. The condition $e_1$ is expected to conform to type `Boolean`. The then-part $e_2$ and the else-part $e_3$ are both expected to conform to the expected type of the conditional expression. The type of the conditional expression is the weak least upper bound of the types of $e_2$ and $e_3$. A semicolon preceding the `else` symbol of a conditional expression is ignored.

The conditional expression is evaluated by evaluating first $e_1$. If this evaluates to `true`, the result of evaluating $e_2$ is returned, otherwise the result of evaluating $e_3$ is returned.

A short form of the conditional expression eliminates the else-part. The conditional expression `if ($e_1$) $e_2$` is evaluated as if it was `if ($e_1$) $e_2$ else ()`.

While Loop Expressions

The while loop expression `while ($e_1$) $e_2$` is typed and evaluated as if it was an application of `whileLoop ($e_1$) ($e_2$)` where the hypothetical function `whileLoop` is defined as follows.
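A definition along these lines can be sketched with two by-name parameters, so that both the condition and the body are re-evaluated on every iteration. The wrapping object and the `sumBelow` helper are illustrative assumptions.

```scala
object WhileLoopDemo {
  // Both parameters are by-name: cond and body are re-evaluated at each use.
  def whileLoop(cond: => Boolean)(body: => Unit): Unit =
    if (cond) { body; whileLoop(cond)(body) } else ()

  // Sums the integers 0 until n using whileLoop instead of the built-in while.
  def sumBelow(n: Int): Int = {
    var i = 0
    var total = 0
    whileLoop(i < n) {
      total += i
      i += 1
    }
    total
  }

  def main(args: Array[String]): Unit = println(sumBelow(5))
}
```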

Do Loop Expressions

The do loop expression `do $e_1$ while ($e_2$)` is typed and evaluated as if it was the expression `($e_1$ ; while ($e_2$) $e_1$)`. A semicolon preceding the `while` symbol of a do loop expression is ignored.

For Comprehensions and For Loops

A for loop `for ($\mathit{enums}\,$) $e$` executes expression $e$ for each binding generated by the enumerators $\mathit{enums}$. A for comprehension `for ($\mathit{enums}\,$) yield $e$` evaluates expression $e$ for each binding generated by the enumerators $\mathit{enums}$ and collects the results. An enumerator sequence always starts with a generator; this can be followed by further generators, value definitions, or guards. A generator `$p$ <- $e$` produces bindings from an expression $e$ which is matched in some way against pattern $p$. A value definition `$p$ = $e$` binds the value name $p$ (or several names in a pattern $p$) to the result of evaluating the expression $e$. A guard `if $e$` contains a boolean expression which restricts enumerated bindings. The precise meaning of generators and guards is defined by translation to invocations of four methods: `map`, `withFilter`, `flatMap`, and `foreach`. These methods can be implemented in different ways for different carrier types.

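The translation scheme is as follows. In a first step, every generator `$p$ <- $e$`, where $p$ is not irrefutable for the type of $e$ is replaced by

    $p$ <- $e$.withFilter { case $p$ => true; case _ => false }

Then, the following rules are applied repeatedly until all comprehensions have been eliminated.

  • A for comprehension `for ($p$ <- $e\,$) yield $e'$` is translated to `$e$.map { case $p$ => $e'$ }`.
  • A for loop `for ($p$ <- $e\,$) $e'$` is translated to `$e$.foreach { case $p$ => $e'$ }`.
  • A for comprehension

        for ($p$ <- $e$; $p'$ <- $e'; \ldots$) yield $e''$

    where `$\ldots$` is a (possibly empty) sequence of generators, definitions, or guards, is translated to

        $e$.flatMap { case $p$ => for ($p'$ <- $e'; \ldots$) yield $e''$ }

  • A for loop

        for ($p$ <- $e$; $p'$ <- $e'; \ldots$) $e''$

    where `$\ldots$` is a (possibly empty) sequence of generators, definitions, or guards, is translated to

        $e$.foreach { case $p$ => for ($p'$ <- $e'; \ldots$) $e''$ }

  • A generator `$p$ <- $e$` followed by a guard `if $g$` is translated to a single generator `$p$ <- $e$.withFilter(($x_1 , \ldots , x_n$) => $g\,$)` where $x_1 , \ldots , x_n$ are the free variables of $p$.

  • A generator `$p$ <- $e$` followed by a value definition `$p'$ = $e'$` is translated to the following generator of pairs of values, where $x$ and $x'$ are fresh names:

        ($p$, $p'$) <- for ($x @ p$ <- $e$) yield { val $x' @ p'$ = $e'$; ($x$, $x'$) }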
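To see the rules working together, the sketch below writes one comprehension with a generator, a guard and a value definition, next to a hand translation into `withFilter` and `map` calls. The translation is written by hand following the rules above and is not compiler output; all names are illustrative.

```scala
object ForTranslationDemo {
  // Generator, guard and value definition in one comprehension.
  def sugared(xs: List[Int]): List[Int] =
    for {
      x <- xs
      if x % 2 == 0
      y = x * x
    } yield y + 1

  // Hand translation: the guard becomes withFilter, the value definition
  // becomes a generator of pairs, and the final yield becomes map.
  def translated(xs: List[Int]): List[Int] =
    xs.withFilter(x => x % 2 == 0)
      .map { x => val y = x * x; (x, y) }
      .map { case (x, y) => y + 1 }

  def main(args: Array[String]): Unit = {
    println(sugared(List(1, 2, 3, 4)))
    println(translated(List(1, 2, 3, 4)))
  }
}
```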

Example

The following code produces all pairs of numbers between $1$ and $n-1$ whose sums are prime.

The for comprehension is translated to:
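A runnable sketch of this example, together with a hand-written translation, is given below. The `isPrime` helper is a naive trial-division test introduced here as an assumption, and the object and method names are illustrative.

```scala
object PrimePairsDemo {
  // Naive primality test, sufficient for small inputs.
  def isPrime(n: Int): Boolean = n > 1 && (2 until n).forall(n % _ != 0)

  // All pairs (i, j) with 1 <= j < i < n whose sum is prime.
  def sugared(n: Int): IndexedSeq[(Int, Int)] =
    for {
      i <- 1 until n
      j <- 1 until i
      if isPrime(i + j)
    } yield (i, j)

  // The same computation written with flatMap, withFilter and map.
  def translated(n: Int): IndexedSeq[(Int, Int)] =
    (1 until n).flatMap { case i =>
      (1 until i)
        .withFilter { j => isPrime(i + j) }
        .map { case j => (i, j) }
    }

  def main(args: Array[String]): Unit = println(sugared(5))
}
```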

Example

For comprehensions can be used to express vector and matrix algorithms concisely. For instance, here is a function to compute the transpose of a given matrix:

Here is a function to compute the scalar product of two vectors:

Finally, here is a function to compute the product of two matrices. Compare with the imperative version.

The code above makes use of the fact that `map`, `flatMap`, `withFilter`, and `foreach` are defined for instances of class `scala.Array`.
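The three functions described in this example might be sketched as follows. The function names and the restriction to `Array[Array[Double]]` are assumptions made to keep the sketch self-contained.

```scala
object ForMatrixDemo {
  // Column i of xss becomes row i of the result.
  def transpose(xss: Array[Array[Double]]): Array[Array[Double]] =
    for (i <- Array.range(0, xss(0).length)) yield
      for (xs <- xss) yield xs(i)

  // Sum of pairwise products of two vectors.
  def scalprod(xs: Array[Double], ys: Array[Double]): Double = {
    var acc = 0.0
    for ((x, y) <- xs zip ys) acc = acc + x * y
    acc
  }

  // Each result entry is the scalar product of a row of xss and a column of yss.
  def matmul(xss: Array[Array[Double]], yss: Array[Array[Double]]): Array[Array[Double]] = {
    val ysst = transpose(yss)
    for (xs <- xss) yield
      for (yst <- ysst) yield
        scalprod(xs, yst)
  }

  def main(args: Array[String]): Unit = {
    val a = Array(Array(1.0, 2.0), Array(3.0, 4.0))
    val b = Array(Array(5.0, 6.0), Array(7.0, 8.0))
    println(matmul(a, b).map(_.mkString(" ")).mkString("\n"))
  }
}
```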

Return Expressions

A return expression `return $e$` must occur inside the body of some enclosing named method or function. The innermost enclosing named method or function in a source program, $f$, must have an explicitly declared result type, and the type of $e$ must conform to it. The return expression evaluates the expression $e$ and returns its value as the result of $f$. The evaluation of any statements or expressions following the return expression is omitted. The type of a return expression is `scala.Nothing`.

The expression $e$ may be omitted. The return expression `return` is type-checked and evaluated as if it was `return ()`.

An `apply` method which is generated by the compiler as an expansion of an anonymous function does not count as a named function in the source program, and therefore is never the target of a return expression.

Returning from a nested anonymous function is implemented by throwing and catching a `scala.runtime.NonLocalReturnException`. Any exception catches between the point of return and the enclosing methods might see the exception. A key comparison makes sure that these exceptions are only caught by the method instance which is terminated by the return.

If the return expression is itself part of an anonymous function, it is possible that the enclosing instance of $f$ has already returned before the return expression is executed. In that case, the thrown `scala.runtime.NonLocalReturnException` will not be caught, and will propagate up the call stack.

Throw Expressions

A throw expression `throw $e$` evaluates the expression $e$. The type of this expression must conform to `Throwable`. If $e$ evaluates to an exception reference, evaluation is aborted with the thrown exception. If $e$ evaluates to `null`, evaluation is instead aborted with a `NullPointerException`. If there is an active `try` expression which handles the thrown exception, evaluation resumes with the handler; otherwise the thread executing the `throw` is aborted. The type of a throw expression is `scala.Nothing`.

Try Expressions

A try expression is of the form where the handler $h$ is a pattern matching anonymous function

This expression is evaluated by evaluating the block $b$. If evaluation of $b$ does not cause an exception to be thrown, the result of $b$ is returned. Otherwise the handler $h$ is applied to the thrown exception. If the handler contains a case matching the thrown exception, the first such case is invoked. If the handler contains no case matching the thrown exception, the exception is re-thrown.

Let $\mathit{pt}$ be the expected type of the try expression. The block $b$ is expected to conform to $\mathit{pt}$. The handler $h$ is expected to conform to type `scala.PartialFunction[scala.Throwable, $\mathit{pt}\,$]`. The type of the try expression is the weak least upper bound of the type of $b$ and the result type of $h$.

A try expression `try { $b$ } finally $e$` evaluates the block $b$. If evaluation of $b$ does not cause an exception to be thrown, the expression $e$ is evaluated. If an exception is thrown during evaluation of $e$, the evaluation of the try expression is aborted with the thrown exception. If no exception is thrown during evaluation of $e$, the result of $b$ is returned as the result of the try expression.

If an exception is thrown during evaluation of $b$, the finally block $e$ is also evaluated. If another exception is thrown during evaluation of $e$, evaluation of the try expression is aborted with that exception. If no exception is thrown during evaluation of $e$, the original exception thrown in $b$ is re-thrown once evaluation of $e$ has completed. The block $b$ is expected to conform to the expected type of the try expression. The finally expression $e$ is expected to conform to type `Unit`.

A try expression `try { $b$ } catch $e_1$ finally $e_2$` is a shorthand for `try { try { $b$ } catch $e_1$ } finally $e_2$`.
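The evaluation order of block, handler, and finally expression can be observed with a small sketch. The names `TryDemo`, `events`, and `safeDiv` are hypothetical; the event log exists only to make the order visible.

```scala
import scala.collection.mutable.ListBuffer

object TryDemo {
  // Records which parts of the try expression were evaluated, in order.
  val events: ListBuffer[String] = ListBuffer.empty

  def safeDiv(a: Int, b: Int): Int =
    try {
      a / b
    } catch {
      // The handler is a pattern matching anonymous function; the first
      // matching case supplies the result of the try expression.
      case _: ArithmeticException =>
        events += "caught"
        0
    } finally {
      // Evaluated whether or not the block threw; its value is not the result.
      events += "finally"
      ()
    }
}
```

Calling `safeDiv(6, 3)` and then `safeDiv(1, 0)` should leave `events` holding `finally`, `caught`, `finally`, since the handler runs before the finally expression in the second call.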

Anonymous Functions

The anonymous function `($x_1$: $T_1 , \ldots , x_n$: $T_n$) => $e$` maps parameters $x_i$ of types $T_i$ to a result given by expression $e$. The scope of each formal parameter $x_i$ is $e$. Formal parameters must have pairwise distinct names.

If the expected type of the anonymous function is of the form `scala.Function$n$[$S_1 , \ldots , S_n$, $R\,$]`, the expected type of $e$ is $R$ and the type $T_i$ of any of the parameters $x_i$ can be omitted, in which case `$T_i$ = $S_i$` is assumed. If the expected type of the anonymous function is some other type, all formal parameter types must be explicitly given, and the expected type of $e$ is undefined. The type of the anonymous function is `scala.Function$n$[$S_1 , \ldots , S_n$, $T\,$]`, where $T$ is the packed type of $e$. $T$ must be equivalent to a type which does not refer to any of the formal parameters $x_i$.

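The anonymous function is evaluated as the instance creation expression `new scala.Function$n$[$T_1 , \ldots , T_n$, $T$] { def apply($x_1$: $T_1 , \ldots , x_n$: $T_n$): $T$ = $e$ }`.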

In the case of a single untyped formal parameter, `($x\,$) => $e$` can be abbreviated to `$x$ => $e$`. If an anonymous function `($x$: $T\,$) => $e$` with a single typed parameter appears as the result expression of a block, it can be abbreviated to `$x$: $T$ => $e$`.

A formal parameter may also be a wildcard represented by an underscore `_`. In that case, a fresh name for the parameter is chosen arbitrarily.

A named parameter of an anonymous function may be optionally preceded by an `implicit` modifier. In that case the parameter is labeled `implicit`; however the parameter section itself does not count as an implicit parameter section in the sense defined here. Hence, arguments to anonymous functions always have to be given explicitly.

Example

Examples of anonymous functions:
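A short sketch of typical forms follows. The value names and the enclosing object are hypothetical; expected types are written out so that parameter types can be omitted, as the rules above allow.

```scala
object AnonymousFunctionExamples {
  // The identity function; the parameter type comes from the expected type.
  val identityInt: Int => Int = x => x

  // Curried function composition.
  val compose: (Int => Int) => (Int => Int) => Int => Int =
    f => g => x => f(g(x))

  // A summation function with explicitly typed parameters.
  val sum = (x: Int, y: Int) => x + y

  // A function with an empty parameter list that increments a counter
  // and returns the new value.
  var count = 0
  val next = () => { count += 1; count }

  // A function that ignores its argument, using a wildcard parameter.
  val constantFive: Any => Int = _ => 5
}
```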

Placeholder Syntax for Anonymous Functions

An expression (of syntactic category `Expr`) may contain embedded underscore symbols `_` at places where identifiers are legal. Such an expression represents an anonymous function where subsequent occurrences of underscores denote successive parameters.

Define an underscore section to be an expression of the form `_:$T$` where $T$ is a type, or else of the form `_`, provided the underscore does not appear as the expression part of a type ascription `_:$T$`.

An expression $e$ of syntactic category `Expr` binds an underscore section $u$, if the following two conditions hold: (1) $e$ properly contains $u$, and (2) there is no other expression of syntactic category `Expr` which is properly contained in $e$ and which itself properly contains $u$.

If an expression $e$ binds underscore sections $u_1 , \ldots , u_n$, in this order, it is equivalent to the anonymous function `($u'_1 , \ldots , u'_n$) => $e'$` where each $u_i'$ results from $u_i$ by replacing the underscore with a fresh identifier and $e'$ results from $e$ by replacing each underscore section $u_i$ by $u_i'$.

Example

The anonymous functions in the left column use placeholder syntax. Each of these is equivalent to the anonymous function on its right.
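A few such pairs can be sketched as follows, with the placeholder form first and its expanded equivalent second. The value names are hypothetical and the expected types are added only so that each line compiles on its own.

```scala
object PlaceholderEquivalents {
  val inc1: Int => Int = _ + 1
  val inc2: Int => Int = x => x + 1

  // Successive underscores denote successive parameters.
  val mul1: (Int, Int) => Int = _ * _
  val mul2: (Int, Int) => Int = (x1, x2) => x1 * x2

  // A typed underscore section carries its own parameter type.
  val dbl1 = (_: Int) * 2
  val dbl2 = (x: Int) => (x: Int) * 2

  // Each underscore is bound by the innermost enclosing expression,
  // so the inner `_ + 1` becomes its own anonymous function.
  val bump1: List[Int] => List[Int] = _.map(_ + 1)
  val bump2: List[Int] => List[Int] = x => x.map(y => y + 1)
}
```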

Constant Expressions

Constant expressions are expressions that the Scala compiler can evaluate to a constant. The definition of "constant expression" depends on the platform, but they include at least the expressions of the following forms:

  • A literal of a value class, such as an integer
  • A string literal
  • A class constructed with `Predef.classOf`
  • An element of an enumeration from the underlying platform
  • A literal array, of the form `Array$(c_1 , \ldots , c_n)$`, where all of the $c_i$'s are themselves constant expressions
  • An identifier defined by a constant value definition.

Statements

Statements occur as parts of blocks and templates. A statement can be an import, a definition or an expression, or it can be empty. Statements used in the template of a class definition can also be declarations. An expression that is used as a statement can have an arbitrary value type. An expression statement $e$ is evaluated by evaluating $e$ and discarding the result of the evaluation.

Block statements may be definitions which bind local names in the block. The only modifier allowed in all block-local definitions is `implicit`. When prefixing a class or object definition, modifiers `abstract`, `final`, and `sealed` are also permitted.

Evaluation of a statement sequence entails evaluation of the statements in the order they are written.

Implicit Conversions

Implicit conversions can be applied to expressions whose type does not match their expected type, to qualifiers in selections, and to unapplied methods. The available implicit conversions are given in the next two sub-sections.

We say that a type $T$ is compatible to a type $U$ if $T$ weakly conforms to $U$ after applying eta-expansion and view applications.

Value Conversions

The following seven implicit conversions can be applied to an expression $e$ which has some value type $T$ and which is type-checked with some expected type $\mathit{pt}$.

Static Overloading Resolution

If an expression denotes several possible members of a class, overloading resolution is applied to pick a unique member.

Type Instantiation

An expression $e$ of polymorphic type [$a_1$ >: $L_1$ <: $U_1 , \ldots , a_n$ >: $L_n$ <: $U_n$]$T$ which does not appear as the function part of a type application is converted to a type instance of $T$ by determining with local type inference instance types $T_1 , \ldots , T_n$ for the type variables $a_1 , \ldots , a_n$ and implicitly embedding $e$ in the type application `$e$[$T_1 , \ldots , T_n$]`.

Numeric Widening

If $e$ has a primitive number type which weakly conforms to the expected type, it is widened to the expected type using one of the numeric conversion methods `toShort`, `toChar`, `toInt`, `toLong`, `toFloat`, `toDouble` defined here.

Numeric Literal Narrowing

If the expected type is `Byte`, `Short` or `Char`, and the expression $e$ is an integer literal fitting in the range of that type, it is converted to the same literal in that type.

Value Discarding

If $e$ has some value type and the expected type is `Unit`, $e$ is converted to the expected type by embedding it in the term `{ $e$; () }`.
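The widening, narrowing, and discarding conversions described so far can be seen in a few one-line definitions. This is a sketch with hypothetical names (`ValueConversionsDemo` and its members), not an exhaustive list of cases.

```scala
object ValueConversionsDemo {
  // Numeric literal narrowing: the Int literal 100 fits in the range of Byte.
  val small: Byte = 100

  // Numeric widening: an Int value is widened to the expected type.
  val i: Int = 7
  val wide: Long = i
  val real: Double = i

  // Value discarding: append returns a StringBuilder, but the expected type
  // is Unit, so the result is evaluated and then dropped.
  val sb = new StringBuilder
  def record(s: String): Unit = sb.append(s)
}
```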

View Application

If none of the previous conversions applies, and $e$'s type does not conform to the expected type $\mathit{pt}$, an attempt is made to convert $e$ to the expected type with a view.

Dynamic Member Selection

If none of the previous conversions applies, and $e$ is a prefix of a selection $e.x$, and $e$'s type conforms to class `scala.Dynamic`, then the selection is rewritten according to the rules for dynamic member selection.

Method Conversions

The following four implicit conversions can be applied to methods which are not applied to some argument list.

Evaluation

A parameterless method $m$ of type `=> $T$` is always converted to type $T$ by evaluating the expression to which $m$ is bound.

Implicit Application

If the method takes only implicit parameters, implicit arguments are passed following the rules here.

Eta Expansion

Otherwise, if the method is not a constructor, and the expected type $\mathit{pt}$ is a function type $(\mathit{Ts}') \Rightarrow T'$, eta-expansion is performed on the expression $e$.

Empty Application

Otherwise, if $e$ has method type $()T$, it is implicitly applied to the empty argument list, yielding $e()$.
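Three of these method conversions can be sketched in a few lines. All names below are hypothetical. Empty application is left out because current compilers restrict or deprecate it for methods defined in Scala.

```scala
object MethodConversionsDemo {
  // Evaluation: referring to a parameterless method evaluates its body.
  def answer: Int = 42
  val a: Int = answer

  // Implicit application: the method takes only implicit parameters,
  // so the implicit argument in scope is passed automatically.
  implicit val defaultName: String = "world"
  def greeting(implicit name: String): String = "hello, " + name
  val g: String = greeting

  // Eta-expansion: the expected type is a function type, so the method
  // reference is converted to a function value.
  def twice(x: Int): Int = x * 2
  val f: Int => Int = twice
}
```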

Overloading Resolution

If an identifier or selection $e$ references several members of a class, the context of the reference is used to identify a unique member. The way this is done depends on whether or not $e$ is used as a function. Let $\mathscr{A}$ be the set of members referenced by $e$.

Assume first that $e$ appears as a function in an application, as in `$e$($e_1 , \ldots , e_m$)`.

One first determines the set of functions that is potentially applicable based on the shape of the arguments.

The shape of an argument expression $e$, written $\mathit{shape}(e)$, is a type that is defined as follows:

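  • For a function expression `($p_1$: $T_1 , \ldots , p_n$: $T_n$) => $b$`: `(Any $, \ldots ,$ Any) => $\mathit{shape}(b)$`, where `Any` occurs $n$ times in the argument type.
  • For a named argument `$n$ = $e$`: $\mathit{shape}(e)$.
  • For all other expressions: `Nothing`.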

Let $\mathscr{B}$ be the set of alternatives in $\mathscr{A}$ that are applicable to expressions $(e_1 , \ldots , e_n)$ of types $(\mathit{shape}(e_1) , \ldots , \mathit{shape}(e_n))$. If there is precisely one alternative in $\mathscr{B}$, that alternative is chosen.

Otherwise, let $S_1 , \ldots , S_m$ be the vector of types obtained by typing each argument with an undefined expected type. For every member $m$ in $\mathscr{B}$ one determines whether it is applicable to expressions ($e_1 , \ldots , e_m$) of types $S_1 , \ldots , S_m$. It is an error if none of the members in $\mathscr{B}$ is applicable. If there is one single applicable alternative, that alternative is chosen. Otherwise, let $\mathscr{CC}$ be the set of applicable alternatives which don't employ any default argument in the application to $e_1 , \ldots , e_m$. It is again an error if $\mathscr{CC}$ is empty. Otherwise, one chooses the most specific alternative among the alternatives in $\mathscr{CC}$, according to the following definition of being "as specific as", and "more specific than":

  • A parameterized method $m$ of type `($p_1:T_1, \ldots , p_n:T_n$)$U$` is as specific as some other member $m'$ of type $S$ if $m'$ is applicable to arguments `($p_1 , \ldots , p_n$)` of types $T_1 , \ldots , T_n$.
  • A polymorphic method of type [$a_1$ >: $L_1$ <: $U_1 , \ldots , a_n$ >: $L_n$ <: $U_n$]$T$ is as specific as some other member of type $S$ if $T$ is as specific as $S$ under the assumption that for $i = 1 , \ldots , n$ each $a_i$ is an abstract type name bounded from below by $L_i$ and from above by $U_i$.
  • A member of any other type is always as specific as a parameterized method or a polymorphic method.
  • Given two members of types $T$ and $U$ which are neither parameterized nor polymorphic method types, the member of type $T$ is as specific as the member of type $U$ if the existential dual of $T$ conforms to the existential dual of $U$. Here, the existential dual of a polymorphic type [$a_1$ >: $L_1$ <: $U_1 , \ldots , a_n$ >: $L_n$ <: $U_n$]$T$ is `$T$ forSome { type $a_1$ >: $L_1$ <: $U_1$ $, \ldots ,$ type $a_n$ >: $L_n$ <: $U_n$ }`. The existential dual of every other type is the type itself.

The relative weight of an alternative $A$ over an alternative $B$ is a number from 0 to 2, defined as the sum of

  • 1 if $A$ is as specific as $B$, 0 otherwise, and
  • 1 if $A$ is defined in a class or object which is derived from the class or object defining $B$, 0 otherwise.

A class or object $C$ is derived from a class or object $D$ if one of the following holds:

  • $C$ is a subclass of $D$, or
  • $C$ is a companion object of a class derived from $D$, or
  • $D$ is a companion object of a class from which $C$ is derived.

An alternative $A$ is more specific than an alternative $B$ if the relative weight of $A$ over $B$ is greater than the relative weight of $B$ over $A$.

It is an error if there is no alternative in $\mathscr{CC}$ which is more specific than all other alternatives in $\mathscr{CC}$.

Assume next that $e$ appears as a function in a type application, as in `$e$[$\mathit{targs}\,$]`. Then all alternatives in $\mathscr{A}$ which take the same number of type parameters as there are type arguments in $\mathit{targs}$ are chosen. It is an error if no such alternative exists. If there are several such alternatives, overloading resolution is applied again to the whole expression `$e$[$\mathit{targs}\,$]`.

Assume finally that $e$ does not appear as a function in either an application or a type application. If an expected type is given, let $\mathscr{B}$ be the set of those alternatives in $\mathscr{A}$ which are compatible to it. Otherwise, let $\mathscr{B}$ be the same as $\mathscr{A}$. We choose in this case the most specific alternative among all alternatives in $\mathscr{B}$. It is an error if there is no alternative in $\mathscr{B}$ which is more specific than all other alternatives in $\mathscr{B}$.

Example

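Consider a class `A` that extends a class `B`, two overloaded definitions `def f(x: B, y: B)` and `def f(x: A, y: B)`, and two values `a: A` and `b: B`.

Then the application `f(b, b)` refers to the first definition of $f$ whereas the application `f(a, a)` refers to the second. Assume now we add a third overloaded definition `def f(x: B, y: A)`.

Then the application `f(a, a)` is rejected for being ambiguous, since no most specific applicable signature exists.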
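The first part of this example can be run as written. In the sketch below the method bodies return a label, a hypothetical addition that makes the chosen alternative observable. The third overload is left as a comment because adding it makes `f(a, a)` fail to compile.

```scala
object OverloadDemo {
  class B
  class A extends B

  def f(x: B, y: B): String = "f(B, B)"
  def f(x: A, y: B): String = "f(A, B)"
  // def f(x: B, y: A): String = "f(B, A)"  // would make f(a, a) ambiguous

  val a: A = new A
  val b: B = new B

  // Only the first alternative is applicable to (B, B).
  val first: String = f(b, b)

  // Both alternatives are applicable to (A, A); f(A, B) is more specific
  // because f(B, B) accepts arguments of types (A, B) but not vice versa.
  val second: String = f(a, a)
}
```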

Local Type Inference

Local type inference infers type arguments to be passed to expressions of polymorphic type. Say $e$ is of type [$a_1$ >: $L_1$ <: $U_1 , \ldots , a_n$ >: $L_n$ <: $U_n$]$T$ and no explicit type parameters are given.

Local type inference converts this expression to a type application `$e$[$T_1 , \ldots , T_n$]`. The choice of the type arguments $T_1 , \ldots , T_n$ depends on the context in which the expression appears and on the expected type $\mathit{pt}$. There are three cases.

Case 1: Selections

If the expression appears as the prefix of a selection with a name $x$, then type inference is deferred to the whole expression $e.x$. That is, if $e.x$ has type $S$, it is now treated as having type [$a_1$ >: $L_1$ <: $U_1 , \ldots , a_n$ >: $L_n$ <: $U_n$]$S$, and local type inference is applied in turn to infer type arguments for $a_1 , \ldots , a_n$, using the context in which $e.x$ appears.

Case 2: Values

If the expression $e$ appears as a value without being applied to value arguments, the type arguments are inferred by solving a constraint system which relates the expression's type $T$ with the expected type $\mathit{pt}$. Without loss of generality we can assume that $T$ is a value type; if it is a method type we apply eta-expansion to convert it to a function type. Solving means finding a substitution $\sigma$ of types $T_i$ for the type parameters $a_i$ such that

  • None of the inferred types $T_i$ is a singleton type
  • All type parameter bounds are respected, i.e. $\sigma L_i <: \sigma a_i$ and $\sigma a_i <: \sigma U_i$ for $i = 1 , \ldots , n$.
  • The expression's type conforms to the expected type, i.e. $\sigma T <: \sigma \mathit{pt}$.

It is a compile time error if no such substitution exists. If several substitutions exist, local-type inference will choose for each type variable $a_i$ a minimal or maximal type $T_i$ of the solution space. A maximal type $T_i$ will be chosen if the type parameter $a_i$ appears contravariantly in the type $T$ of the expression. A minimal type $T_i$ will be chosen in all other situations, i.e. if the variable appears covariantly, non-variantly or not at all in the type $T$. We call such a substitution an optimal solution of the given constraint system for the type $T$.

Case 3: Methods

The last case applies if the expression $e$ appears in an application $e(d_1 , \ldots , d_m)$. In that case $T$ is a method type $(p_1:R_1 , \ldots , p_m:R_m)T'$. Without loss of generality we can assume that the result type $T'$ is a value type; if it is a method type we apply eta-expansion to convert it to a function type. One computes first the types $S_j$ of the argument expressions $d_j$, using two alternative schemes. Each argument expression $d_j$ is typed first with the expected type $R_j$, in which the type parameters $a_1 , \ldots , a_n$ are taken as type constants. If this fails, the argument $d_j$ is typed instead with an expected type $R_j'$ which results from $R_j$ by replacing every type parameter in $a_1 , \ldots , a_n$ with undefined.

In a second step, type arguments are inferred by solving a constraint system which relates the method's type with the expected type $\mathit{pt}$ and the argument types $S_1 , \ldots , S_m$. Solving the constraint system means finding a substitution $\sigma$ of types $T_i$ for the type parameters $a_i$ such that

  • None of the inferred types $T_i$ is a singleton type
  • All type parameter bounds are respected, i.e. $\sigma L_i <: \sigma a_i$ and $\sigma a_i <: \sigma U_i$ for $i = 1 , \ldots , n$.
  • The method's result type $T'$ conforms to the expected type, i.e. $\sigma T' <: \sigma \mathit{pt}$.
  • Each argument type weakly conforms to the corresponding formal parameter type, i.e. $\sigma S_j <:_w \sigma R_j$ for $j = 1 , \ldots , m$.

It is a compile time error if no such substitution exists. If several solutions exist, an optimal one for the type $T'$ is chosen.

All or parts of an expected type $\mathit{pt}$ may be undefined. The rules for conformance are extended to this case by adding the rule that for any type $T$ the following two statements are always true: $\mathit{undefined} <: T$ and $T <: \mathit{undefined}$

It is possible that no minimal or maximal solution for a type variable exists, in which case a compile-time error results. Because $<:$ is a pre-order, it is also possible that a solution set has several optimal solutions for a type. In that case, a Scala compiler is free to pick any one of them.

Example

Consider the two methods `def cons[A](x: A, xs: List[A]): List[A] = x :: xs` and `def nil[B]: List[B] = Nil`, and the definition `val xs = cons(1, nil)`.

The application of `cons` is typed with an undefined expected type. This application is completed by local type inference to `cons[Int](1, nil)`. Here, one uses the following reasoning to infer the type argument `Int` for the type parameter `A`:

First, the argument expressions are typed. The first argument `1` has type `Int` whereas the second argument `nil` is itself polymorphic. One tries to type-check `nil` with an expected type `List[A]`. This leads to the constraint system `List[B?] <: List[A]`, where we have labeled `B?` with a question mark to indicate that it is a variable in the constraint system. Because class `List` is covariant, the optimal solution of this constraint is `B = scala.Nothing`.

In a second step, one solves the following constraint system for the type parameter `A` of `cons`: `Int <: A?`, `List[scala.Nothing] <: List[A?]`, and `List[A?] <: $\mathit{undefined}$`.

The optimal solution of this constraint system is `A = Int`, so `Int` is the type inferred for `A`.

Example

Consider now the definition `val ys = cons("abc", xs)`, where `xs` is defined of type `List[Int]` as before. In this case local type inference proceeds as follows.

First, the argument expressions are typed. The first argument `"abc"` has type `String`. The second argument `xs` is first tried to be typed with expected type `List[A]`. This fails, as `List[Int]` is not a subtype of `List[A]`. Therefore, the second strategy is tried; `xs` is now typed with expected type `List[$\mathit{undefined}$]`. This succeeds and yields the argument type `List[Int]`.

In a second step, one solves the following constraint system for the type parameter `A` of `cons`: `String <: A?`, `List[Int] <: List[A?]`, and `List[A?] <: $\mathit{undefined}$`.

The optimal solution of this constraint system is `A = scala.Any`, so `scala.Any` is the type inferred for `A`.
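Both examples can be tried directly. The sketch below assumes a Scala 2 compiler. The `xsCheck` and `ysCheck` values are hypothetical additions that only confirm the inferred types are compatible with `List[Int]` and `List[Any]`; they do not prove which type was inferred.

```scala
object LocalInferenceDemo {
  def cons[A](x: A, xs: List[A]): List[A] = x :: xs
  def nil[B]: List[B] = Nil

  // No type arguments are written; local type inference supplies them.
  val xs = cons(1, nil)
  val ys = cons("abc", xs)

  // Compile-time compatibility checks against the types derived in the text.
  val xsCheck: List[Int] = xs
  val ysCheck: List[Any] = ys
}
```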

Eta Expansion

Eta-expansion converts an expression of method type to an equivalent expression of function type. It proceeds in two steps.

First, one identifies the maximal sub-expressions of $e$; let's say these are $e_1 , \ldots , e_m$. For each of these, one creates a fresh name $x_i$. Let $e'$ be the expression resulting from replacing every maximal subexpression $e_i$ in $e$ by the corresponding fresh name $x_i$. Second, one creates a fresh name $y_i$ for every argument type $T_i$ of the method ($i = 1 , \ldots , n$). The result of eta-conversion is then: `{ val $x_1$ = $e_1$; $\ldots$ val $x_m$ = $e_m$; ($y_1: T_1 , \ldots , y_n: T_n$) => $e'$($y_1 , \ldots , y_n$) }`.

The behavior of call-by-name parameters is preserved under eta-expansion: the corresponding actual argument expression, a sub-expression of parameterless method type, is not evaluated in the expanded block.
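One observable consequence of the first step is that a side-effecting prefix is evaluated once, when the function value is created, rather than on every call. The sketch below uses hypothetical names (`EtaExpansionDemo`, `makeAdder`, `Adder`) and a counter that exists only to make this visible.

```scala
object EtaExpansionDemo {
  var prefixEvaluations = 0

  class Adder(n: Int) {
    def add(x: Int): Int = x + n
  }

  def makeAdder(): Adder = {
    prefixEvaluations += 1
    new Adder(10)
  }

  // The expected type is a function type, so `makeAdder().add` is eta-expanded.
  // The prefix `makeAdder()` is a maximal sub-expression and is bound to a
  // fresh value before the function literal is built.
  val addTen: Int => Int = makeAdder().add
}
```

Calling `addTen` repeatedly reuses the same `Adder` instance, so the counter should stay at 1.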

Dynamic Member Selection

The standard Scala library defines a trait `scala.Dynamic` which defines a member `applyDynamic` as follows:

Assume a selection of the form $e.x$ where the type of $e$ conforms to `scala.Dynamic`. Further assuming the selection is not followed by any function arguments, such an expression can be rewritten under the conditions given here to: `$e$.applyDynamic("$x$")`

If the selection is followed by some arguments, e.g. $e.x(\mathit{args})$, then that expression is rewritten to `$e$.applyDynamic("$x$", $\mathit{args}$)`
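A sketch of dynamic member selection as it behaves in released compilers (Scala 2.10 and later), which may differ in detail from the rewrite described in this section. There `scala.Dynamic` is a marker trait, a plain selection is rewritten to a user-defined `selectDynamic`, and a selection followed by arguments is rewritten to a curried `applyDynamic`. The `Record` class and its fields are hypothetical.

```scala
import scala.language.dynamics

object DynamicDemo {
  class Record(fields: Map[String, Any]) extends Dynamic {
    // Target of the rewrite for a selection without arguments.
    def selectDynamic(name: String): Any = fields(name)

    // Target of the rewrite for a selection followed by arguments.
    def applyDynamic(name: String)(args: Any*): String =
      name + args.mkString("(", ", ", ")")
  }

  val r = new Record(Map("title" -> "spec"))

  // `title` and `render` are not members of Record; the compiler rewrites
  // these to r.selectDynamic("title") and r.applyDynamic("render")(1, 2).
  val title: Any = r.title
  val call: String = r.render(1, 2)
}
```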