title | layout | chapter |
---|---|---|
Pattern Matching |
default |
8 |
Pattern ::= Pattern1 { ‘|’ Pattern1 }
Pattern1 ::= boundvarid ‘:’ TypePat
| ‘_’ ‘:’ TypePat
| Pattern2
Pattern2 ::= id [‘@’ Pattern3]
| Pattern3
Pattern3 ::= SimplePattern
| SimplePattern {id [nl] SimplePattern}
SimplePattern ::= ‘_’
| varid
| Literal
| StableId
| StableId ‘(’ [Patterns] ‘)’
| StableId ‘(’ [Patterns ‘,’] [id ‘@’] ‘_’ ‘*’ ‘)’
| ‘(’ [Patterns] ‘)’
| XmlPattern
Patterns ::= Pattern {‘,’ Patterns}
A pattern is built from constants, constructors, variables and type tests. Pattern matching tests whether a given value (or sequence of values) has the shape defined by a pattern, and, if it does, binds the variables in the pattern to the corresponding components of the value (or sequence of values). The same variable name may not be bound more than once in a pattern.
Some examples of patterns are:
- The pattern
ex: IOException
matches all instances of classIOException
, binding variableex
to the instance. - The pattern
Some(x)
matches values of the formSome($v$)
, bindingx
to the argument value$v$ of theSome
constructor. - The pattern
(x, _)
matches pairs of values, bindingx
to the first component of the pair. The second component is matched with a wildcard pattern. - The pattern
x :: y :: xs
matches lists of length$\geq 2$ , bindingx
to the list's first element,y
to the list's second element, andxs
to the remainder. - The pattern
1 | 2 | 3
matches the integers between 1 and 3.
Pattern matching is always done in a context which supplies an expected type of the pattern. We distinguish the following kinds of patterns.
SimplePattern ::= ‘_’
| varid
A variable pattern _
which is treated as if it was a fresh variable on each occurrence.
Pattern1 ::= varid ‘:’ TypePat
| ‘_’ ‘:’ TypePat
A typed pattern
Pattern2 ::= varid ‘@’ Pattern3
A pattern binder $x$@$p$
consists of a pattern variable
SimplePattern ::= Literal
A literal pattern ==
) to the literal
SimplePattern ::= StableId
A stable identifier pattern is a stable identifier $r$ == $v$
(see here).
To resolve the syntactic overlap with a variable pattern, a stable identifier pattern may not be a simple name starting with a lower-case letter. However, it is possible to enclose such a variable name in backquotes; then it is treated as a stable identifier pattern.
Consider the following function definition:
def f(x: Int, y: Int) = x match {
case y => ...
}
Here, y
is a variable pattern, which matches any value.
If we wanted to turn the pattern into a stable identifier pattern, this
can be achieved as follows:
def f(x: Int, y: Int) = x match {
case `y` => ...
}
Now, the pattern matches the y
parameter of the enclosing function f
.
That is, the match succeeds only if the x
argument and the y
argument of f
are equal.
SimplePattern ::= StableId ‘(’ [Patterns] ‘)’
A constructor pattern is of the form
A special case arises when
SimplePattern ::= ‘(’ [Patterns] ‘)’
A tuple pattern ($p_1 , \ldots , p_n$)
is an alias
for the constructor pattern scala.Tuple$n$($p_1 , \ldots , p_n$)
,
where ()
is the unique value of type scala.Unit
.
SimplePattern ::= StableId ‘(’ [Patterns] ‘)’
An extractor pattern unapply
or unapplySeq
that matches
the pattern.
An unapply
method in an object
-
$n=0$ andunapply
's result type isBoolean
. In this case the extractor pattern matches all values$v$ for which$x$.unapply($v$)
yieldstrue
. -
$n=1$ andunapply
's result type isOption[$T$]
, for some type$T$ . In this case, the (only) argument pattern$p_1$ is typed in turn with expected type$T$ . The extractor pattern matches then all values$v$ for which$x$.unapply($v$)
yields a value of formSome($v_1$)
, and$p_1$ matches$v_1$ . -
$n>1$ andunapply
's result type isOption[($T_1 , \ldots , T_n$)]
, for some types$T_1 , \ldots , T_n$ . In this case, the argument patterns$p_1 , \ldots , p_n$ are typed in turn with expected types$T_1 , \ldots , T_n$ . The extractor pattern matches then all values$v$ for which$x$.unapply($v$)
yields a value of formSome(($v_1 , \ldots , v_n$))
, and each pattern$p_i$ matches the corresponding value$v_i$ .
An unapplySeq
method in an object Option[($T_1 , \ldots , T_m$, Seq[S])]
(if m = 0
, the type Option[Seq[S]]
is also accepted).
This case is further discussed below.
The Predef
object contains a definition of an
extractor object Pair
:
object Pair {
def apply[A, B](x: A, y: B) = Tuple2(x, y)
def unapply[A, B](x: Tuple2[A, B]): Option[Tuple2[A, B]] = Some(x)
}
This means that the name Pair
can be used in place of Tuple2
for tuple
formation as well as for deconstruction of tuples in patterns.
Hence, the following is possible:
val x = (1, 2)
val y = x match {
case Pair(i, s) => Pair(s + i, i * i)
}
SimplePattern ::= StableId ‘(’ [Patterns ‘,’] [varid ‘@’] ‘_’ ‘*’ ‘)’
A pattern sequence S*
.
Second, in an extractor pattern unapply
method,
but it does define an unapplySeq
method with a result type conforming to Option[(T_1, ... , T_m, Seq[S])]
(if m = 0
, the type Option[Seq[S]]
is also accepted). The expected type for the patterns
The last pattern in a pattern sequence may be a sequence wildcard _*
.
Each element pattern
Pattern3 ::= SimplePattern {id [nl] SimplePattern}
An infix operation pattern
An infix operation pattern
Pattern ::= Pattern1 { ‘|’ Pattern1 }
A pattern alternative $p_1$ | $\ldots$ | $p_n$
consists of a number of alternative patterns
XML patterns are treated here.
Regular expression patterns have been discontinued in Scala from version 2.0.
Later version of Scala provide a much simplified version of regular
expression patterns that cover most scenarios of non-text sequence
processing. A sequence pattern is a pattern that stands in a
position where either (1) a pattern of a type T
which is
conforming to
Seq[A]
for some A
is expected, or (2) a case
class constructor that has an iterated formal parameter
A*
. A wildcard star pattern _*
in the
rightmost position stands for arbitrary long sequences. It can be
bound to variables using @
, as usual, in which case the variable will have the
type Seq[A]
.
A pattern
-
$p$ is a variable pattern, -
$p$ is a typed pattern$x: T'$ , and$T <: T'$ , -
$p$ is a constructor pattern$c(p_1 , \ldots , p_n)$ , the type$T$ is an instance of class$c$ , the primary constructor of type$T$ has argument types$T_1 , \ldots , T_n$ , and each$p_i$ is irrefutable for$T_i$ .
TypePat ::= Type
Type patterns consist of types, type variables, and wildcards.
A type pattern
-
A reference to a class
$C$ ,$p.C$ , or$T$#$C$
. This type pattern matches any non-null instance of the given class. Note that the prefix of the class, if it exists, is relevant for determining class instances. For instance, the pattern$p.C$ matches only instances of classes$C$ which were created with the path$p$ as prefix. This also applies to prefixes which are not given syntactically. For example, if$C$ refers to a class defined in the nearest enclosing class and is thus equivalent to$this.C$ , it is considered to have a prefix.The bottom types
scala.Nothing
andscala.Null
cannot be used as type patterns, because they would match nothing in any case. -
A singleton type
$p$.type
. This type pattern matches only the value denoted by the path$p$ (that is, a pattern match involved a comparison of the matched value with$p$ using methodeq
in classAnyRef
). -
A compound type pattern
$T_1$ with $\ldots$ with $T_n$
where each$T_i$ is a type pattern. This type pattern matches all values that are matched by each of the type patterns$T_i$ . -
A parameterized type pattern
$T[a_1 , \ldots , a_n]$ , where the$a_i$ are type variable patterns or wildcards_
. This type pattern matches all values which match$T$ for some arbitrary instantiation of the type variables and wildcards. The bounds or alias type of these type variable are determined as described here. -
A parameterized type pattern
scala.Array$[T_1]$
, where$T_1$ is a type pattern. This type pattern matches any non-null instance of typescala.Array$[U_1]$
, where$U_1$ is a type matched by$T_1$ .
Types which are not of one of the forms described above are also accepted as type patterns. However, such type patterns will be translated to their erasure. The Scala compiler will issue an "unchecked" warning for these patterns to flag the possible loss of type-safety.
A type variable pattern is a simple identifier which starts with a lower case letter.
Type parameter inference is the process of finding bounds for the bound type variables in a typed pattern or constructor pattern. Inference takes into account the expected type of the pattern.
Assume a typed pattern
Type parameter inference constructs first a set of subtype constraints over
the type variables
where
The set
If there exists a substitution
Otherwise, if
The final step consists in choosing type bounds for the type variables which imply the established constraint system. The process is different for the two cases above.
We take
We take
In both cases, local type inference is permitted to limit the complexity of inferred bounds. Minimality and maximality of types have to be understood relative to the set of types of acceptable complexity.
Assume a constructor pattern (_: $C[a_1 , \ldots , a_n]$)
.
Consider the program fragment:
val x: Any
x match {
case y: List[a] => ...
}
Here, the type pattern List[a]
is matched against the
expected type Any
. The pattern binds the type variable
a
. Since List[a]
conforms to Any
for every type argument, there are no constraints on a
.
Hence, a
is introduced as an abstract type with no
bounds. The scope of a
is right-hand side of its case clause.
On the other hand, if x
is declared as
val x: List[List[String]],
this generates the constraint
List[a] <: List[List[String]]
, which simplifies to
a <: List[String]
, because List
is covariant. Hence,
a
is introduced with upper bound
List[String]
.
Consider the program fragment:
val x: Any
x match {
case y: List[String] => ...
}
Scala does not maintain information about type arguments at run-time,
so there is no way to check that x
is a list of strings.
Instead, the Scala compiler will erase the
pattern to List[_]
; that is, it will only test whether the
top-level runtime-class of the value x
conforms to
List
, and the pattern match will succeed if it does. This
might lead to a class cast exception later on, in the case where the
list x
contains elements other than strings. The Scala
compiler will flag this potential loss of type-safety with an
"unchecked" warning message.
Consider the program fragment
class Term[A]
class Number(val n: Int) extends Term[Int]
def f[B](t: Term[B]): B = t match {
case y: Number => y.n
}
The expected type of the pattern y: Number
is
Term[B]
. The type Number
does not conform to
Term[B]
; hence Case 2 of the rules above
applies. This means that B
is treated as another type
variable for which subtype constraints are inferred. In our case the
applicable constraint is Number <: Term[B]
, which
entails B = Int
. Hence, B
is treated in
the case clause as an abstract type with lower and upper bound
Int
. Therefore, the right hand side of the case clause,
y.n
, of type Int
, is found to conform to the
function's declared result type, Number
.
Expr ::= PostfixExpr ‘match’ ‘{’ CaseClauses ‘}’
CaseClauses ::= CaseClause {CaseClause}
CaseClause ::= ‘case’ Pattern [Guard] ‘=>’ Block
A pattern matching expression
e match { case $p_1$ => $b_1$ $\ldots$ case $p_n$ => $b_n$ }
consists of a selector expression if $e$
where
Let
If no such bounds can be found, a compile time error results. If such
bounds are found, the pattern matching clause starting with
The expected type of every block
When applying a pattern matching expression to a selector value,
patterns are tried in sequence until one is found which matches the
selector value. Say this case is case $p_i \Rightarrow b_i$
.
The result of the whole expression is the result of evaluating scala.MatchError
exception is thrown.
The pattern in a case may also be followed by a guard suffix
if e
with a boolean expression true
, the pattern match succeeds as
normal. If the guard expression evaluates to false
, the pattern
in the case is considered not to match and the search for a matching
pattern continues.
In the interest of efficiency the evaluation of a pattern matching expression may try patterns in some other order than textual sequence. This might affect evaluation through side effects in guards. However, it is guaranteed that a guard expression is evaluated only if the pattern it guards matches.
If the selector of a pattern match is an instance of a
sealed
class,
the compilation of pattern matching can emit warnings which diagnose
that a given set of patterns is not exhaustive, i.e. that there is a
possibility of a MatchError
being raised at run-time.
Consider the following definitions of arithmetic terms:
abstract class Term[T]
case class Lit(x: Int) extends Term[Int]
case class Succ(t: Term[Int]) extends Term[Int]
case class IsZero(t: Term[Int]) extends Term[Boolean]
case class If[T](c: Term[Boolean],
t1: Term[T],
t2: Term[T]) extends Term[T]
There are terms to represent numeric literals, incrementation, a zero
test, and a conditional. Every term carries as a type parameter the
type of the expression it represents (either Int
or Boolean
).
A type-safe evaluator for such terms can be written as follows.
def eval[T](t: Term[T]): T = t match {
case Lit(n) => n
case Succ(u) => eval(u) + 1
case IsZero(u) => eval(u) == 0
case If(c, u1, u2) => eval(if (eval(c)) u1 else u2)
}
Note that the evaluator makes crucial use of the fact that type parameters of enclosing methods can acquire new bounds through pattern matching.
For instance, the type of the pattern in the second case,
Succ(u)
, is Int
. It conforms to the selector type
T
only if we assume an upper and lower bound of Int
for T
.
Under the assumption Int <: T <: Int
we can also
verify that the type right hand side of the second case, Int
conforms to its expected type, T
.
BlockExpr ::= ‘{’ CaseClauses ‘}’
An anonymous function can be defined by a sequence of cases
{ case $p_1$ => $b_1$ $\ldots$ case $p_n$ => $b_n$ }
which appear as an expression without a prior match
. The
expected type of such an expression must in part be defined. It must
be either scala.Function$k$[$S_1 , \ldots , S_k$, $R$]
for some scala.PartialFunction[$S_1$, $R$]
, where the
argument type(s)
If the expected type is SAM-convertible
to scala.Function$k$[$S_1 , \ldots , S_k$, $R$]
,
the expression is taken to be equivalent to the anonymous function:
($x_1: S_1 , \ldots , x_k: S_k$) => ($x_1 , \ldots , x_k$) match {
case $p_1$ => $b_1$ $\ldots$ case $p_n$ => $b_n$
}
Here, each
new scala.Function$k$[$S_1 , \ldots , S_k$, $T$] {
def apply($x_1: S_1 , \ldots , x_k: S_k$): $T$ = ($x_1 , \ldots , x_k$) match {
case $p_1$ => $b_1$ $\ldots$ case $p_n$ => $b_n$
}
}
If the expected type is scala.PartialFunction[$S$, $R$]
,
the expression is taken to be equivalent to the following instance creation expression:
new scala.PartialFunction[$S$, $T$] {
def apply($x$: $S$): $T$ = x match {
case $p_1$ => $b_1$ $\ldots$ case $p_n$ => $b_n$
}
def isDefinedAt($x$: $S$): Boolean = {
case $p_1$ => true $\ldots$ case $p_n$ => true
case _ => false
}
}
Here, isDefinedAt
method is omitted if one of the patterns
Here is a method which uses a fold-left operation
/:
to compute the scalar product of
two vectors:
def scalarProduct(xs: Array[Double], ys: Array[Double]) =
(0.0 /: (xs zip ys)) {
case (a, (b, c)) => a + b * c
}
The case clauses in this code are equivalent to the following anonymous function:
(x, y) => (x, y) match {
case (a, (b, c)) => a + b * c
}