# CS-210 Functional Programming Principles in Scala

## Call-by-name (CBN), call-by-value (CBV)

Let’s say we have the following function, and that we call it in the following way:
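The original listing is missing from these notes; a minimal stand-in in the spirit of the course material (the name `test` is illustrative):

```scala
def test(x: Int, y: Int): Int = x * x

test(3 + 4, 8) // should the arguments be evaluated before the call, or not?
```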

There are two strategies for evaluating such a call: pass the function its arguments unevaluated (CBN), or evaluate the arguments first and then pass the resulting values (CBV).

• CBN and CBV reduce an expression to the same value as long as both evaluations terminate.
• If CBV evaluation of an expression e terminates, then CBN evaluation of e terminates too
• The other direction is not true.

Here’s an example:
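A sketch of the classic counterexample (names are illustrative):

```scala
def loop: Int = loop // reduces to itself: its evaluation never terminates
def first(x: Int, y: Int): Int = x

// Under CBV, first(1, loop) must evaluate loop first and never terminates;
// under CBN, y is never needed, so the call would return 1 immediately.

def firstByName(x: Int, y: => Int): Int = x // => forces CBN for y
firstByName(1, loop) // terminates: y is passed unevaluated and never forced
```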

Scala normally uses CBV, but you can force CBN for a given parameter by prefixing its type with `=>` (e.g. `y: => Int`).

### Value definitions

A `def` right-hand side is evaluated anew on each use (CBN-like), while a `val` right-hand side is evaluated once, at the point of definition (CBV-like).
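A small sketch making the difference observable (the counter is illustrative, not from the course):

```scala
var evaluations = 0
def incr(): Int = { evaluations += 1; 42 }

def d = incr() // right-hand side re-evaluated on every use of d
val v = incr() // right-hand side evaluated once, right here

d
d
// evaluations is now 3: once for v, twice for d
```

In particular, `val x = loop` for a non-terminating `loop` would itself never terminate, whereas `def x = loop` is fine as long as `x` is never used.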

## Blocks and lexical scope

To avoid namespace pollution, we can use nested functions:
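The example here was presumably the course's Newton-method square root, with the helpers nested inside `sqrt` so they don't pollute the outer namespace; a sketch:

```scala
def sqrt(x: Double): Double = {
  def sqrtIter(guess: Double): Double =
    if (isGoodEnough(guess)) guess
    else sqrtIter(improve(guess))

  def isGoodEnough(guess: Double): Boolean =
    math.abs(guess * guess - x) / x < 0.001

  def improve(guess: Double): Double =
    (guess + x / guess) / 2

  sqrtIter(1.0)
} // only sqrt is visible from the outside
```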

This is done using a block, delimited by `{ ... }` braces. The last element of a block is an expression that defines its return value.

The definitions inside a block are only visible from within the block. The block has access to what’s been defined outside of it, but if it redefines an external definition, the new definition shadows the old one inside the block.

## Tail recursion

If a function calls itself as its last action, then the function’s stack frame can be reused. This is called tail recursion. In practice, this means that a tail-recursive function runs in constant stack space and is just as efficient as a loop.

One can require that a function is tail-recursive using a `@tailrec` annotation:
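A sketch of the course's tail-recursive `gcd`:

```scala
import scala.annotation.tailrec

@tailrec
def gcd(a: Int, b: Int): Int =
  if (b == 0) a else gcd(b, a % b) // the recursive call is the last action
```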

An error is issued if `gcd` isn’t tail recursive.

## Higher-Order Functions

Functions that take other functions as parameters or that return functions as results are called higher-order functions, as opposed to first-order functions, which act only on simple data types.
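The running example here is presumably the course's `sum`, which abstracts the summation pattern over a function parameter `f`:

```scala
def sum(f: Int => Int, a: Int, b: Int): Int =
  if (a > b) 0 else f(a) + sum(f, a + 1, b)

def id(x: Int): Int = x
def cube(x: Int): Int = x * x * x

def sumInts(a: Int, b: Int)  = sum(id, a, b)
def sumCubes(a: Int, b: Int) = sum(cube, a, b)
```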

### Anonymous functions

Instead of having to define a `cube` and `id` function in the example above, we can just write an anonymous function as such:
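A sketch with anonymous function literals replacing the named helpers:

```scala
def sum(f: Int => Int, a: Int, b: Int): Int =
  if (a > b) 0 else f(a) + sum(f, a + 1, b)

def sumInts(a: Int, b: Int)  = sum(x => x, a, b)         // anonymous id
def sumCubes(a: Int, b: Int) = sum(x => x * x * x, a, b) // anonymous cube
```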

## Currying

From Wikipedia:

Currying is the technique of translating the evaluation of a function that takes multiple arguments into evaluating a sequence of functions, each with a single argument.

Essentially, with currying we do the following transition: a function of type `(Int, Int) => Int` becomes one of type `Int => (Int => Int)`, so a call `f(a, b)` is written `f(a)(b)`.

Using currying, we can once more improve our `sum` function:
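A plausible reconstruction of the curried `sum`, in both the explicit and the shorthand multiple-parameter-list forms:

```scala
def sum(f: Int => Int): (Int, Int) => Int = {
  def sumF(a: Int, b: Int): Int =
    if (a > b) 0 else f(a) + sumF(a + 1, b)
  sumF
}

// Scala's shorthand for the same thing:
def sum2(f: Int => Int)(a: Int, b: Int): Int =
  if (a > b) 0 else f(a) + sum2(f)(a + 1, b)
```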

Function application associates to the left so `sum(cube)(1, 10)` is equivalent to `(sum(cube))(1, 10)`.

The type of `sum` is `(Int => Int) => (Int, Int) => Int`. This should be read and understood as `(Int => Int) => ((Int, Int) => Int)` as functional types associate to the right.

## Classes: functions and data

In Scala, we use classes to define and create data structures:
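A sketch of the course's `Rational` class:

```scala
class Rational(x: Int, y: Int) {
  def numer = x
  def denom = y
}

val half = new Rational(1, 2)
```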

This introduces two entities:

• A new type named `Rational`
• A constructor `Rational` to create elements of this type

### Methods

One can go further and also package functions operating on a data abstraction into the data abstraction itself. Such functions are called methods.
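For instance, `add` and `toString` packaged inside `Rational` (a sketch along the lines of the course material):

```scala
class Rational(x: Int, y: Int) {
  def numer = x
  def denom = y

  def add(that: Rational): Rational =
    new Rational(numer * that.denom + that.numer * denom,
                 denom * that.denom)

  override def toString = s"$numer/$denom"
}
```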

#### Identifier

The identifier is either alphanumeric (starting with a letter, followed by letters or digits) or symbolic (starting with an operator symbol, followed by other operator symbols) — but not a mix of the two. We can combine them by using an alphanumeric name, an underscore `_`, and then symbols.

Small practical trick: to define a `neg` function that returns the negation of a `Rational`, we can write:
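The trick is presumably to name the method `unary_-`, which lets callers write `-r` instead of `r.neg` (note the space before the `:`, needed so the `-` isn't parsed as part of the type ascription):

```scala
class Rational(x: Int, y: Int) {
  def numer = x
  def denom = y

  def unary_- : Rational = new Rational(-numer, denom)

  override def toString = s"$numer/$denom"
}

val r = new Rational(1, 2)
-r // the negation, -1/2
```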

The precedence of an operator is determined by its first character, in the following priority (from lowest to highest):

• All letters
• `|`
• `^`
• `&`
• `< >`
• `= !`
• `:`
• `+ -`
• `* / %`
• All other symbolic characters

#### Infix notation

Any method with a parameter can be used like an infix operator:
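For instance, with an `add` method on `Rational` (sketched here with `val` fields for brevity):

```scala
class Rational(val numer: Int, val denom: Int) {
  def add(that: Rational) =
    new Rational(numer * that.denom + that.numer * denom, denom * that.denom)
}

val r = new Rational(1, 2)
val s = new Rational(1, 3)
val t = r add s // infix notation: exactly equivalent to r.add(s)
```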

### Constructors

Scala naturally executes the code in the class body as an implicit constructor, but there is a way to explicitly define more constructors if necessary:
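An auxiliary constructor is written as a method named `this`; a sketch treating an integer `n` as the rational `n/1`:

```scala
class Rational(x: Int, y: Int) {
  def numer = x
  def denom = y

  def this(x: Int) = this(x, 1) // auxiliary constructor: n is n/1
}
```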

### Data abstraction

We can improve `Rational` by making it an irreducible fraction using the GCD:
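One possible implementation, simplifying by the GCD on construction:

```scala
class Rational(x: Int, y: Int) {
  private def gcd(a: Int, b: Int): Int =
    if (b == 0) a else gcd(b, a % b)
  private val g = gcd(x, y) // computed once, at construction time

  def numer = x / g
  def denom = y / g
}
```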

There are obviously multiple ways of achieving this; the above code just shows one. The ability to choose different implementations of the data without affecting clients is called data abstraction.

### Assert and require

When calling the constructor, using a denominator of 0 will eventually lead to errors. There are two ways of imposing restrictions on the given constructor arguments:

• `require`, which throws an `IllegalArgumentException` if it fails
• `assert`, which throws an `AssertionError` if it fails

This reflects a difference in intent:

• `require` is used to enforce a precondition on the caller of a function
• `assert` is used to check the code of the function itself
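For instance, guarding the `Rational` constructor with `require` (a sketch):

```scala
class Rational(x: Int, y: Int) {
  require(y != 0, "denominator must be nonzero") // checked on construction
  def numer = x
  def denom = y
}
```

`new Rational(1, 0)` now fails immediately with an `IllegalArgumentException` instead of causing errors later.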

## Class Hierarchies

### Abstract classes

Just like in Java, we can have abstract classes and their implementations:
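A sketch of the course's `IntSet` hierarchy (a binary search tree of integers):

```scala
abstract class IntSet {
  def incl(x: Int): IntSet
  def contains(x: Int): Boolean
}

class Empty extends IntSet {
  def incl(x: Int): IntSet = new NonEmpty(x, new Empty, new Empty)
  def contains(x: Int): Boolean = false
}

class NonEmpty(elem: Int, left: IntSet, right: IntSet) extends IntSet {
  def incl(x: Int): IntSet =
    if (x < elem) new NonEmpty(elem, left incl x, right)
    else if (x > elem) new NonEmpty(elem, left, right incl x)
    else this
  def contains(x: Int): Boolean =
    if (x < elem) left contains x
    else if (x > elem) right contains x
    else true
}
```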

#### Terminology

• `Empty` and `NonEmpty` both extend the class `IntSet`
• The definitions of `incl` and `contains` implement the abstract functions of `IntSet`
• This implies that the types `Empty` and `NonEmpty` conform to the type `IntSet`, and can be used wherever an `IntSet` is required
• `IntSet` is the superclass of `Empty` and `NonEmpty`
• `Empty` and `NonEmpty` are subclasses of `IntSet`
• In Scala, any user-defined class extends another class. By default, if no superclass is given, the superclass is `AnyRef` (an alias of `java.lang.Object`)
• The direct or indirect superclasses are called base classes

#### Override

It is possible to redefine an existing, non-abstract definition in a subclass by using `override`.

Overriding something that isn’t overrideable yields an error.

### Traits

In Scala, a class can only have one superclass. But sometimes we want several supertypes. To do this we can use traits. It’s declared just like an abstract class, but using the keyword `trait`:
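A sketch of a trait; unlike a Java interface of that era, it may contain concrete methods:

```scala
trait Planar {
  def height: Int
  def width: Int
  def surface = height * width // concrete method in a trait
}

class Square extends Planar {
  def height = 3
  def width = 3
}
```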

Classes, objects and traits can inherit from at most one class but arbitrarily many traits.

Traits cannot have value parameters, only classes can.

### Singleton objects

In the `IntSet` example, one could argue that there really only is a single empty `IntSet`, and that it’s overkill to have the user create many instances of `Empty`. Instead we can define a singleton object:
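A sketch, with `Empty` turned into an `object`:

```scala
abstract class IntSet {
  def incl(x: Int): IntSet
  def contains(x: Int): Boolean
}

object Empty extends IntSet { // a singleton: no "new Empty" needed or possible
  def incl(x: Int): IntSet = new NonEmpty(x, Empty, Empty)
  def contains(x: Int): Boolean = false
}

class NonEmpty(elem: Int, left: IntSet, right: IntSet) extends IntSet {
  def incl(x: Int): IntSet =
    if (x < elem) new NonEmpty(elem, left incl x, right)
    else if (x > elem) new NonEmpty(elem, left, right incl x)
    else this
  def contains(x: Int): Boolean =
    if (x < elem) left contains x
    else if (x > elem) right contains x
    else true
}
```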

Singleton objects are values, so `Empty` evaluates to itself.

### Packages and imports

Classes and objects are organized in packages, just like in Java.

One can now call the object using its full qualified name, or with an import:

### Polymorphism

Just like in Java, we may wish to have polymorphic types.

Type parameters can be used in classes, but also in functions.

#### Type inference

The Scala compiler can usually deduce the correct type parameters.

#### Type bounds

We can set the types of parameters as either subtypes or supertypes of something. For instance, a method that takes an `IntSet` and returns it if all elements are positive, or throws an error if not, could be implemented as such:
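A sketch; the `allPos` helper is an illustrative stand-in for whatever traversal the course's `IntSet` provides:

```scala
abstract class IntSet { def allPos: Boolean }
class Empty extends IntSet { def allPos = true }
class NonEmpty(elem: Int, rest: IntSet) extends IntSet {
  def allPos = elem > 0 && rest.allPos
}

// S is constrained to subtypes of IntSet; returning S (not IntSet)
// means the caller loses no type information
def assertAllPos[S <: IntSet](r: S): S =
  if (r.allPos) r else throw new IllegalArgumentException("negative element")
```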

Here, `<: IntSet` is an upper bound of the type parameter `S`. Generally:

• `S <: T` means `S` is a subtype of `T`
• `S >: T` means `S` is a supertype of `T`

It’s also possible to mix a lower bound with an upper bound:
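For instance (with simplified stand-ins for the course's classes):

```scala
abstract class IntSet
class NonEmpty extends IntSet

// S must be a supertype of NonEmpty and a subtype of IntSet
def f[S >: NonEmpty <: IntSet](x: S): S = x
```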

This would restrict `S` to any type on the interval between `NonEmpty` and `IntSet`.

#### Variance

Given `NonEmpty <: IntSet`, is `List[NonEmpty] <: List[IntSet]`? Yes!

Types for which this relationship holds are called covariant because their subtyping relationship varies with the type parameter. This makes sense in situations fitting the Liskov Substitution Principle (loosely paraphrased):

If `A <: B`, then everything one can do with a value of type `B` one should also be able to do with a value of type `A`.

In Scala, for instance, `Array`s are not covariant.

There are in fact 3 types of variance (given `A <: B`):

• `C[A] <: C[B]` means `C` is covariant
• `C[A] >: C[B]` means `C` is contravariant
• Neither `C[A]` nor `C[B]` is a subtype of the other means `C` is nonvariant

Scala lets you declare the variance of a type by annotating the type parameter:
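The annotations are `+`, `-`, or nothing (class names here are illustrative):

```scala
class Immutable[+A]  // covariant:     if A <: B then Immutable[A] <: Immutable[B]
class Mutable[A]     // nonvariant:    no subtyping relation between the two
class Consumer[-A]   // contravariant: if A <: B then Consumer[B] <: Consumer[A]
```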

Functions are contravariant in their argument types, and covariant in their result type. This allows us to state a very useful and important subtyping relation for functions: `A1 => B1 <: A2 => B2` if and only if `A1 >: A2` and `B1 <: B2`.

Note that if only one of the two conditions holds, the two function types are unrelated: neither is a subtype of the other.

The Scala compiler checks that there are no problematic combinations when compiling a class with variance annotations. Roughly:

• Covariant type parameters can only appear in method results
• However, covariant type parameters may appear in lower bounds of method type parameters
• Contravariant type parameters can only appear in method parameters
• However, contravariant type parameters may appear in upper bounds of method type parameters
• Invariant type parameters can appear anywhere

The following code, for instance, is correct as the covariant type parameter is a method result, and the contravariant is a parameter:
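A simplified function trait in the style of Scala's own `Function1` (the name `Fun` is illustrative): the contravariant `A` appears only as a parameter, the covariant `R` only as a result, so the annotations are legal:

```scala
trait Fun[-A, +R] {
  def apply(x: A): R
}
```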

### Object oriented decomposition

Instead of writing external methods that apply to different types of subclasses, we can write the functionality inside the respective classes.

But this is problematic if we need to add lots of methods but not add many classes, as we’ll need to define new methods in all the subclasses. Another limitation of OO decomposition is that some non-local operations cannot be encapsulated in the method of a single object.

In these cases, pattern matching may be a better solution.

### Pattern matching

Pattern matching is a generalization of `switch` from C or Java, to class hierarchies. It’s expressed in Scala using the keyword `match`:
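A sketch of the course's expression evaluator:

```scala
trait Expr
case class Number(n: Int) extends Expr
case class Sum(e1: Expr, e2: Expr) extends Expr

def eval(e: Expr): Int = e match {
  case Number(n)   => n
  case Sum(e1, e2) => eval(e1) + eval(e2)
}
```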

If none of the cases match, a match error exception is thrown.

Patterns are constructed from:

• Constructors, e.g. `Number`, `Sum`
• Variables, e.g. `n`, `e1`, `e2`
• Wildcard patterns `_` (if we don’t care about the argument, we can use `Number(_)`)
• Constants, e.g. `1`, `true` (by convention, constant names start with a capital letter, so the compiler can tell them apart from pattern variables).

These patterns can be stacked, so we may try to match a `Sum(Number(1), Var(x))` for instance. The same variable name can only appear once in a pattern, so `Sum(x, x)` is not a legal pattern.

It’s possible to define the evaluation function as a method of the base trait:
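A sketch, matching on `this` inside the trait:

```scala
trait Expr {
  def eval: Int = this match {
    case Number(n)   => n
    case Sum(e1, e2) => e1.eval + e2.eval
  }
}
case class Number(n: Int) extends Expr
case class Sum(e1: Expr, e2: Expr) extends Expr
```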

Pattern matching is especially useful when what we do is mainly to add methods (not really changing the class hierarchy). Otherwise, if we mainly create sub-classes, then object-oriented decomposition works best.

### Case classes

A case class definition is similar to a normal class definition, except that it is preceded by the modifier `case`. For example:
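A sketch of the course's expression hierarchy as case classes:

```scala
trait Expr
case class Number(n: Int) extends Expr
case class Var(name: String) extends Expr
case class Sum(e1: Expr, e2: Expr) extends Expr
```

Case classes also get structural equality for free: `Number(1) == Number(1)` is `true`.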

Doing this implicitly defines a companion object with an `apply` method.

This way we can just do `Number(1)` instead of `new Number(1)`.

## Lists

There are two important differences between lists and arrays:

• Lists are immutable — the elements of a list cannot be changed.
• Lists are recursive (linked lists), while arrays are flat.

Like arrays, lists are homogeneous: the elements of a list must all have the same type.

### List constructors

A bit of syntactic sugar: you can construct new lists using the construction operation `::` (pronounced cons).

As a convention, operators ending in `:` associate to the right, and are calls on the right-hand operand.
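For instance:

```scala
val nums = 1 :: 2 :: 3 :: Nil       // List(1, 2, 3)
// right-associative: equivalent to
val same = 1 :: (2 :: (3 :: Nil))
// and actually calls on the right-hand operand:
val also = Nil.::(3).::(2).::(1)
```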

### List patterns

It is also possible to decompose lists with pattern matching. Examples:
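A sketch of a few list patterns:

```scala
val xs = List(1, 2, 3)

val described = xs match {
  case Nil        => "the empty list"
  case List(x)    => s"a list of exactly one element: $x"
  case 1 :: rest  => s"starts with 1, rest is $rest"
  case _ :: _     => "some other non-empty list"
}
```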

We can write a very short insertion sort this way (though it runs in O(n²) time):
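A sketch:

```scala
def isort(xs: List[Int]): List[Int] = xs match {
  case Nil     => Nil
  case y :: ys => insert(y, isort(ys))
}

def insert(x: Int, xs: List[Int]): List[Int] = xs match {
  case Nil     => x :: Nil
  case y :: ys => if (x <= y) x :: xs else y :: insert(x, ys)
}
```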

### List methods

#### Sublists and element access

• `xs.length`: The number of elements of `xs`
• `xs.last`: The list’s last element; exception if `xs` is empty
• `xs.init`: A list consisting of all elements of `xs` except the last one; exception if `xs` is empty
• `xs take n`: A list consisting of the first `n` elements of `xs`, or `xs` itself if it is shorter than `n`
• `xs drop n`: The rest of the collection after the first `n` elements
• `xs(n)`: The element of `xs` at index `n`

#### Creating new lists

• `xs ++ ys` or `xs ::: ys`: Concatenation of `xs` and `ys`
• `xs.reverse`: The list containing the elements of `xs` in reversed order
• `xs updated (n, x)`: The list containing the same elements as `xs`, except at index `n` where it contains `x`.

#### Finding elements

• `xs indexOf x`: The index of the first element in `xs` equal to `x`, or `-1` if `x` does not appear in `xs`
• `xs contains x`: same as `(xs indexOf x) >= 0`

### Higher-order list functions

These are functions that work on lists and take another function as argument. The above examples often have similar structures, and we can identify patterns:

• transforming each element in a list in a certain way
• retrieving a list of all elements satisfying a criterion
• combining the elements of a list using an operator

Since Scala is a functional language, we can write generic functions that implement these patterns using higher-order functions.

#### Map

The actual implementation of `map` is a bit more complicated for performance reasons, but follows something along the lines of:
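A sketch (written as a standalone function rather than a method on `List`):

```scala
def map[T, U](xs: List[T], f: T => U): List[U] = xs match {
  case Nil     => Nil
  case y :: ys => f(y) :: map(ys, f)
}
```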

#### Filter

There are a few other methods that extract sublists based on a predicate:

• `xs filterNot p`: Same as `xs filter (x => !p(x))`
• `xs partition p`: Same as `(xs filter p, xs filterNot p)`
• `xs takeWhile p`: The longest prefix of list `xs` consisting of elements that all satisfy the predicate `p`
• `xs dropWhile p`: The remainder of the list `xs` after any leading elements satisfying `p` have been removed
• `xs span p`: Same as `(xs takeWhile p, xs dropWhile p)`

#### Reduce

A reduction of a list combines its elements using a given operator (e.g. summing or multiplying all the elements).

For certain operations, the order matters, and there are therefore different orders in which the reduction can be made.

One such function is `foldLeft`. It goes from left to right and takes an accumulator `z` as an additional parameter, which is returned when `foldLeft` is called on an empty list. For instance
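For instance, sum and product via `foldLeft`:

```scala
def sum(xs: List[Int])     = (xs foldLeft 0)(_ + _) // z = 0, the neutral element of +
def product(xs: List[Int]) = (xs foldLeft 1)(_ * _) // z = 1, the neutral element of *
```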

Note: The `(_ + _)` notation is equivalent to `((x, y) => x + y)`.

`foldLeft` and `reduceLeft` (same as `foldLeft` but without the `z` argument) could be implemented as follows:
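A sketch, again as standalone functions:

```scala
def foldLeft[T, U](xs: List[T])(z: U)(op: (U, T) => U): U = xs match {
  case Nil     => z
  case y :: ys => foldLeft(ys)(op(z, y))(op)
}

def reduceLeft[T](xs: List[T])(op: (T, T) => T): T = xs match {
  case Nil     => throw new UnsupportedOperationException("Nil.reduceLeft")
  case y :: ys => foldLeft(ys)(y)(op) // the head becomes the accumulator
}
```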

`foldRight` and `reduceRight` follow similar implementations but put the parentheses to the right.

## Implicit parameters

If we wanted to generalize an implementation of merge sort to work on more types than just `Int`s, we could rewrite it as such:
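A sketch of the generalized merge sort, parameterized by a comparison function `lt`:

```scala
def msort[T](xs: List[T])(lt: (T, T) => Boolean): List[T] = {
  val n = xs.length / 2
  if (n == 0) xs
  else {
    def merge(xs: List[T], ys: List[T]): List[T] = (xs, ys) match {
      case (Nil, _) => ys
      case (_, Nil) => xs
      case (x :: xs1, y :: ys1) =>
        if (lt(x, y)) x :: merge(xs1, ys)
        else y :: merge(xs, ys1)
    }
    val (fst, snd) = xs splitAt n
    merge(msort(fst)(lt), msort(snd)(lt))
  }
}

msort(List(3, 1, 2))((x, y) => x < y) // List(1, 2, 3)
```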

As a tiny note, it’s usually best to put the function value as the last parameter of a function, because that makes it more likely that the compiler can infer the types of the arguments of the function. E.g. we can write `(x, y) => x < y` instead of `(x: Int, y: Int) => x < y`.

How can we make this code nicer? We can use the `Ordering` type to represent the function, and make it an implicit parameter:
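A sketch of the `Ordering`-based version:

```scala
def msort[T](xs: List[T])(implicit ord: Ordering[T]): List[T] = {
  val n = xs.length / 2
  if (n == 0) xs
  else {
    def merge(xs: List[T], ys: List[T]): List[T] = (xs, ys) match {
      case (Nil, _) => ys
      case (_, Nil) => xs
      case (x :: xs1, y :: ys1) =>
        if (ord.lt(x, y)) x :: merge(xs1, ys)
        else y :: merge(xs, ys1)
    }
    val (fst, snd) = xs splitAt n
    merge(msort(fst), msort(snd)) // ord is passed along implicitly
  }
}
```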

Using the `Ordering[T]` type means using the predefined default ordering, which we don’t even need to supply to `msort`, namely `Ordering.String` and `Ordering.Int`. See notes on ordering in Java.

When you write an implicit parameter, and you don’t write an actual argument that matches that parameter, the compiler will figure out the right implicit to pass, based on the demanded type.

### Rules for implicit parameters

Say that a function takes an implicit parameter of type `T`. The compiler will search for an implicit definition that:

• is marked `implicit`
• has a type compatible with `T`
• is visible at the point of the function call, or is defined in a companion object associated with `T`

If there’s a single (most specific) definition, it will be taken as actual argument for the implicit parameter. Otherwise, it’s an error.

For instance, at the call site of `msort`, the compiler inserts the appropriate `ord` argument automatically.

## Proof techniques

Before we can prove anything, we’ll just assert that pure functional languages have a property called referential transparency, since they don’t have side effects. This means that we can use reduction steps as equalities to some part of a term.

### Structural induction

The principle of structural induction is analogous to natural induction.

To prove a property `P(xs)` for all lists `xs`:

• Base case: Show that `P(Nil)` holds
• Induction step: for a list `xs` and some element `x`, show that if `P(xs)` holds, then `P(x :: xs)` also holds.

Instead of constructing numbers and adding 1, we construct lists from `Nil` and add one element.

#### Example

Let’s show that, for lists `xs`, `ys` and `zs`, `(xs ++ ys) ++ zs = xs ++ (ys ++ zs)`.

We’ll use the two following axioms of `++` to prove this:

1. `Nil ++ ys = ys`
2. `(x :: xs1) ++ ys = x :: (xs1 ++ ys)`

Let’s solve it. First, the base case:
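The base case follows directly from the first axiom:

```
(Nil ++ ys) ++ zs
= ys ++ zs           // by axiom 1
= Nil ++ (ys ++ zs)  // by axiom 1, read right-to-left
```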

Now, onto the induction step:
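Assuming the induction hypothesis `(xs1 ++ ys) ++ zs = xs1 ++ (ys ++ zs)`:

```
((x :: xs1) ++ ys) ++ zs
= (x :: (xs1 ++ ys)) ++ zs   // by axiom 2
= x :: ((xs1 ++ ys) ++ zs)   // by axiom 2
= x :: (xs1 ++ (ys ++ zs))   // by the induction hypothesis
= (x :: xs1) ++ (ys ++ zs)   // by axiom 2, read right-to-left
```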

So this property is established.

## Other collections

All the collections we’ll study are immutable. The collection hierarchy is as follows:

• `Iterable`
• `Seq`
• `List`
• `Vector`
• `Range`
• `Set`
• `Map`

### Sequences

#### Vectors

A `Vector` of up to 32 elements is just an array, but once it grows past that bound, its representation changes; it becomes a `Vector` of 32 pointers to `Vector`s (that follow the same rule once they outgrow 32).

Unlike lists, which are linear (access to the end of the list is slower than the start), random access to a certain element in a vector can be done in time log32(n).

Vectors are fairly good for bulk operations that traverse a sequence, such as a map, fold or filter. Also, 32 is a good number since it corresponds to a cache line.

Vectors are created analogously to lists:
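For instance:

```scala
val nums = Vector(1, 2, 3)
val people = Vector("Bob", "James", "Peter")

// Vectors support the usual list operations, except ::, which becomes:
val more = 0 +: nums :+ 4 // x +: xs prepends, xs :+ x appends
```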

Creating new vectors with the `:+` and `+:` operators works by creating a new leaf node and rebuilding the parent nodes that point to it, leaving the existing nodes untouched. Doing this preserves immutability while still being fairly efficient (log32(n)).

#### Arrays and Strings

They come from Java, so they can’t be subclasses of `Iterable`, but they still work just as if they were subclasses of `Seq`, and we can apply all the same operations.

#### Range

Represents a sequence of evenly spaced integers.
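For instance:

```scala
val r = 1 until 5 // 1, 2, 3, 4 (upper bound exclusive)
val s = 1 to 5    // 1, 2, 3, 4, 5 (inclusive)
1 to 10 by 3      // 1, 4, 7, 10
6 to 1 by -2      // 6, 4, 2
```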

Ranges are represented as three fields: lower bound, upper bound and step value.

### Sets

Sets are another basic abstraction in the Scala collections. It is written analogously to a sequence:
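For instance:

```scala
val fruit = Set("apple", "banana", "pear")
val s = (1 to 6).toSet

s map (_ + 2)                      // Set(3, 4, 5, 6, 7, 8)
fruit filter (_.startsWith("app")) // Set("apple")
s.nonEmpty                         // true
```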

The principal differences between sets and sequences are:

1. Sets are unordered: the elements of a set do not have a predefined order in which they appear in the set.
2. Sets do not have duplicate elements.
3. The fundamental operation on sets is `contains`.

### Maps

Another fundamental collection type is the `map`. A map of type `Map[Key, Value]` is a data structure associating keys with values.

They’re both an `Iterable` and a function, as `Map[Key, Value]` also extends the function type `Key => Value`.

Looking up a key with `get` returns an `Option`: `Some(value)` if the key is present, `None` otherwise. Both `None` and `Some` are subclasses of the `Option` type.

This means that we can do pattern matching, or use the `withDefaultValue`:
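For instance:

```scala
val capitals = Map("US" -> "Washington", "Switzerland" -> "Bern")

capitals("US")         // "Washington"; a missing key would throw NoSuchElementException
capitals get "Andorra" // None
capitals get "US"      // Some("Washington")

// the Option can be handled with pattern matching...
def showCapital(country: String) = capitals get country match {
  case Some(capital) => capital
  case None          => "missing data"
}

// ...or sidestepped entirely with a default:
val capitals1 = capitals withDefaultValue "<unknown>"
capitals1("Andorra") // "<unknown>"
```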

### Operations on iterables

#### Operations on Sequences

• `xs exists p`: `true` if there is an element `x` of `xs` such that `p(x)` holds, `false` otherwise.
• `xs forall p`: `true` if `p(x)` holds for all elements `x` of `xs`, `false` otherwise
• `xs zip ys`: A sequence of pairs drawn from corresponding elements of sequences `xs` and `ys`
• `xs.unzip`: Splits a sequence of pairs `xs` into two sequences consisting of the first and second elements of all pairs
• `xs flatMap f`: Applies collection-valued function `f` to all elements of `xs` and concatenates the results
• `xs.sum`: The sum of all elements of this numeric collection
• `xs.product`: The product of all elements of this numeric collection
• `xs.max`: The maximum of all elements of this numeric collection (an `Ordering` must exist)
• `xs.min`: The minimum of all elements of this numeric collection (an `Ordering` must exist)

A few examples below.
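For instance, a scalar product using `zip`, `map` and `sum`, and a primality test using `forall`:

```scala
def scalarProduct(xs: Vector[Double], ys: Vector[Double]): Double =
  (xs zip ys).map { case (x, y) => x * y }.sum

def isPrime(n: Int): Boolean = n > 1 && ((2 until n) forall (n % _ != 0))
```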

#### Sorted and groupBy

To sort elements, we can use either `sortWith` or `sorted` as below.

`groupBy` is available on Scala collections. It partitions a collection into a map of collections according to a discriminator function `f`.
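For instance:

```scala
val fruit = List("apple", "pear", "orange", "pineapple")

val byLength = fruit sortWith (_.length < _.length) // shortest first
val alpha    = fruit.sorted                         // natural (alphabetical) order
val grouped  = fruit groupBy (_.head)               // keyed by first character
```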

## For-Expressions

Higher order functions and collections in functional languages often replace loops in imperative languages. Programs using many nested loops can therefore often be replaced by a combination of higher order functions.

For example, let’s say we want to find all 1 < i < j < n for which i + j is prime. This would take two loops in an imperative language, but in Scala we can “just” write:
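A sketch of the `flatMap`/`filter` version:

```scala
def isPrime(n: Int): Boolean = n > 1 && ((2 until n) forall (n % _ != 0))

def primePairs(n: Int) =
  (1 until n)
    .flatMap(i => (1 until i).map(j => (i, j)))
    .filter(pair => isPrime(pair._1 + pair._2))
```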

This is hard to read, so we can use a for expression, of the form
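The general form is `for ( s ) yield e`; a small concrete instance (the `Person` class is illustrative):

```scala
case class Person(name: String, age: Int)
val persons = List(Person("Alice", 25), Person("Bob", 17))

for (p <- persons if p.age > 20) yield p.name // List("Alice")
```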

Where `s` is a sequence of generators and filters, and `e` is an expression whose value is returned by an iteration.

Instead of `( s )`, braces `{ s }` can also be used, and then the sequence of generators and filters can be written on multiple lines without requiring semicolons.

Using a for expression, we can rewrite our previous example:
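A sketch of the same query as a for-expression:

```scala
def isPrime(n: Int): Boolean = n > 1 && ((2 until n) forall (n % _ != 0))

def primePairs(n: Int) = for {
  i <- 1 until n
  j <- 1 until i
  if isPrime(i + j)
} yield (i, j)
```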

The rest of these notes correspond to the Functional Program Design in Scala course.

### Querying

Let’s say we want to query the number of authors who have written two or more books.
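A sketch, with a small illustrative book database:

```scala
case class Book(title: String, authors: List[String])

val books: List[Book] = List(
  Book("Structure and Interpretation of Computer Programs",
       List("Abelson, Harold", "Sussman, Gerald J.")),
  Book("Programming in Scala",
       List("Odersky, Martin", "Spoon, Lex", "Venners, Bill")),
  Book("Elements of ML Programming", List("Ullman, Jeffrey")),
  Book("Compilers: Principles, Techniques and Tools",
       List("Aho, Alfred", "Ullman, Jeffrey"))
)

// authors of at least two books
val prolific = (for {
  b1 <- books
  b2 <- books
  if b1.title < b2.title // lexicographic: avoids counting (b2, b1) again
  a1 <- b1.authors
  a2 <- b2.authors
  if a1 == a2
} yield a1).distinct
```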

The first mechanism to prevent duplicates is to compare titles using lexicographical order instead of a simple `!=`. Another trick is to use `.distinct`, which is like a `.toSet`.

### Translation to higher-order functions

The syntax of for is closely related to the higher-order functions `map`, `flatMap`, and `filter`. These functions could be implemented as such:
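A sketch, expressing each of the three as a for-expression:

```scala
def mapFun[T, U](xs: List[T], f: T => U): List[U] =
  for (x <- xs) yield f(x)

def flatMapFun[T, U](xs: List[T], f: T => List[U]): List[U] =
  for (x <- xs; y <- f(x)) yield y

def filterFun[T](xs: List[T], p: T => Boolean): List[T] =
  for (x <- xs if p(x)) yield x
```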

In reality, the translation is done the other way by the compiler. How do we translate for-expressions to these higher-order functions?

Below is the for expression and its translation at the next line.
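A sketch of one such pair (a guard plus a nested generator):

```scala
val xs = List(1, 2, 3)

val a = for (x <- xs if x % 2 == 1; y <- List(x, x)) yield x + y
// translates to:
val b = xs withFilter (x => x % 2 == 1) flatMap (x => List(x, x) map (y => x + y))
```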

See more examples of desugared for-expressions in this gist.

Interestingly, the translation of `for` is not limited to lists, sequences, or collections. Since it’s based solely on the presence of the methods `map`, `flatMap` and `withFilter`, we can simply redefine these methods for our own types.

If, for instance, we were to write a database supporting these methods, then as long as these methods are defined, we can use the `for` syntax for querying the database.

### Functional Random Generators

#### Definition

We could also define these three methods (`map`, `flatMap`, `withFilter`) for a random value generator. Let’s define it as such:
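A sketch of the `Generator` trait, with integer and boolean generators written out by hand:

```scala
trait Generator[+T] {
  def generate: T
}

val integers = new Generator[Int] {
  val rand = new java.util.Random
  def generate = rand.nextInt()
}

val booleans = new Generator[Boolean] {
  def generate = integers.generate > 0
}
```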

But we can streamline this:

Which expands to:

We therefore need to define `map` and `flatMap` on the `Generator` class.
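A sketch:

```scala
trait Generator[+T] { self => // self is an alias for the outer "this"
  def generate: T

  def map[S](f: T => S) = new Generator[S] {
    def generate = f(self.generate)
  }

  def flatMap[S](f: T => Generator[S]) = new Generator[S] {
    def generate = f(self.generate).generate
  }
}
```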

Our example now expands to:

We can also define other types of generators:
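For instance (`floorMod` keeps the result in `[lo, hi)` even for negative random values; the course's version may differ in detail):

```scala
trait Generator[+T] { self =>
  def generate: T
  def map[S](f: T => S) = new Generator[S] { def generate = f(self.generate) }
  def flatMap[S](f: T => Generator[S]) = new Generator[S] { def generate = f(self.generate).generate }
}

val integers = new Generator[Int] {
  val rand = new java.util.Random
  def generate = rand.nextInt()
}

def single[T](x: T) = new Generator[T] { def generate = x }

def choose(lo: Int, hi: Int): Generator[Int] =
  integers map (x => lo + math.floorMod(x, hi - lo))

def oneOf[T](xs: T*): Generator[T] =
  choose(0, xs.length) map (i => xs(i))
```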

#### Usage

Having created a generator, we can use this as a building block for more complex expressions:
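For instance a generator of pairs, written with for-expression syntax (which works because `map` and `flatMap` are defined):

```scala
trait Generator[+T] { self =>
  def generate: T
  def map[S](f: T => S) = new Generator[S] { def generate = f(self.generate) }
  def flatMap[S](f: T => Generator[S]) = new Generator[S] { def generate = f(self.generate).generate }
}

val integers = new Generator[Int] {
  val rand = new java.util.Random
  def generate = rand.nextInt()
}

def pairs[T, U](t: Generator[T], u: Generator[U]): Generator[(T, U)] = for {
  x <- t
  y <- u
} yield (x, y)
```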

#### Application: Random Testing

Generators are especially useful for random testing. Obviously it’s hard to predict the result of any random input without running the program, but what we can do is test postconditions, which are properties of the expected result.

We can use a tool called ScalaCheck to do this in a more automated way. Instead of writing tests, with ScalaCheck we write properties that are assumed to hold. ScalaCheck will then try to find good counter-examples if the assertion fails.

## Monads

### Definition

A monad `M` is a parametric type `M[T]` with two operations, `unit` and `flatMap` (more commonly called `bind` in the literature):
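The signatures, sketched:

```scala
trait M[T] {
  def flatMap[U](f: T => M[U]): M[U]
}
// unit is defined per monad rather than in the trait;
// e.g. for Option, unit(x) = Some(x), and flatMap is built in:
val r = Some(1) flatMap (x => Some(x + 1)) // Some(2)
```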

The `unit` method returns a monad wrapping the given value:

• `List` is a monad with `unit(x) = List(x)`
• `Set` is a monad with `unit(x) = Set(x)`
• `Option` is a monad with `unit(x) = Some(x)`
• `Generator` is a monad with `unit(x) = single(x)`

For every monad, `map` can be defined as a combination of `flatMap` and `unit`. All of the following are equivalent: `m map f`, `m flatMap (x => unit(f(x)))`, and `m flatMap (f andThen unit)`.

These methods have to satisfy some laws:

• Associativity: we can put the parentheses either to the left or the right, so `(m flatMap f) flatMap g == m flatMap(x => f(x) flatMap g)`
• Left unit: `unit(x) flatMap f == f(x)`
• Right unit: `m flatMap unit == m`

### Significance of the laws

Associativity says that one can “inline” nested for-expressions; the following are equivalent:
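A sketch, with an explicitly nested for-expression and its inlined form:

```scala
def f(x: Int) = List(x, x)
def g(x: Int) = List(x + 1)
val m = List(1, 2)

val nested  = for (y <- (for (x <- m; v <- f(x)) yield v); z <- g(y)) yield z
val inlined = for (x <- m; v <- f(x); z <- g(v)) yield z
// associativity of flatMap guarantees nested == inlined
```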

Right unit says `for (x <- m) yield x` is equivalent to just `m`, and left unit isn’t very useful for for-expressions.

## Streams

Sometimes, for performance reasons, we want to avoid computing the tail of a sequence until it is needed for the evaluation result (which might be never). Streams implement this idea while keeping the notation concise. They’re similar to lists, but their tail is evaluated only on demand.

### Definition

Streams can be constructed like most other collections:
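For instance (in newer Scala versions `Stream` has been renamed `LazyList`):

```scala
val xs = Stream.cons(1, Stream.cons(2, Stream.empty))
val ys = Stream(1, 2, 3)
val zs = (1 to 1000).toStream
```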

`.toStream` can be applied to any collection.

Streams can be described as partially constructed lists, and they support almost all of the `List` methods. For instance, to find the second prime number between 1000 and 10000, we can do:
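A sketch; only the elements actually needed are ever tested for primality:

```scala
def isPrime(n: Int): Boolean = n > 1 && ((2 until n) forall (n % _ != 0))

val secondPrime = ((1000 to 10000).toStream filter isPrime)(1)
```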

The only exception is the cons operator, which is `#::` instead of `::`. It can be used in expressions but also in patterns.

### Implementation

Again, this is pretty close to lists:
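A sketch (named `MyStream` here to avoid clashing with the standard library):

```scala
trait MyStream[+A] {
  def isEmpty: Boolean
  def head: A
  def tail: MyStream[A]
}

object MyStream {
  def cons[T](hd: T, tl: => MyStream[T]) = new MyStream[T] {
    def isEmpty = false
    def head = hd
    def tail = tl // tl is a by-name (CBN) parameter: evaluated only on demand
  }
  val empty: MyStream[Nothing] = new MyStream[Nothing] {
    def isEmpty = true
    def head = throw new NoSuchElementException("empty.head")
    def tail = throw new NoSuchElementException("empty.tail")
  }
}
```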

All other methods can be defined in terms of these three. The actual implementation of streams is in the `Stream` companion object, so if we want to define a new type of `Stream`, we just need to redefine these three methods.

Notice how the `cons` method uses CBN. This is what makes the whole drastic difference between `List` and `Stream`!

The other stream methods are implemented analogously to their list counterparts:

### Lazy Evaluation

The proposed implementation suffers from a serious potential performance problem: if `tail` is called several times, the corresponding stream will be recomputed each time. To avoid this, we can store the result of the first evaluation of `tail` and re-use the stored result next time.

This is called lazy evaluation (as opposed to by-name evaluation where everything is recomputed, and strict evaluation for normal parameters and `val` definitions). Scala uses strict evaluation by default, but allows lazy evaluation:
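A small sketch of `lazy val` (the counter is illustrative, added to make the single evaluation observable):

```scala
var count = 0
lazy val x = { count += 1; 42 } // lazy: evaluated on first use only

assert(count == 0) // nothing computed yet
val y = x + x
assert(count == 1) // computed exactly once, then cached
```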

`x` is computed only once, when it is needed the first time; since functional programming expressions yield the same result on each call, the result is saved and reused next time.

This means that using a lazy value for `tail`, `Stream.cons` can be implemented more efficiently:
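A sketch; only `tail` changes compared to the earlier `cons` (again with an illustrative counter to make the caching observable):

```scala
trait MyStream[+A] {
  def isEmpty: Boolean
  def head: A
  def tail: MyStream[A]
}

var evals = 0
def cons[T](hd: T, tl: => MyStream[T]) = new MyStream[T] {
  def isEmpty = false
  def head = hd
  lazy val tail = { evals += 1; tl } // evaluated at most once, then cached
}

val empty: MyStream[Nothing] = new MyStream[Nothing] {
  def isEmpty = true
  def head = throw new NoSuchElementException("empty.head")
  def tail = throw new NoSuchElementException("empty.tail")
}

val s = cons(1, cons(2, empty))
s.tail // forces the tail once
s.tail // cached: no recomputation
```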

### Infinite Streams

Infinite streams benefit from laziness. All elements of a stream except the first one are computed only when needed. This opens up the possibility to define infinite streams.

We also don’t need to worry too much about infinite recursions with infinite streams since the tail isn’t evaluated:
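For instance, the stream of all natural numbers from `n`:

```scala
def from(n: Int): Stream[Int] = n #:: from(n + 1) // no infinite recursion:
                                                  // the tail stays unevaluated
val naturals = from(0)
val evens = naturals map (_ * 2)

naturals.take(3).toList // only three elements are ever computed
```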

## Functions and State

So far we’ve seen that rewriting can be done anywhere in a term, and all rewritings which terminate lead to the same solution. For instance:
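A small sketch:

```scala
def f(x: Int): Int = x * x

// f(1 + 2) can be rewritten as f(3) and then 3 * 3,
// or as (1 + 2) * (1 + 2), then 3 * (1 + 2), then 3 * 3:
// every terminating rewriting order yields the same value
val result = f(1 + 2)
```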

There are multiple ways to rewrite our way to the solution; this is known as the Church-Rosser Theorem of lambda-calculus.

In this chapter, we’ll look at code that doesn’t satisfy that property. We will say goodbye to the substitution model for code that isn’t purely functional.

### Stateful Objects

An object has a state if its behavior is influenced by its history. It is mutable (while everything so far has been immutable).

Mutable states are defined using the `var` keyword (instead of `val`), and assigned with `=`:
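For instance, plus a sketch of the course's stateful `BankAccount`:

```scala
var x = 5 // mutable definition
x = x + 1 // assignment

class BankAccount {
  private var balance = 0
  def deposit(amount: Int): Unit =
    if (amount > 0) balance += amount
  def withdraw(amount: Int): Int =
    if (0 < amount && amount <= balance) { balance -= amount; balance }
    else throw new Error("insufficient funds")
}
```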

If we define an object with stateful variables, then it is a stateful object if the result of calling a method depends on the history of previously called methods, i.e. the result may change over time.

### Identity

Mutable state introduces questions about equality, identity between two objects.

With immutable values (`val`), we had referential transparency; `val x = E; val y = E` was equivalent to `val x = E; val y = x`. This is no longer the case.

If `BankAccount` is a stateful object (its balance may change), then `val x = new BankAccount` and `val y = new BankAccount` aren’t the same. This makes sense, because modifying `x` doesn’t modify `y`; we therefore have two different accounts.

In general, to determine equality, we must first specify what is meant by “being the same”. The precise meaning is defined by the property of operational equivalence: informally, `x` and `y` are operationally equivalent if no possible test can distinguish between them. For any arbitrary function `f`, `f(x, y)` and `f(x, x)` must return the same value.

### Loops

For-loops look similar to for-expressions, but are translated to `foreach` instead of `map` and `flatMap`:
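For instance:

```scala
for (i <- 1 until 3; ch <- "abc") print(s"$i$ch ")
// which the compiler translates to:
// (1 until 3) foreach (i => "abc" foreach (ch => print(s"$i$ch ")))
```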

This should print “1a 1b 1c 2a 2b 2c”

## Lisp

I don’t have a whole lot of notes on this, since most of Lisp was seen during lab sessions, and my notes on lambda-calculus are on paper (it wouldn’t have been easy typing it in real time). But for future reference, I’m adding a syntax list of the Lisp dialect seen in class:

• `(if c a b)`: special form which evaluates `c`, and then `a` if `c != 0` and `b` if `c = 0`.
• `(cond (c1 r1) ... (cn rn) (else relse))`: special form which evaluates `c1`, then `r1` if `c1` is true, or else continues with the other clauses.
• `(cons first rest)`: constructs a list equivalent to Scala’s `x :: xs`. In our interpreter, `xs` must be a list.
• `(car lst)`: returns the head of a given list.
• `(cdr lst)`: returns the tail of a given list
• `(quote x)`: returns x as a quoted expression, i.e. `(quote foo)` returns the quoted symbol `foo`, and `(quote (a b c))` returns the list equivalent to `(cons (quote a) (cons (quote b) (cons (quote c) nil)))`
• `(= a b)`: returns whether `a` and `b` are equal. In our interpreter, a and b may be numbers, symbols or even lists.
• `(lambda (p1 ... pn) body)`: creates an anonymous function.
• `def f x`: creates a definition.
• `def (f p1 ... pn) body`: syntactic sugar for defining a named function.