Consider the following program that finds the second prime number between 1000 and 10000:
((1000 to 10000) filter isPrime)(1)This is much shorter than the recursive alternative:
def nthPrime(from: Int, to: Int, n: Int): Int =
if (from >= to) throw new Error("no prime")
else if (isPrime(from))
if (n == 1) from else nthPrime(from + 1, to, n - 1)
else nthPrime(from + 1, to, n)def secondPrime(from: Int, to: Int) = nthPrime(from, to, 2)
But from a standpoint of performance, the first version is pretty bad; it constructs
all prime numbers between 1000 and 10000 in a list, but only ever looks at
the first two elements of that list.
Reducing the upper bound would speed things up, but risks that we miss the second prime number all together.
However, we can make the short-code efficient by using a trick:
This idea is implemented in a new class, the LazyList.
LazyLists are similar to lists, but their elements are evaluated only on demand.
LazyLists are defined from a constructor LazyList.cons.
For instance,
val xs = LazyList.cons(1, LazyList.cons(2, LazyList.empty))Let's try to write a function that returns a LazyList representing a range of numbers
between lo and hi:
def llRange(lo: Int, hi: Int): LazyList[Int] =
if (lo >= hi) LazyList.empty
else LazyList.cons(lo, llRange(lo + 1, hi))Compare to the same function that produces a list:
def listRange(lo: Int, hi: Int): List[Int] =
if (lo >= hi) Nil
else lo :: listRange(lo + 1, hi)The functions have almost identical structure yet they evaluate quite differently.
listRange(start, end) will produce a list with end - start elements and return it.llRange(start, end) returns a single object of type LazyList with start as head element.tail on the stream.LazyList supports almost all methods of List.
For instance, to find the second prime number between 1000 and 10000:
(llRange(1000, 10000) filter isPrime)(1)The one major exception is ::.
x :: xs always produces a list, never a lazy list.
There is however an alternative operator #:: which produces a lazy list.
x #:: xs == LazyList.cons(x, xs)#:: can be used in expressions as well as patterns.
The implementation of lazy lists is quite close to the one of lists.
Here's the class LazyList:
final class LazyList[+A] ... extends ... {
override def isEmpty: Boolean = ...
override def head: A = ...
override def tail: LazyList[A] = ...
...
}As for lists, all other methods can be defined in terms of these three.
Concrete implementations of streams are defined in the LazyList.State companion object.
Here's a first draft:
private object State {
object Empty extends State[Nothing] {
def head: Nothing = throw new NoSuchElementException("head of empty lazy list")
def tail: LazyList[Nothing] = throw new UnsupportedOperationException("tail of empty lazy list")
}
final class Cons[A](val head: A, val tail: LazyList[A]) extends State[A]
}The only important difference between the implementations of List and LazyList
concern tail, the second parameter of LazyList.cons.
For lazy lists, this is a by-name parameter: the type of tail starts with =>. In such
a case, this parameter is evaluated by following the rules of the call-by-name model.
That's why the second argument to LazyList.cons is not evaluated at the point of call.
Instead, it will be evaluated each time someone calls tail on a LazyList object.
In Scala 2.13, LazyList (previously Stream) became fully lazy from head to tail. To make it possible,
methods (filter, flatMap...) are implemented in a way where the head is not being evaluated if is
not explicitly indicated.
For instance, here's filter:
object LazyList extends SeqFactory[LazyList] {
...
private def filterImpl[A](ll: LazyList[A], p: A => Boolean, isFlipped: Boolean): LazyList[A] = {
// DO NOT REFERENCE `ll` ANYWHERE ELSE, OR IT WILL LEAK THE HEAD
var restRef = ll // val restRef = new ObjectRef(ll)
newLL {
var elem: A = null.asInstanceOf[A]
var found = false
var rest = restRef // var rest = restRef.elem
while (!found && !rest.isEmpty) {
elem = rest.head
found = p(elem) != isFlipped
rest = rest.tail
restRef = rest // restRef.elem = rest
}
if (found) sCons(elem, filterImpl(rest, p, isFlipped)) else State.Empty
}
}Consider the following modification of llRange. When you write
llRange(1, 10).take(3).toList what is the value of rec?
Be careful, head is evaluating too!
var rec = 0
def llRange(lo: Int, hi: Int): LazyList[Int] = {
rec = rec + 1
if (lo >= hi) LazyList.empty
else LazyList.cons(lo, llRange(lo + 1, hi))
}
llRange(1, 10).take(3).toList
rec shouldBe res0
The proposed LazyList implementation suffers from a serious potential performance
problem: If tail is called several times, the corresponding stream
will be recomputed each time.
This problem can be avoided by storing the result of the first
evaluation of tail and re-using the stored result instead of recomputing tail.
This optimization is sound, since in a purely functional language an expression produces the same result each time it is evaluated.
We call this scheme lazy evaluation (as opposed to by-name evaluation in
the case where everything is recomputed, and strict evaluation for normal
parameters and val definitions.)
Haskell is a functional programming language that uses lazy evaluation by default.
Scala uses strict evaluation by default, but allows lazy evaluation of value definitions
with the lazy val form:
lazy val x = exprval builder = new StringBuilder
val x = { builder += 'x'; 1 }
lazy val y = { builder += 'y'; 2 }
def z = { builder += 'z'; 3 }
z + y + x + z + y + x
builder.result() shouldBe res0