The following set of sections represent the exercises contained in the book "Functional Programming in Scala", written by Paul Chiusano and Rúnar Bjarnason and published by Manning. This content library is meant to be used in tandem with the book. We use the same numeration for the exercises for you to follow them.
For more information about "Functional Programming in Scala" please visit its official website.
Exercise 9.1
First let's implement map2
in our Parser
type:
def map2[A, B, C](p: Parser[A], p2: Parser[B])(f: (A, B) => C): Parser[C] =
map(product(p, p2))(f.tupled)
With map2
in our set of combinators, we can implement many1
:
def many1[A](p: Parser[A]): Parser[List[A]] =
map2(p, many(p))(_ :: _)
val parser = run(
many1(char('a')))(_)
parser(res0) shouldBe Right(List('a'))
parser(res1) shouldBe Right(List('a', 'a', 'a'))
Exercise 9.2
Regarding laws that specify the behavior of product
(**
), let's start by stating that product
is associative.
These two expressions are "roughly" equal:
(a ** b) ** c
a ** (b ** c)
The only difference is how the pairs are nested. The (a ** b) ** c
parser returns an ((A,B), C)
, whereas the
a ** (b ** c)
returns an (A, (B,C))
. We can define functions unbiasL
and unbiasR
to convert these nested
tuples to flat 3-tuples:
def unbiasL[A, B, C](p: ((A, B), C)): (A, B, C) = (p._1._1, p._1._2, p._2)
def unbiasR[A, B, C](p: (A, (B, C))): (A, B, C) = (p._1, p._2._1, p._2._2)
With these, we can now state the associativity property:
(a ** b) ** c map (unbiasL) == a ** (b ** c) map (unbiasR)
We'll sometimes just use ~=
when there is an obvious bijection between the two sides:
(a ** b) ** c ~= a ** (b ** c)
map
and product
also have an interesting relationship--we can map
either before or after taking the product
of two parsers, without affecting the behavior:
a.map(f) ** b.map(g) == (a ** b) map { case (a, b) => (f(a), g(b)) }
For instance, if a
and b
were both Parser[String]
, and f
and g
both computed the length of a string, it
doesn't matter if we map over the result of a
to compute its length, or whether we do that after the product.
For more discussion of these laws, take a look at chapter 12 of "Functional Programming in Scala".
Exercise 9.3
Let's try to define many
via or
, map2
, and succeed
:
def many[A](p: Parser[A]): Parser[List[A]] =
map2(p, many(p))(_ :: _) or fpinscalalib.customlib.parsing.Reference.succeed(List())
val parser = run(
many(char('a')))(_)
parser("") shouldBe res0
parser("aaa") shouldBe res1
Exercise 9.4
Using map2
and succeed
, we can also implement listOfN
:
def listOfN[A](n: Int, p: Parser[A]): Parser[List[A]] =
if (n <= res0) fpinscalalib.customlib.parsing.Reference.succeed(List())
else map2(p, listOfN(n - res1, p))(_ :: _)
val singleParser = char('a')
val listParser = listOfN(3, singleParser)
val parseFunction = run(listParser)(_)
parseFunction("aaa") shouldBe Right(List('a', 'a', 'a'))
parseFunction("a").isLeft shouldBe true
Exercise 9.5
One way we can deal with non-strictness, is by using a new combinator (wrap
):
def wrap[A](p: => Parser[A]): Parser[A]
We can then define many
as:
def many[A](p: Parser[A]): Parser[List[A]] =
map2(p, wrap(many(p)))(_ :: _) or succeed(List())
In the parallelism chapter, we were particularly interested in avoiding having Par
objects that took as much time
and space to build as the corresponding serial computation, and the delay
combinator let us control this more
carefully. Here, this isn't as much of a concern, and having to think carefully each time we map2
to decide
whether we need to call wrap
seems like unnecessary friction for users of the API.
Exercise 9.6
We're now going to try parse a single digit (i.e.: 4
), followed by that many
'a'
characters. For instance:
"0"
, "1a"
, "2aa"
, and so on. This is also called a context-sensitive grammar. It can't be expressed with
product
because what the second parser decides depends on the first parser's result. Similarly as we did in
previous chapters, this problem calls for the use of a new primitive, flatMap
:
def flatMap[A, B](p: Parser[A])(f: A => Parser[B]): Parser[B]
To parse the digits, we'll be using regular expressions. Note the use of the new primitive regex
which converts
any regular expression into a Parser
. Complete the implementation with a regular expression to catch the digits
in our parser's first step, and take a look on how we chain both operations:
val parser = for {
digit <- Reference.regex(res0.r)
n = digit.toInt
_ <- listOfN(n, char('a'))
} yield n
val parseFunction = run(parser)(_)
parseFunction("0") shouldBe Right(0)
parseFunction("1a") shouldBe Right(1)
parseFunction("2aa") shouldBe Right(2)
Exercise 9.7
We can also implement product
and map2
in terms of flatMap
:
def product[A, B](p: Parser[A], p2: => Parser[B]): Parser[(A, B)] =
flatMap(p)(a => map(p2)(b => (a, b)))
def map2[A, B, C](p: Parser[A], p2: => Parser[B])(f: (A, B) => C): Parser[C] =
for { a <- p; b <- p2 } yield f(a, b)
Exercise 9.8
The same way as map
can be implemented via flatMap
:
def map[A, B](p: Parser[A])(f: A => B): Parser[B] = p.flatMap(a => succeed(f(a)))
Exercise 9.9
Let's see an example of how a Parser[JSON]
could be implemented using the primitives we've defined:
trait JSON
object JSON {
case object JNull extends JSON
case class JNumber(get: Double) extends JSON
case class JString(get: String) extends JSON
case class JBool(get: Boolean) extends JSON
case class JArray(get: IndexedSeq[JSON]) extends JSON
case class JObject(get: Map[String, JSON]) extends JSON
def jsonParser[Parser[+_]](P: Parsers[Parser]): Parser[JSON] = {
import P.{ string => _, _ }
implicit def tok(s: String) = P.token(P.string(s))
def array = surround("[", "]")(
value sep "," map (vs => JArray(vs.toIndexedSeq))) scope "array"
def obj = surround("{", "}")(
keyval sep "," map (kvs => JObject(kvs.toMap))) scope "object"
def keyval = escapedQuoted ** (":" *> value)
def lit = scope("literal") {
"null".as(JNull) |
double.map(JNumber(_)) |
escapedQuoted.map(JString(_)) |
"true".as(JBool(true)) |
"false".as(JBool(false))
}
def value: Parser[JSON] = lit | obj | array
root(whitespace *> (obj | array))
}
}
Exercise 9.11
Some useful primitives that could be useful to let programmers specify what error(s) get reported in an or
chain
could be:
/** In the event of an error, returns the error that occurred after consuming the most number of characters. */
def furthest[A](p: Parser[A]): Parser[A]
/** In the event of an error, returns the error that occurred most recently. */
def latest[A](p: Parser[A]): Parser[A]
Explore 9.13
We'll be exploring an actual representation of Parser
. Let's begin by implementing some of its methods, starting
with string
:
def string(w: String): Parser[String] = {
val msg = "'" + w + "'"
s => {
val i = firstNonmatchingIndex(s.loc.input, w, s.loc.offset)
if (i == -1) // they matched
Success(w, w.length)
else
Failure(s.loc.advanceBy(i).toError(msg), i != 0)
}
}
val parseFunction = run(
string("42"))(_)
parseFunction(res0) shouldBe Right("42")
Let's continue by implementing regex
:
def regex(r: Regex): Parser[String] = {
val msg = "regex " + r
s =>
r.findPrefixOf(s.input) match {
case None => Failure(s.loc.toError(msg), false)
case Some(m) => Success(m, m.length)
}
}
val parserFunction = run(
regex(res0.r))(_)
parserFunction("12345") shouldBe Right("12345")
parserFunction("abcde").isLeft shouldBe true
Implementation of succeed
is pretty straightforward:
def succeed[A](a: A): Parser[A] = s => Success(a, 0)
Finally let's implement slice
for our new representation:
def slice[A](p: Parser[A]): Parser[String] =
s =>
p(s) match {
case Success(_, n) => Success(s.slice(n), n)
case f @ Failure(_, _) => f
}
val parserFunction = run(
slice(many1(string("a"))))(_)
parserFunction("a") shouldBe Right("a")
parserFunction("ab") shouldBe Right("a")
parserFunction("b").isLeft shouldBe res0
Exercise 9.15
It's time to implement the run
primitive:
def run[A](p: Parser[A])(s: String): Either[ParseError, A] = {
val s0 = ParseState(Location(s))
p(s0).extract
}
Exercise 9.16
ParseError
can be improved by better formatting it for human consumption. Let's see how this could work:
case class ParseError(stack: List[(Location,String)] = List()) {
def push(loc: Location, msg: String): ParseError =
copy(stack = (loc,msg) :: stack)
def label[A](s: String): ParseError =
ParseError(latestLoc.map((_,s)).toList)
def latest: Option[(Location,String)] = stack.lastOption
def latestLoc: Option[Location] = latest map (_._1)
// Display collapsed error stack - any adjacent stack elements with the same location are combined on one line.
// For the bottommost error, we display the full line, with a caret pointing to the column of the error.
override def toString =
if (stack.isEmpty) "no error message"
else {
val collapsed = collapseStack(stack)
val context =
collapsed.lastOption.map("\n\n" + _._1.currentLine).getOrElse("") +
collapsed.lastOption.map("\n" + _._1.columnCaret).getOrElse("")
collapsed.map { case (loc,msg) => loc.line.toString + "." + loc.col + " " + msg }.mkString("\n") + context
}
// Builds a collapsed version of the given error stack - messages at the same location have their messages
// merged, separated by semicolons.
def collapseStack(s: List[(Location,String)]): List[(Location,String)] =
s.groupBy(_._1).
mapValues(_.map(_._2).mkString("; ")).
toList.sortBy(_._1.offset)
def formatLoc(l: Location): String = l.line + "." + l.col
Exercise 9.17
In order to make the slice
combinator more efficient (i.e.: many(char('a')).slice
will still create a
List[Char]
, only to discard it) we can create a different representation of the Parser
type. The main change
is to add another piece of state to ParseState
, an isSliced
flag, and an additional Slice
constructor to
Result
. If the isSliced
flag is set, parsers avoid building a meaningful result.
You can take a look at the complete implementation
here.
Exercise 9.18
Right now, we're missing error information when we combine several parsers with the or
combinator. For instance,
when both parsers fail we only take into account errors from the second parser. Maybe we could show both error
messages, or choose the error from the branch that got furthest without failing. Let's sketch a way of doing that:
// We'll just give a sketch here. The basic idea is to add an additional field to `ParseError`
case class ParseError(
stack: List[(Location, String)] = List(),
otherFailures: List[ParseError] = List()) {
def addFailure(e: ParseError): ParseError =
this.copy(otherFailures = e :: this.otherFailures)
// ...
}
// We then need to make sure we populate this in the implementation of `or`
def or[A](p: Parser[A], p2: => Parser[A]): Parser[A] =
s => p(s) match {
case Failure(e, false) => p2(s).mapError(_.addFailure(e))
case r => r // committed failure or success skips running `p2`
}
Of course, we have to decide how to print a ParseError
for human consumption. We also can expose combinators for
selecting which error(s) get reported in the event that a chain of a | b | c
fails--we might choose to collect up
all the errors for each of the three parsers, or perhaps only show the parser that got the furthest in the input
before failing, etc.
Exercise 9.x
To wrap this section up, here's a small exercise for you to test the aforementioned JSON
parser with your own
entries. Source code for this parser can be found
here
or in the source code of this section.
val json: Parser[JSON] = JSON.jsonParser(Reference)
run(json)(res0).isRight shouldBe true