Saturday, December 3, 2016

Part 2: The role of Monad

This is the second of a four-part series on tagless-final effects.

Previously

  1. Introduction, motivation, and the core techniques.

The role of Monad

A useful effectful program needs some way of not only producing F effects, but combining and manipulating them. There are no methods on F from the perspective of effectful programs to satisfy this need; the interpreter must provide.

Suppose we have a basic copy method.

def copy(source: String, dest: String, alg: FSAlg)
    : alg.F[Unit] = {
  alg.readFile(source)
  // somehow get the String from F[String]...
  alg.writeFile(dest, contents)
}

It is tempting to include an evaluator in the algebra, but resist! Don’t include the runner in the algebra!

The appeal of this temptation lies with its similarity to an imperative style. You write

val contents = alg.run(alg.readFile(source))
alg.writeFile(dest, contents)

However, not only is this no longer functional programming, it is very, very hard to think about when this readFile side effect happens in relation to the other side effects in a side-effecting interpreter.

  1. It could happen before all of the F-controlled side effects.
  2. It could happen before one or more of the side effects you expect to happen first.
  3. it could happen after one or more of the side effects you expect to happen later.
  4. Any mix of the above can happen in the same interpreter.

Instead, we can supply combinators in the algebra to allow effects to be sequenced. Here’s a combinator for FSAlg that will allow copy to be written.

def bind[A, B](first: F[A])(next: A => F[B]): F[B]

This is called monadic bind, and can be written for both FSAlg interpreters.

// IOFul
def bind[A, B](first: () => A)(next: A => () => B): () => B =
  () => next(first())() // we can't call `first` now; that would
                        // break the delay of its side-effects

// TestMap
def bind[A, B](first: Directory => (Either[Err, A], Directory))
              (next: A => Directory => (Either[Err, B], Directory))
    : Directory => (Either[Err, B], Directory) =
  dir => {
    val (ea, dir2) = first(dir)
    ea match {
      case Left(err) => (Left(err), dir2)
      case Right(a) => next(a)(dir2)
    }
  }

With this function in the algebra, we can implement an effectful copy in a functional way.

def copy(source: String, dest: String, alg: FSAlg)
    : alg.F[Unit] =
  alg.bind(alg.readFile(source)){
    contents =>
      alg.writeFile(dest, contents)
  }

The broad applicability of this pattern to sequential effect problems—of both the pure and side sort, as exemplified by our two interpreters—is why Monad is so commonly used for problems like this.

Chances are that your effectful programs will need to perform F effects in this sequential way, where later effects (e.g. writeFile) need to be calculated based on the resulting values (e.g. contents) of earlier effects (e.g. readFile). So, as a design shortcut, you ought to incorporate Monad into your algebra.

Don’t reinvent the Monad wheel

It may be tempting to avoid incorporating a library of functional abstractions such as Scalaz or Cats into your program. This is a mistake; these libraries incorporate a large number of functions that are useful for working with abstract effects, as well as a large number of pre-built and tested implementations of bind and many similar combinators, ready for reuse in your interpreter.

These libraries are foundational because they cover so many common tasks for algebraic abstractions. For example, take a pure effectful program that reads a list of files.

def readFiles(names: List[String], alg: FSAlg)
    : alg.F[List[String]] =
  names.map(alg.readFile(_))
  //       ↑
  // [error] type mismatch;
  //  found   : List[alg.F[String]]
  //  required: alg.F[List[String]]

This doesn’t work because the Fs must be sequenced and returned from the method, not dropped on the floor. (foreach is completely useless in these programs for a similar reason.) An author of a pure effectful program can solve this problem herself, presuming the algebra includes the other essential monad function point.

// in algebra
def point[A](a: A): F[A]

// readFiles
  names.foldRight(alg.point(List.empty[String])){
    (name, rightF) =>
      alg.bind(alg.readFile(name)){hContents =>
      alg.bind(rightF){rest =>
        alg.point(hContents :: rest)
      }}}

With Scalaz, not only does abstract monad syntax give you a nicer way to write this fold and list reconstitution:

names.foldRight(List.empty[String].point[alg.F]){
  (name, rightF) =>
    for {
      hContents <- alg.readFile(name)
      rest <- rightF
    } yield hContents :: rest
}

But you wouldn’t bother, because the library already includes this function, for List and several other types.

names.traverse(alg.readFile(_))

Now all we did was change map to traverse.

Using a good foundational functional library is especially important for newcomers to monadic abstraction, because it contains in so many common patterns a demonstration of the proper way to work with F.

Monad reuse in FSAlg

Let’s rewrite what we have so far to incorporate a standard Monad into FSAlg.

First, we eliminate bind and point, substituting a Monad typeclass instance into FSAlg.

import scalaz.Monad

implicit val M: Monad[F]

For IOFulFSAlg, Scalaz already includes an implementation, which we can find by importing.

val M = {
  import scalaz.std.function._
  Monad[F]
}

Scalaz does have a Monad for the test algebra’s F, but using it will require some rewriting of our existing interpreter functions, so let’s just port over the previous bind implementation.

val M = new Monad[F] {
  // bind as above under TestMap
}

[error] object creation impossible, since method point in
 trait Applicative of type [A](a: => A)tfe.TestMapAlg.F[A]
 is not defined
  val M = new Monad[F] {
              ^

We didn’t get around to implementing point, the “effect-free” combinator, and the compiler asks for that now.

def point[A](a: => A): F[A] =
  d => (Right(a), d)

copy can be written in a method-calling style.

def copy(source: String, dest: String, alg: FSAlg)
    : alg.F[Unit] = {
  import alg.M.monadSyntax._
  alg.readFile(source).flatMap{
    contents => alg.writeFile(dest, contents)
  }
}

But it could also be written by calling alg.M.bind directly. Implementers’ choice.

For the traverse method, we need two imports, using the “à la carte” import style, and to pass along the Monad.

import scalaz.syntax.traverse._
import scalaz.std.list._

// and in the method
  names.traverse(alg.readFile(_))(alg.M)

(Use the type-parameter style for algebra definition to avoid this unfortunate failure of implicit resolution.)

Finding more functional combinators, like catching errors

In the type signatures for FSAlg, we haven’t really accounted for the fact that these functions can fail in real-world interpreters, and even in the test interpreter. Well, we have, in the design of the F choices, but that just delays any error until the caller runs the F.

  1. For IOFulFSAlg, calling the F function will throw, effectively halting the sequence.
  2. For TestDir, errors are represented with Left; reading the bind implementation, you can see that the Left case means that next is not called, effectively short-circuiting the program just like our IOFul does with exceptions.

This isn’t part of the tagless-final pattern; it’s a design choice. When we didn’t include error reporting in the return type of functions like writeFile that certainly can fail in practice, we implied that every interpreter’s F would account for errors. That’s not a convention, it’s an unavoidable outcome of this design decision.

If we want to write effectful programs that can handle errors, which is also a choice itself, we have a couple options.

1. The “explicit” strategy

For functions that can fail, include a representation of error cases inside the F. So FSAlg might have a different signature for readFile:

def readFile(name: String): F[Either[Err, String]]

This is the “explicit” strategy, and has a few major advantages:

  1. F can be simpler, because it need not model errors.
  2. The algebra can have a mix of failing and non-failing functions. The user of the algebra can tell which is which by looking at the return types.
  3. Effectful program authors can delineate which parts of their program may have unhandled errors.

2. The “implicit” strategy

Incorporate an error-recovery function to convert an error into something that can be handled. Choose an error type, such as E, and add such a function as this:

def catchError[A](mayFail: F[A], recover: E => F[A]): F[A]

Assuming that you have incorporated Monad or at least its weaker relative Functor into your algebra, as we have, this is precisely equivalent in power to

def catchError[A](mayFail: F[A]): F[Either[E, A]]

except that its similarity to the try/catch form is more obvious. Alternatively, you might provide for some kind of filtering, so you drop some errors but not others; perhaps recover might be a PartialFunction, or you might take an additional argument that somehow explains to the interpreter which errors you want to handle or let go.

You may also wish to include the equivalent of try/finally, bracket:

def bracket[A](first: F[A], cleanup: F[Unit]): F[A]

This “implicit” strategy has its own set of advantages:

  1. The potential for failure need not be noted on each algebra method; it is assumed.
  2. Effectful programs can allow errors to percolate up “automatically”, so to speak. (Doing this with the “explicit” variant is possible, but a little tricky.

Unfortunately, there is no way to type-check that an effectful program in the “implicit” style has handled all errors, because F, no matter what, represents a program that might fail in this design.

3. The “someone else’s problem” strategy

As with #2, but don’t provide any means of recovery.

This is a question of delineated responsibility. For many effectful programs, it simply isn’t meaningful to recover from errors originating in the interpreter, and it’s always more appropriate for them to be handled by the invoker of the program, as an “early termination”.

In such a situation, you can communicate this by leaving error-catching out of the interpreter. Though you might want to at least document that you intended to leave the functionality out, and didn’t simply forget it!

None of these strategies is more or less pure; they all preserve the purity of effectful programs.

There is a broader problem here, though: how can you find type signatures like catchError, that convert concepts that seem to require side effects or special support, into plain algebra calls that work for pure FP programs? One great resource is the Haskell base library. Haskell requires all programs, even effectful ones, to be written using only pure constructs and ordinary functions, so many such problems have been solved there. catchError comes from the MonadError typeclass, which supplies a mini-algebra much like FSAlg, but specifically for throwing and catching errors.

  1. Break an “effectful” idea down into a primitive concept you’d like to support.
  2. Research how this is handled in the IO algebra for Haskell.
  3. Replace IO with F and incorporate into your algebra.

Here are the implementations of catchError and bracket for our two interpreters. One test for your choice of effectful API is whether interpreters can implement it. I’ve chosen the Err type to represent errors to effectful programs, but the choice is yours.

import scala.util.control.NonFatal

// IOFulFSAlg
def catchError[A](mayFail: F[A],        // () => A
                  recover: Err => F[A]) // Err => () => A
    : F[A] =
  () => try mayFail() catch {
    case NonFatal(e) => recover(Err(e.getMessage))()
  }

// TestDirFSAlg
def catchError[A](mayFail: F[A],
                  recover: Err => F[A]): F[A] =
  dir => {
    val (result, dir2) = mayFail(dir)
    result match {
      case Left(err) => recover(err)(dir2)
      case Right(a) => (Right(a), dir2)
    }
  }

// IOFul
def bracket[A](first: F[A], cleanup: F[Unit]): F[A] =
  () => try first() finally cleanup()

// TestDir
def bracket[A](first: F[A], cleanup: F[Unit]): F[A] =
  dir => {
    val (result, dir2) = first(dir)
    val (_, dir3) = cleanup(dir2)
    (result, dir3)
  }

When should I take an F argument?

The functions so far follow a pattern that is common for algebra that include Monad. Specifically, our algebra API comes in two flavors:

  1. Specific functions like readFile that carry out some task specific to this domain; these return an F but do not take an F as an argument.
  2. Abstract effect combinators like map, bind, catchError that likewise return an F but also take one or more as arguments.

This pattern arises because for functions like readFile, this is the most useful signature in the presence of Monad.

With Monad in place, we can easily implement copy’s writing step in terms of flatMap or bind. Without it, we might be tempted to solve the problem of calling writeFile by adding an F argument.

// don't do this
def writeFileBad(filename: String, contents: F[String]): F[Unit]

Now we can call writeFileBad directly with the result of readFile. But what about these?

  1. Suppose we want to process the contents of the source file before writing them. Maybe we want to split in two lines and filter some out (like grep), or sort them?
  2. Suppose we want to read two or more files, writing all of their contents to the target file?
  3. Suppose we wanted to read a file that contains, itself, a list of filenames, and we want to concatenate all of the contents of those files and put them into the target file?

The redesigned writeFileBad is good for only one sort of thing: things like copy. Monad is so ubiquitous partly because it is flexible enough to solve all these combination problems, and many more besides.

Effectful programs can all split their demands on their algebras into wanting to call these two sorts of primitive algebra functions; a program calling writeFileBad can call writeFile and flatMap instead, and will be better for it. Learning to recognize when you’ve accidentally given a specific function the job of a generic combinator is a highly useful skill for the design of abstract APIs.

Bending the mind the right way

The most difficult part of learning to use this pattern is learning how to design usable function types for your effect algebras. I suggested earlier looking at IO in Haskell, because they’ve encountered and solved such problems many times, because they had to.

So this is an attitude worth adopting. Demand a purely-functional approach in your effectful programs. That so many purely functional effect types are widely known is a testament to the unwillingness to compromise the integrity of the Haskell model or abandon the reasoning power that comes with referential transparency.

It’s a good idea to have two interpreters, one like IOFul and one like TestMap, or at least to imagine both. When adding a new function, think

  1. Will using this function in effectful programs cause effects before “running” the returned F?
  2. If I use this with a side-effect-free interpreter, will running F have side effects?

If the answer to either of these is “yes”, change the type signature.

Still to come

  1. Working with the abstract F;
  2. How much is this “dependency injection”?

This article was tested with Scala 2.12.0 and Scalaz 7.2.8.

No comments: