Monday, 20 November 2017

Practical Awesome Recursion - Ch 01: Fixpoints

This is chapter 1 in a series of blog posts about recursion schemes. This series uses Scala, and will focus on usage and applicability. It will be scarce in theory, and abundant in examples. The theory is valuable and fascinating, but I often find that knowing the theory alone is only half understanding. The other half of understanding comes from, and enables, practical application. I recommend you bounce back and forth between this series and theory. The internet is rich with blogs and videos on theory that explain it better than I would, so use those awesome resources.

Overview

The goal of this post is to prepare your data types so that you can abstrct over/away recursion, and be able to use all the generalisations that we'll explore in all future chapters in this series.

In order to prepare your data, there are three things to do:

  1. Make the recursive positions in your data type abstract
  2. Create a Functor for your data type
  3. Wraps your data type in Fix[_]

Step 1. Remove recursion from the data type

Add a type parameter to your data type that will be used to represent recursion (and other things as you'll see in future chapters). Everywhere that your type references itself, replace the type with the new abstract type parameter.

IntList

Example #1: let's say you have your own hand-written cons-list, specialised to Int; maybe you want to avoid boxing with List[Int]. It might look like this:

// Before
sealed trait IntList
final case class IntCons(head: Int, tail: IntList) extends IntList
case object IntNil extends IntList

This type references itself in IntCons#tail. Let's fix that...

//  After (v1)
sealed trait IntList[F]
final case class IntCons[F](head: Int, tail: F) extends IntList[F]
final case class IntNil[F]() extends IntList[F]

Ok so now it never references itself, but we've had to modify the non-recursive case, IntNil, from an object to a case class with zero params. I'd prefer IntNil to remain an object for better ergonomics and efficiency so let's use variance.

(Note: Many Scala FP'ers avoid type variance entirely because it's kind of broken in theory, can lead to bugs in higher-kinded contexts, screws up implicits sometimes and breaks type inference sometimes too. I advise: learn about it, understand the tradeoffs and decide on a case-by-case basis.)

//  After (v2)
sealed trait IntList[+F]
final case class IntCons[+F](head: Int, tail: F) extends IntList[F]
case object IntNil extends IntList[Nothing]

BinaryTree

How about a binary tree with an abstract type A in the nodes:

// Before
sealed trait BinaryTree[+A]
final case class Node[+A](left: BinaryTree[A], value: A, right: BinaryTree[A]) extends BinaryTree[A]
case object Leaf extends BinaryTree[Nothing]

Here we replace the left and right branches.

// After
sealed trait BinaryTree[+A, +F]
final case class Node[+A, +F](left: F, value: A, right: F) extends BinaryTree[A, F]
case object Leaf extends BinaryTree[Nothing, Nothing]

Don't worry about preserving the A in the branches -- don't try to make it an F[A] -- just a plain old *-kinded F is all you need. Step 3 will ensure that As persist in both branches' children.

JSON

JSON is recursive too.

JSON has arrays of JSON, it has objects with JSON values, those values can be arrays that contains even more objects with nested arrays and... you get the picture.

// Before
sealed trait Json
object Json {
  case object      Null                               extends Json
  final case class Bool(value: Boolean)               extends Json
  final case class Str (value: String)                extends Json
  final case class Num (value: Double)                extends Json
  final case class Arr (values: List[Json])           extends Json
  final case class Obj (fields: List[(String, Json)]) extends Json
}

Only replace the self references; preserve the outer type. List[Json] should be List[F], not F.

// After
sealed trait Json[+F]
object Json {
  case object      Null                              extends Json
  final case class Bool  (value: Boolean)            extends Json[Nothing]
  final case class Str   (value: String)             extends Json[Nothing]
  final case class Num   (value: Double)             extends Json[Nothing]
  final case class Arr[F](values: List[F])           extends Json[F]
  final case class Obj[F](fields: List[(String, F)]) extends Json[F]
}

Step 2. Create a Functor

In this step we create a Functor for our data types. Functor is a type class that exists in the FP lib of your choice, namely Scalaz or Cats. It looks like this:

trait Functor[F[_]] {
  def map[A, B](fa: F[A])(f: A => B): F[B]
}

It empowers the world outside your data structure to change one of its abstract types, and by necessity, all values of that type. It's just like calling .map on a list:

// Change the values
List(1, 2, 3).map(_ * 10)
  // yields
  List[Int](10, 20, 30)

// Change the type too
List(1, 2, 3).map(i => s"[$i]")
  // yields
  List[String]("[1]", "[2]", "[3]")

This functor is how the generic recursion abstractions can have access to, and control over the recursive spots in your data.

(Note: If you want to use monadic variants of morphisms later, you'll need to upgrade your Functor to a Traverse. I'll show that in a future post but just keep it in mind.)

Let's create instances for our examples above. It doesn't matter if you use Scalaz or Cats, only the imports need to change. The code itself is identical.

IntList

Let the compiler guide you, it will only accept one implementation:

implicit val functor: Functor[IntList] = new Functor[IntList] {
  override def map[A, B](fa: IntList[A])(f: A => B): IntList[B] = fa match {
    case IntCons(head, tail) => IntCons(head, f(tail))
    case IntNil              => IntNil
  }
}

Note: You could also use an implicit object but it can cause implicit resolution problems later. I can't remember why/when anymore, it's just become habit.

// Also possible but leads to implicit resolution problems
implicit object IntListFunctor extends Functor[IntList] {
  override def map[A, B](fa: IntList[A])(f: A => B): IntList[B] = fa match {
    case IntCons(head, tail) => IntCons(head, f(tail))
    case IntNil              => IntNil
  }
}

Another side-note, do explicitly annotate the type.

  1. If you don't the type will be a structural type which will slow down the compiler and mess with implicit resolution.
  2. Explicit annotation will be mandatory in Scala v3 anyway.
// Don't do this
implicit val functor = new Functor[IntList] {

BinaryTree

As intended, this next case introduces a bit more complexity. We want to provide the ability to transform the recursive positions which means we need to keep the value: A position abstract and stable.

You'll need to use kind-projector to get the nice BinaryTree[A, ?] syntax, instead of the monstrous, out-of-the-box ({ type L[X] = BinaryTree[A, X] })#L syntax. If types and terms weren't so different, it'd be BinaryTree[A, _] just like the underscore in List(1,2,3).map(_ * 100).

implicit def functor[A]: Functor[BinaryTree[A, ?]] = new Functor[BinaryTree[A, ?]] {
  override def map[B, C](fa: BinaryTree[A, B])(f: B => C): BinaryTree[A, C] = fa match {
    case Node(left, value, right) => Node(f(left), value, f(right))
    case Leaf                     => Leaf
  }
}

There's nothing special about the F type over the A. You could also write a functor over the A type, i.e. a Functor[BinaryTree[?, F]] instance.

JSON

Here our Fs are embedded in other values so we have to work a little harder to transform them, but not much harder. Lists have functors too so it's easy peasy: just call .map.

implicit val functor: Functor[Json] = new Functor[Json] {
  override def map[A, B](fa: Json[A])(f: A => B): Json[B] = fa match {
    case Null        => Null
    case j: Bool     => j
    case j: Str      => j
    case j: Num      => j
    case Arr(values) => Arr(values.map(f))
    case Obj(fields) => Obj(fields.map { case (k, v) => (k, f(v)) })
  }
}

Step 3. The Fixpoint

In step 1, we went from a recursive data structure to a non-recursive data structure. Our goal is to be able to abstract over/away recursion, not to vanquish it. How do we regain our recursion? After all, we still want the users of our amazing IntList library to be able to store more than one element!

We've going to wrap our types in a magical fixpoint type. Doing so will give us back our recursion.

Here's a definition of Fix:

case class Fix[F[_]](unfix: F[Fix[F]])

Confused? This isn't a theory series but get a pen and paper, plug in IntList and expand the alias step-by-step; it'll clear it up real quick.

Exciting side note: Stephen Compall and Tomas Mikula found a way to define Fix without boxing! See how, here. I measured a 20-30% improvement and copied the approach in my own recursion library. Very awesome stuff.

Anyway, wrap your data types in Fix[_] and you get your recursion back.

type RecursiveIntList       = Fix[IntList]
type RecursiveBinaryTree[A] = Fix[BinaryTree[A, ?]]
type RecursiveJson          = Fix[Json]

...which is a bit long-winded and unpleasant. Let's rename things.

The most-common convention I've seen is to append an F for "functor" to the data type and use the proper name in the alias.

sealed trait IntListF[+F]
type IntList = Fix[IntListF]

sealed trait BinaryTreeF[+A, +F]
type BinaryTree[A] = Fix[BinaryTreeF[A, ?]]

sealed trait JsonF[+F]
type Json = Fix[JsonF]

This actually works out great because then, in practice, I often create a new object with helpers that improve ergonomics and avoid Fix boilerplate when you need to manually create a structure, for example, unit test expectations, or using a parser that doesn't play nice with recursion schemes.

For example:

type IntList = Fix[IntListF]

object IntList {

  // Helpful cos Scala's type inference fails so often
  def apply(f: IntListF[IntList]): IntList =
    Fix(f)

  def nil: IntList =
    apply(IntNil)

  def cons(head: Int, tail: IntList): IntList  =
    apply(IntCons(head, tail))

  def fromList(is: Int*): IntList  =
    is.foldRight(nil)(cons)
}

Done!

That's it! We're done. It may seem like a lot of work but there's benefit coming that greatly outweighs the cost. The cost isn't huge anyway, just a bit of one-time-only boilerplate.

All source code available here: https://github.com/japgolly/learning/tree/public/recursion-blog/example/src/main/scala/japgolly/blog/recursion

If you like what I do and you'd like to support me, this series or any of my other projects, become a patron! It will make a big difference to me and help further more content. You can also follow me on Twitter for updates about the next post the series.

5 comments:

  1. This comment has been removed by the author.

    ReplyDelete
  2. Thank you for the great post, David. Unfortunately, I must be misunderstanding some basics here. The code I have taken from your GH repo does not produce expected results. Unfortunately, the repo does not contain examples for the `IntList` stuff.

    How is the new parametrized `IntList[F]` supposed to be instantiated and what type to parametrize it to?

    parametrizing to `Int` would be wrong wrong, bc the Cons is no longer recursive:
    `val l: IntList[Int] = IntCons(1, 2)` works but `val l: IntList[Int] = IntCons(1, IntNil)` does not

    parametrizing to `IntCons` does not work as you cannot use `IntNil`
    parametrizing to `IntList[_]` sounds plausible but does not work with the `Functor` - seee later example.

    I dont quite get how the `Functor` implementation is supposed to work for `IntList` - the provided implementation does not change the elements. `IntCons` has its `head` fixed to `Int` so any function `f` operating on the elements of the `IntList` has to stay with `Int`. But `f` is only applied to the `tail` where again it does not change anything.

    A naive impl with `IntList[Int]` does not produce the expected result:

    ```scala
    val l: IntList[Int] = IntCons(1, 2)
    val res: IntList[Int] = l.map(x => x * 2)
    // -> IntCons(1, 4)
    ```

    I have not found a sensible impl for `IntList[IntList[_]`
    `val res: IntList[IntList[_]] = l.map(x => x * 2)` obviously does not work but neither does

    ```scala
    val res: IntList[IntList[_]] = l.map {
    case IntCons(head, tail) => IntCons(head * 2, tail.map(?)) // does not recurse
    case IntNil => IntNil
    }
    ```

    Could you please add some simple working examples for `IntList`? An explanation of what I seem to misunderstand here would be very much appreciated.

    ReplyDelete
    Replies
    1. Hi kostja, hopefully this clarifies it for you. It works and demonstrates the sample code works too.

      sbt:Recursion blog series> example/console
      [info] Starting scala interpreter...
      Welcome to Scala 2.12.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_152).
      Type in expressions for evaluation. Or try :help.

      scala> import japgolly.blog.recursion.ch01.IntListExample.After._
      import japgolly.blog.recursion.ch01.IntListExample.After._

      scala> val a = IntList.fromList(1, 3, 5, 7)
      a: japgolly.blog.recursion.ch01.IntListExample.After.IntList = IntCons(1,IntCons(3,IntCons(5,IntCons(7,IntNil))))

      scala> val b = IntCons(1, IntList(IntCons(3, IntList(IntCons(5, IntList(IntCons(7, IntList(IntNil))))))))
      b: japgolly.blog.recursion.ch01.IntListExample.After.IntCons[japgolly.blog.recursion.ch01.IntListExample.After.IntList] = IntCons(1,IntCons(3,IntCons(5,IntCons(7,IntNil))))

      scala> val c = IntList.cons(1, IntList.cons(3, IntList.cons(5, IntList.cons(7, IntList.nil))))
      c: japgolly.blog.recursion.ch01.IntListExample.After.IntList = IntCons(1,IntCons(3,IntCons(5,IntCons(7,IntNil))))

      scala> a == b
      res0: Boolean = true

      scala> a == c
      res1: Boolean = true

      Delete
  3. Ah, I seem to have missed the point of the Fix type. Some instructional videos later and with your examples, it all makes sense. Thank you David :)

    ReplyDelete
  4. kostja: happy to hear it and np!

    ReplyDelete