Saturday, May 16, 2020

Global typeclass coherence (Principles 3, Scalaz Files)

In Scalaz, we provide at most one typeclass instance per type throughout your whole program, and we expect users to preserve that invariant. For that type, there should be no way to get two incompatible instances. This is global coherence, and is required to make Scala programs using implicits for typeclasses understandable and maintainable.

If you want a different instance for the “same” type when working with Scalaz, the answer is always to start with a different type first. Then you can define your incompatible instance for the same structure, but on the different type, preserving coherence. Scalaz’s @@ type tags or the underlying existential newtype mechanism are convenient, flexible ways to get these “different” types.
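
As a minimal sketch of the "different type first" approach, the following uses a plain wrapper class rather than Scalaz's @@ tags. The names Monoid, Mult, and concat here are hypothetical stand-ins defined for this illustration only; they are not Scalaz's own definitions.

```scala
object NewtypeSketch {
  // Hypothetical, simplified typeclass; a stand-in for illustration only.
  trait Monoid[A] {
    def zero: A
    def append(x: A, y: A): A
  }

  object Monoid {
    // The single Monoid[Int] for the whole program: addition.
    implicit val intAddition: Monoid[Int] = new Monoid[Int] {
      def zero: Int = 0
      def append(x: Int, y: Int): Int = x + y
    }
  }

  // A distinct type over the same structure (an Int), so it may
  // carry its own instance without disturbing Monoid[Int].
  final case class Mult(value: Int)

  object Mult {
    implicit val multMonoid: Monoid[Mult] = new Monoid[Mult] {
      def zero: Mult = Mult(1)
      def append(x: Mult, y: Mult): Mult = Mult(x.value * y.value)
    }
  }

  // Combines every element using whichever instance the type selects.
  def concat[A](xs: List[A])(implicit m: Monoid[A]): A =
    xs.foldLeft(m.zero)(m.append)
}
```

With these definitions, concat(List(2, 3, 4)) adds, while concat(List(2, 3, 4).map(Mult(_))) multiplies. The choice is visible in the types rather than in which implicits happen to be in scope.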

It’s not surprising that “new type” as a solution to this problem comes from Haskell, as Haskell too depends on global coherence. While we can’t get all the benefits from coherence that Haskell does, what remains is sufficient to justify this seemingly non-modular rule.

The invisible action of implicits in Scala is a serious problem for understanding and maintenance if used in an undisciplined manner. Consider, for example, an incoherent implicit-based design: scala-library’s ExecutionContext.

At any point in the program, the ExecutionContext resolved depends on what variables are in scope. It’s unsafe to move any code that depends on ExecutionContext to some other method, because that set of variables can change, thus changing the behavior of the program. And you can’t determine at a glance whether some code depends on ExecutionContext, so moving any code carries some risk.

You can’t add an ec: ExecutionContext argument anywhere without potentially breaking working code, because it changes that set of variables. It’s only safe to introduce brand-new methods with that argument.

If you are refactoring and suddenly get an error about multiple conflicting ExecutionContexts in scope, you have no help in determining which is the “right” one; you have to work it out from what the code’s original intent probably was. Possibly, none of the options in scope is right.
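
The hazard of moving code can be sketched as follows. Labelled, whichEc, ServiceA, and ServiceB are hypothetical names invented for this illustration; only ExecutionContext itself comes from scala-library. The two run methods have identical bodies, yet they resolve different contexts.

```scala
import scala.concurrent.ExecutionContext

object EcSketch {
  // Hypothetical context that runs tasks on the calling thread and
  // carries a label so we can see which one was resolved.
  final class Labelled(val label: String) extends ExecutionContext {
    def execute(runnable: Runnable): Unit = runnable.run()
    def reportFailure(cause: Throwable): Unit = throw cause
  }

  // Reports which context the compiler picked at the call site.
  def whichEc()(implicit ec: ExecutionContext): String = ec match {
    case l: Labelled => l.label
    case _           => "other"
  }

  object ServiceA {
    implicit val ec: ExecutionContext = new Labelled("A")
    def run(): String = whichEc()
  }

  object ServiceB {
    implicit val ec: ExecutionContext = new Labelled("B")
    // Same source text as ServiceA.run, different behavior.
    def run(): String = whichEc()
  }
}
```

Cutting the body of ServiceA.run and pasting it into ServiceB still compiles, and nothing in the moved text hints that its meaning has changed.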

By contrast, consider Monoid[List[N]], where N might be some type variable in scope. The correct instance depends on the type, not what variables are in scope. So you can add a Semigroup[N] constraint and know the answer won’t change. You can split up the method, or move any of its code anywhere else, and know the answer won’t change. You can even add a Monoid[List[N]] argument to your function, because you know the caller is required to come up with the same instance you were working with before.
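
A minimal sketch of these refactorings, assuming simplified Semigroup and Monoid traits defined here for illustration (not Scalaz's own), might look like the following. All three versions are resolved to the same list instance, because the instance is determined by the type alone.

```scala
object CoherenceSketch {
  // Hypothetical, simplified typeclasses for illustration only.
  trait Semigroup[A] { def append(x: A, y: A): A }
  trait Monoid[A] extends Semigroup[A] { def zero: A }

  object Semigroup {
    implicit val intSemigroup: Semigroup[Int] = new Semigroup[Int] {
      def append(x: Int, y: Int): Int = x + y
    }
  }

  object Monoid {
    // The one instance for List[N]: concatenation, for every N.
    implicit def listMonoid[N]: Monoid[List[N]] = new Monoid[List[N]] {
      def zero: List[N] = Nil
      def append(x: List[N], y: List[N]): List[N] = x ::: y
    }
  }

  // Version 1: the instance is resolved inside the method.
  def flatten1[N](xss: List[List[N]]): List[N] = {
    val m = implicitly[Monoid[List[N]]]
    xss.foldLeft(m.zero)(m.append)
  }

  // Version 2: an added Semigroup[N] constraint puts a new implicit
  // in scope, but the Monoid[List[N]] that is found does not change.
  def flatten2[N](xss: List[List[N]])(implicit s: Semigroup[N]): List[N] = {
    val m = implicitly[Monoid[List[N]]]
    xss.foldLeft(m.zero)(m.append)
  }

  // Version 3: the instance becomes an argument; under coherence the
  // caller can only supply the same instance versions 1 and 2 used.
  def flatten3[N](xss: List[List[N]])(implicit m: Monoid[List[N]]): List[N] =
    xss.foldLeft(m.zero)(m.append)
}
```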

You can add or delete constraints as required, because they’re always going to be fulfilled by the same instances. For example, Scalaz’s sorted map K ==>> V doesn’t carry the K comparator around, because we can assume it always depends on K and only on K.

If you ever get an error about multiple conflicting instances, you know there’s always a workable solution: choose one, because they’re the same.

Tools try to help with understanding Scala’s implicit resolution, but they don’t always work. Global coherence is your greatest ally when trying to understand what an implicit resolution is doing by hand: you know that any path to the instance you find is a correct one, and you can then work out why the compiler isn’t resolving it.

Global coherence also lets Scalaz offer a much simpler API. For example, the efficient union of sorted maps requires that the ordering of K used for both is equal. With “local” (read: incoherent) instances, the only safe way to do this is to define a singleton type depending on the ordering, and treat this singleton type as a sort of third type parameter to the map type. If you happen to have built two maps with the same key type where you didn’t use polymorphism to unify their use of that singleton type, too bad, you can’t safely union them. With global coherence, because the instance only depends on K, the simple two-parameter map type is perfectly sufficient, and those maps are easy to union.
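
To make the union problem concrete, here is a small sketch of a two-parameter sorted map. SMap is a hypothetical, list-backed toy written for this illustration and is not how Scalaz implements K ==>> V; it uses the standard library's Ordering as the comparator. The map stores no comparator, and its merge-based union is only correct when both inputs were built with the same ordering.

```scala
object SortedMapSketch {
  // Hypothetical toy map: entries are kept in ascending key order.
  // No Ordering is stored; each operation asks for one.
  final case class SMap[K, V](entries: List[(K, V)]) {
    def insert(k: K, v: V)(implicit ord: Ordering[K]): SMap[K, V] = {
      val (smaller, rest) = entries.span { case (k0, _) => ord.lt(k0, k) }
      val tail = rest.dropWhile { case (k0, _) => ord.equiv(k0, k) }
      SMap(smaller ::: (k, v) :: tail)
    }

    // Linear-time merge, keeping the left value on duplicate keys.
    // Assumes both maps are sorted by the same ordering.
    def union(that: SMap[K, V])(implicit ord: Ordering[K]): SMap[K, V] = {
      @annotation.tailrec
      def go(a: List[(K, V)], b: List[(K, V)], acc: List[(K, V)]): List[(K, V)] =
        (a, b) match {
          case (Nil, _) => acc.reverse ::: b
          case (_, Nil) => acc.reverse ::: a
          case (x :: xs, y :: ys) =>
            val c = ord.compare(x._1, y._1)
            if (c < 0) go(xs, b, x :: acc)
            else if (c > 0) go(a, ys, y :: acc)
            else go(xs, ys, x :: acc)
        }
      SMap(go(entries, that.entries, Nil))
    }
  }

  def empty[K, V]: SMap[K, V] = SMap(Nil)

  // Coherent use: both maps rely on the one Ordering[Int].
  val a: SMap[Int, String] = empty[Int, String].insert(3, "c").insert(1, "a")
  val b: SMap[Int, String] = empty[Int, String].insert(2, "b").insert(1, "z")
  val good: SMap[Int, String] = a.union(b)

  // Incoherent use: a "local" reversed ordering builds the left map,
  // so the merge sees unsorted input and key 1 appears twice.
  val rev: Ordering[Int] = Ordering.Int.reverse
  val odd: SMap[Int, String] = empty[Int, String].insert(1, "a")(rev).insert(3, "c")(rev)
  val bad: SMap[Int, String] = odd.union(b)
}
```

Here good holds keys 1, 2, 3 in order, while bad contains key 1 twice. Ruling out the second case without coherence is what would force the extra singleton type parameter described above.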

The “flexibility” of local instances is not worth it given the constraints of Scala, and Scalaz assumes you won’t be using them when you use its functionality. Define a newtype instead.