Failed Experiments

a Mr. Fleming wishes to study bugs in smelly cheese; a Polish woman wishes to sift through tons of Central African ore to find minute quantities of a substance she says will glow in the dark; a Mr. Kepler wants to hear the songs the planets sing.

Friday, September 16, 2016

The missing diamond of Scala variance

This article is an expanded version of my LambdaConf 2016 talk of the same name (slides). The below covers every topic from that talk, and much more on possible extensions to variance polymorphism, but the talk is a gentler introduction to the main concepts here.

As part of its subtyping system, Scala features variance. Using variance lets you lift the subtyping relation into type constructors, just like type equality (~) already does.

type Endo[A] = A => A     // invariant
type Get[+A] = Foo => A   // covariant
type Put[-A] = A => Foo   // contravariant

X ~ Y  ⇒ Endo[X] ~ Endo[Y]  // invariant
X <: Y ⇒ Get[X] <: Get[Y]   // covariant
X <: Y ⇒ Put[Y] <: Put[X]   // contravariant
               reversed!↑
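
To see what those liftings buy you, here is a sketch in current Scala; Foo, Animal, and Cat are stand-ins I’ve made up for illustration.

trait Foo
trait Animal
case class Cat(name: String) extends Animal

val getCat: Get[Cat] = _ => Cat("Audrey")
val getAnimal: Get[Animal] = getCat       // Cat <: Animal lifts to Get[Cat] <: Get[Animal]

val putAnimal: Put[Animal] = _ => new Foo {}
val putCat: Put[Cat] = putAnimal          // ...and reverses to Put[Animal] <: Put[Cat]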

Subtyping is incomplete without variance

With simple type equality, you have four properties:

  1. Reflexivity: A ~ A
  2. Symmetry: A ~ B ⇒ B ~ A
  3. Transitivity: A ~ B ∧ B ~ C ⇒ A ~ C
  4. Congruence: A ~ B ⇒ F[A] ~ F[B]

Just try to use GADTs without equality congruence! That’s what you’re expected to do in a subtyping system without variance.

  1. Reflexivity: A <: A
  2. Antisymmetry: A <: B ∧ B <: A ⇒ A = B
  3. Transitivity: A <: B ∧ B <: C ⇒ A <: C
  4. Congruence: A <: B ⇒ Put[B] <: Put[A]

Completing subtyping: variables

val aCat = Cat("Audrey")
val anAnimal: Animal = aCat

A bare type is in a covariant position. You can’t abstract over something as simple as a value box without variance.
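
A sketch of that claim, reusing aCat from above (InvBox and CovBox are made-up names): without the annotation, even a plain value box refuses the upcast the val just performed.

case class InvBox[A](value: A)
case class CovBox[+A](value: A)

// val boxed: InvBox[Animal] = InvBox(aCat)   // does not compile: InvBox is invariant
val boxed: CovBox[Animal] = CovBox(aCat)      // fine: the box’s contents sit in covariant position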

Completing subtyping: the harmony of a function call

def speak(a: Animal): IO[Unit]
speak(aCat)

This is how functions and their arguments form a more perfect union. One way to think of the mechanics here is that Cat upcasts to Animal. But there’s no way to tell that what is really happening isn’t that the function is upcasting from Animal => IO[Unit] to Cat => IO[Unit]! Or that they aren’t meeting in the middle somewhere; maybe Cat upcasts to Mammal, and Animal => IO[Unit] upcasts to Mammal => IO[Unit].
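
A sketch of the first two readings, reusing aCat from above; speak is stubbed out and IO dropped purely so the snippet stands on its own.

def speak(a: Animal): Unit = ()

speak(aCat)                      // reading 1: aCat upcasts from Cat to Animal
val g: Cat => Unit = speak _     // reading 2: speak upcasts from Animal => Unit to Cat => Unit
g(aCat)                          // either way, the same call typechecks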

So I don’t think there’s really subtyping without variance; there’s just failing to explicitly model variance that is there anyway. Since you cannot have subtyping without variance, if variance is too complicated, so is subtyping.

What else is there? What else is needed?

There is one advanced feature that fans of higher-kinded types in Scala would find it a deal-breaker to live without, even if they are unaware of its existence.

def same[A, F[_]](fa: F[A]): F[A] = fa
def widen[A, B >: A, F[+_]](fa: F[A]): F[B] = fa

All of Endo, Get, and Put can be passed as the F type parameter to same. However, only Get can be passed as the F type parameter to widen. This works because you can only “use” variance if you know you have it, but don’t need to make any assumptions about your type constructors’ variances if you don’t “use” it.
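
Here is both of those in use, with the aliases from the top of the post and the same Cat/Animal stand-ins (Foo being whatever fixed type you like):

val e: Endo[Int] = identity
val e2: Endo[Int] = same[Int, Endo](e)              // any variance widens to F[_]

val gc: Get[Cat] = _ => Cat("Audrey")
val ga: Get[Animal] = widen[Cat, Animal, Get](gc)   // only a covariant F[+_] is accepted
// widen[Cat, Animal, Endo](e)                      // does not compile: Endo is invariant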

Variance exhibits a subkinding relation

same takes an invariant type constructor F, but you are free to pass covariant and contravariant type constructors to it. That’s because there is something like a subtype relationship between these variances, one level up: a subkind relationship.

Invariance is the ‘top’ variance, the most abstract, and co- and contravariance are its subvariances. When you pass a covariant or contravariant type constructor as F to same, its variance “widens”.

Because variance is part of the “type of type constructor”, not the specific parameter where the variance annotation appears, it’s an element of the kind of that type constructor, just as arity is. For example, when we talk about the kind of Get, we don’t say that A has a covariant kind, because this variance has nothing to do with A. Instead, we say that Get has kind +* -> *, because this particular variance annotation is all about the behavior of types referring to Get. Moreover, subvariance is just a restricted flavor of subkind.

Flipping variances

I started to find it odd that this order of subclassing was enforced as a result of the subvariance relation.

mutable.Seq[A] extends Seq[+A]

CovCoy[F[+_]] extends Coy[F[_]]

It makes perfect sense empirically, though; you can easily derive unsafeCoerce if this exact ordering isn’t enforced.

Then I found, while working on monad transformers, that this is the only way that makes sense, too.

InvMT[T[_[_]]] extends CovMT[T[_[+_]]]

CovMTT[W[_[_[+_]]]] extends InvMTT[W[_[_[_]]]]

I had seen this before.

Type parameter positions are variance-contravariant

The kinds of type constructors (type-level functions) work like the types of value-level functions. Both sorts of functions are contravariant in the parameter position. So every extra layer of nesting “flips” the variance, just like with functions. Below is the version of this for value-level functions, akin to the examples above.

type One[-A] = A => Z
type Two[+A] = (A => Z) => Z
type Three[-A] = ((A => Z) => Z) => Z
type Four[+A] = (((A => Z) => Z) => Z) => Z

A bottom variance: the diamond, completed

Scala wisely included a bottom type, Nothing, to go with its top type Any. That helps complete its subtyping system, but a bottom variance was unfortunately left out of its subkinding system. Here’s something that might work.

type ConstI[👻A] = Int

ConstI[A] ~ ConstI[B] // phantom, or 👻

This is exactly what you’d get if you applied both the covariant and contravariant rules to a type parameter. Therefore, phantom variance or anyvariance is more specific than either, and as the ‘bottom’ variance, completes the diamond. It is the perfect choice, because it is truly a greatest lower bound; it is both more specific than either by itself and no more specific than both together.
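
Current Scala lets you see half of that claim at a time: a parameter that appears nowhere satisfies both the covariant rules and the contravariant rules, so either annotation is accepted, but you still have to commit to one per declaration. (ConstIP and ConstIM are made-up names; the 👻 annotation itself remains hypothetical.)

case class ConstIP[+A](run: Int)
case class ConstIM[-A](run: Int)

val up: ConstIP[Animal] = ConstIP[Cat](3)      // widens like a covariant parameter
val down: ConstIM[Cat] = ConstIM[Animal](3)    // widens like a contravariant parameter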

Whence monad transformer variance?

There are two competing flavors of the monad transformers, like OptionT.

final case class NewOptionT[F[_],   A](run: F[Option[A]])

final case class OldOptionT[F[+_], +A](run: F[Option[A]])

The first remains invariant over A, but conveniently doesn’t care about the variance of F—its declared variance is the “top” variance. The latter gives the more specific covariance over A, but requires the F to be covariant (or phantom), which can be very inconvenient. These two transformers can’t be practically unified.
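
A sketch of that trade-off with the two definitions above, borrowing ArrayBuffer as an everyday invariant F (Cat/Animal as before):

import scala.collection.mutable.ArrayBuffer

// NewOptionT accepts any F, but never widens over A:
val n = NewOptionT[ArrayBuffer, Cat](ArrayBuffer(Option(Cat("Audrey"))))
// val n2: NewOptionT[List, Animal] = NewOptionT[List, Cat](List(Option(Cat("Audrey"))))  // does not compile

// OldOptionT widens over A, but rejects an invariant F outright:
val o: OldOptionT[List, Animal] = OldOptionT[List, Cat](List(Option(Cat("Audrey"))))
// OldOptionT[ArrayBuffer, Cat](ArrayBuffer(Option(Cat("Audrey"))))                       // does not compile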

If you look at the structure, the variance of A’s position is always of a piece with the variance of F.

Variance variables

final case class OptionT[😕V, F[V😕_], V😕A]
  1. The “variance variable” V is declared with the syntactic marker 😕.
  2. 😕 appears infix before _ to say “F has the variance V for the type parameter in this position”.
  3. 😕 appears infix before A to say “the parameter A has the variance V”.

Therefore, when you specify variance V = +, the remaining parameters have kind F[+_] and +A, and similarly for other variances.

I’ve chosen the confused smiley 😕 to represent likely reactions to this idea, and especially to my complete lack of consideration for elegant syntax, but it’s really just a limited form of a kind variable in PolyKinds—after all, variance is part of the kind of the surrounding type constructor. And, as seen above, we already have subkind-polymorphism with respect to variance, so what’s wrong with parametric kind polymorphism? “Variance variables” are just parametric kind polymorphism, but not as powerful.

Variance bounds

The OptionT example is a simple case; it supports all possible variances, and features a simple relationship: whatever you select for V is exactly the variance used at the various places V appears. It’s easy to break this simple scheme, though.

final case class WrenchT[F[_], A](run: F[Option[A]], wrench: A)

Now the variance of the A position isn’t strictly the variance of F, as it was with OptionT; it can only be covariant or invariant, due to the wrench being in covariant position.
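
In today’s Scala you can pick the two legal corners by hand, which shows the constraint directly; the contravariant corner is the one the wrench rules out (the names are made up):

final case class WrenchTInv[F[_], A](run: F[Option[A]], wrench: A)
final case class WrenchTCov[F[+_], +A](run: F[Option[A]], wrench: A)
// final case class WrenchTCon[F[-_], -A](run: F[Option[A]], wrench: A)
//   does not compile: contravariant type A occurs in covariant position (the wrench)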

Well, we have variables over something with a conformance relation, so let’s add bounds!

final case class WrenchT[😕V >: +, F[V😕_], V😕A]

And these bounds themselves could be determined by other variance variables, &c, but I don’t want to dwell on that because the complications aren’t over yet.

No easy unification

final case class Compose[F[_], G[_], A](run: F[G[A]])

This type complicates things even further! There are numerous possibilities based on the variances of F and G.

F    G    A
Inv  Inv  Inv
Inv  +    Inv
Inv  -    Inv
Inv  👻   👻
+    +    +
+    -    -
+    👻   👻
-    -    +
-    👻   👻
👻   👻   👻

(The order of F and G doesn’t matter, so I’ve left out the reverses.) So the variance of A’s position is the result of multiplying the variances of F and G; the multiplication table is just above. Guess we need a notation for that.

[😕FV, 😕GV, 😕V >: FV × GV, F[FV😕_], G[GV😕_], V😕A]

The bound here only means that V is in variance-covariant position, but bounded by FV × GV.
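
To make × concrete, here is a value-level sketch of the four variances and their product, mirroring the table above; Variance, times, and the rest are my own names, ordinary code rather than proposed syntax.

sealed trait Variance
case object Inv     extends Variance   // invariant: the top
case object Cov     extends Variance   // +
case object Contra  extends Variance   // -
case object Phantom extends Variance   // 👻: the bottom

def times(f: Variance, g: Variance): Variance = (f, g) match {
  case (Phantom, _) | (_, Phantom) => Phantom
  case (Inv, _)     | (_, Inv)     => Inv
  case (Cov, x)                    => x        // + is the identity
  case (Contra, Cov)               => Contra
  case (Contra, Contra)            => Cov      // - × - = +
}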

Another wrench

We can privilege F a little bit to make things more interesting.

final case class ComposeWr[F[_], G[_], A](run: F[G[A]], fa: F[A])

F×G          F    A
Inv          +,-  Inv
Inv,+,-,👻   Inv  Inv
+,👻         +    +
-            +    Inv
-,👻         -    -
+            -    Inv
👻           👻   👻

Here’s another level to the function that determines the A-position variance: it’s now lub(F×G, F), where lub is the least upper bound, the most specific variance that still holds both arguments as subvariances.
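
Continuing the value-level sketch from the previous section, here is a lub matching that description, with Inv at the top of the diamond and 👻 at the bottom:

def lub(a: Variance, b: Variance): Variance = (a, b) match {
  case (x, y) if x == y => x
  case (Phantom, x)     => x
  case (x, Phantom)     => x
  case _                => Inv   // + against -, or Inv against anything
}

// the A-position variance of ComposeWr, per the table above:
def composeWrA(f: Variance, g: Variance): Variance = lub(times(f, g), f)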

Variance families? Really?

I hope you guessed where I was going when I said “function”: the rules determining the lower bound on the A-position variance can be specified by the programmer with a variance-level function—a mapping where the arguments and results are variances—or a variance family. Again, I don’t think this is terribly novel; it’s just a kind family, but more restricted.

You’d want it closed and total, and I can see the Haskell now.

variance family LUBTimes a b where
  LUBTimes - + = -
  LUBTimes 👻 Inv = 👻
  

That’s in Haskell only because I’m not sure where to begin making up the Scala syntax.

There are four variances

When I started looking for ways to describe the variances of the datatypes I’ve shown you, I noticed the relationship between variance and kinds, converted the problem to a kind-level problem, and started thinking of solutions in terms of PolyKinds. That’s where variance variables come from, and everything else follows from those.

However, I think I’ve made a mistake. Not with variance variables themselves, mind you, nor the variance-conformance already built in to Scala. But, to deal with the problems that arise with only these, I’ve hypothesized tools—described above—that are way too powerful. They work on open domains of unbounded complexity—the kinds of types—and there are only four variances.

There are two reasons I think there must be a better approach.

First, there are finitely many variance families for a given sort.

Second, there are only so many ways to put variables in positions of variance. You get argument and result position of methods, within other type constructors, and that’s it. Things are simple enough that, discounting desire for working GADTs (which will be discussed in a later post, “Choosing variance for a phantom type”), it is always possible to infer which of the four variances a type parameter ought to have, in a first-order variance situation.

A “good” polyvariance constraint system

Since there are only four variances, and only so many ways to combine them, it might be possible to design something more refined and suited for the task than the too-powerful variance families.

It might be that there are only a few useful variance relations, like × and lub, and a good solution would be to supply these relations along with an expression model to combine them. Or maybe not. Instead of hypothesizing further, I’ll say what I think a “good” system would look like.

  1. It must be writable. Just as it is desirable to write a stronger type role than the inferred one in GHC Haskell ≥ 7.8, there are very common reasons to want a more general variance than the one that would be inferred. So the convenience of writing out the rule explicitly matters a great deal.
  2. It must be checkable. For variance variables, that means every possible variance you can choose puts every type parameter only in positions consistent with its variance. For example, our fully generalized OptionT always places A only in positions matching the variance of the F type constructor.

    We can just check every possible variance—up to four for each variable—but I think this is the wrong way to go. We don’t just enumerate over every possible type to check type parameters—that would take forever—we have a systematic way to check exactly one time, with skolemization. Variance is simpler—it should be an easier problem.

  3. Not a requirement—but it ought to be inferrable. In the same way that skolemization gives us a path to automatic generalization, if there is a similar way to do quantified variance checking, it should be possible to use the output of that decision procedure to determine the relationships between and bounds on variance variables.

    How that decision is expressed is another question.

Variance & GHC type roles

It might seem that Haskell, having no subtyping, would not care about this problem. But GHC 7.8 type roles are similar enough to variances; the main difference is that Scala variance is about the congruence/liftability of the strong conformance relation, while type roles are about the congruence/liftability of the weak type equality/“coercibility” relation.

nominal:          a ~ b  ⇒ f a ~ f b
representational: a ~w b ⇒ f a ~w f b
phantom:          f a ~w f b

This is pretty useful from a practical, performance-minded perspective, but the problem is

newtype MaybeT m a = MaybeT (m (Maybe a))

there is no way to describe the role of the a parameter in the most general way. It’s stuck at nominal, even if m’s parameter is representational.

Just as the integration of variance and higher kinds in Scala is incomplete without something like the polyvariance system I’ve been describing, Haskell’s type roles are not fully integrated with higher kinds.

I hope if one of these language communities finds a good solution, it is adopted by the other posthaste. The Haskell community is attempting to tackle these problems with roles; perhaps Scala can profit from its innovations. Much of the prose on the GHC matter can be profitably read by replacing “role” with “variance” where it appears. For example, this should sound familiar.

This design incorporates roles into kinds. It solves the exact problems here, but at great cost: because roles are attached to kinds, we have to choose a type’s roles in the wrong place. For example, consider the Monad class. Should the parameter m have type */R -> *, requiring all monads to take representational arguments, or should it have type */N -> *, disallowing GND if join is in the Monad class? We’re stuck with a different set of problems.

Subtyping or higher kinds?

As things stand, you will have a little trouble combining heavy use of subtyping and of higher kinds in the same system.

I’m not saying for certain that it comes down to one or the other. In Scalaz, we weakened support for subtyping to have better support for higher kinds, because its users typically do not want to use subtyping. However, this preference doesn’t generalize to the Scala community at large. This was only a real concern for monad transformers; most Scalaz constructs, for now, have fine subtyping support.

My suggestion is that you should favor higher kinds; they’re a more powerful abstraction mechanism, and ultimately easier to understand, than subtyping; they also happen to be less buggy in Scala. If you must use subtyping, be warned: it’s much more complex than it first seems.

Further reading

Tuesday, October 13, 2015

The uninteresting monoids of certain monads

Suppose there is some structure from which arises a monad. Let’s call one Sem.

data Sem a = ... -- doesn't matter

In the spirit of defining every typeclass instance you can think of—a spirit that I share, believe me—you discover a monoid, and suggest that it be included with Sem.

instance ??? => Monoid (Sem a) where
  -- definition here

But then, you are surprised to encounter pessimism and waffling, from me!

I’m so skeptical of your monoid because it is “common”; many monoids simply fall out of numerous monads, to greater or lesser degree, but that doesn’t make them “good” monoids. Having rediscovered a common, uninteresting monoid, you need to provide more justification of why it should be “the” monoid for this data type.

The lifted monoid

Every applicative functor gives rise to a monoid that lifts its argument’s monoid.

instance Monoid a => Monoid (Sem a) where
  mempty = pure mempty
  mappend = liftA2 mappend

This is “the” monoid for (->) r and Maybe. It is decidedly not the monoid for []. For in that universe,

> [Sum 2] `mappend` [Sum 3, Sum 7]
[Sum 5, Sum 9]
> [Sum 42] `mappend` []
[]

Maybe your reaction is “but that’s not a legal monoid!” Sure it is. Its mappend is based on combination, just as Applicative []’s <*> is. And, in the example above, the left and right identity is [Sum 0], not [].

It’s just not the monoid you’re used to.

Moreover, it isn’t quite right for Maybe! The constraint generalizes to Semigroup a. It is an unfortunate accident of history that the constraint on Haskell Maybe’s monoid is also Monoid, rather than Semigroup.

Even the choice for (->) r makes many people unhappy, though we’re not quite ready to explore the reason for that.

So, what makes you think this is a good choice for Sem? It’s not enough justification that it can be written; that is always the case. There must be something that makes Sem like (->) r or Maybe, and not like [].

The MonadPlus monoid

To be entirely modern, this would be the Alternative monoid. Despite the possibilities for equivocation, this monoid is just as good as any other.

Simply: every Alternative (a subclass of Applicative and a superclass of the more well-known MonadPlus) gives rise to a monoid that is universal over the argument, no Monoid constraint required.

-- supposing Alternative Sem,
instance Monoid (Sem a) where
  mempty = empty
  mappend = (<|>)

You would not be surprised at this if you had prepared by reading the haddock for Alternative: “a monoid on applicative functors”, it says.

[] is Alternative, and indeed this is the monoid of choice for []. But Maybe is also Alternative. Why is this one good for [], but not Maybe? Let’s take a peek through the looking glass.

> Just 1 `mappend` Just 4
Just 1
> Nothing `mappend` Just 3
Just 3

I happen to agree with the monoid of choice for Maybe. But I’m sure many have been surprised it’s not “just take the leftmost Just, or give Nothing”.

Except where phantom Const-style functors are involved, the two preceding monoids always have incompatible behavior. One sums the underlying values, the other never touches them, only rearranging them. So, if both are available to Sem, then to define a monoid we must give up at least one of these.

Alternatively, we could put off the decision until someone comes up with a convincing argument for “the” monoid.

The category endomorphism monoid

This monoid hasn’t let the lack of a pithy name handicap it; despite the stunning blow of losing the prized (->) to the lifted monoid (the commit), this one probably has even more fans eager for a rematch today than it did back then.

I’m referring to this one, still thought of as “the” monoid for (->) by some.

instance Monoid (a -> a) where
  mempty = id
  mappend = (.)

The elegance of this kind of “summing” of functions is undeniable. Moreover, it applies to every Category, not just (->). Even more, it works for anything sufficiently Category-ish, such as ReaderT.

instance Monad m => Monoid (ReaderT a m a) where
  mempty = ask
  ReaderT f `mappend` ReaderT g =
    ReaderT $ f <=< g

Its fatal flaw is that twin appearance of a; it requires FlexibleInstances, so can’t be written in portable Haskell 2010. As such, it will probably remain in the minor leagues of newtypes like Endo.

Moreover, should you discover it for Sem, its applicability to any category-ish thing should still give you pause.

The burden of proof

In Haskell, hacking until it compiles is a great way to work. It is tempting to rely on its conclusions in ever more cases, once you have discovered its effectiveness. However, in the cases above, it is very easy to be led astray by the facile promises of the typechecker.

Introducing one of these monoids is risky. It precludes the later introduction of the “right” monoid for a datatype, for want of compatibility. If you really must offer one of these monoids as “the” monoid for a datatype, the responsibility falls to you: demonstrate that this is a good monoid, not just an easy one.

Sunday, March 2, 2014

Encountering the people of Free Software

Over my time as a programmer, I have grown in the practice mostly by way of contact with the free software community. However, for the first 8 years of this time, that contact was entirely mediated by Internet communication, for my hometown did not feature a free software community to speak of.

So, instead, I learned what these people, mostly distributed among the other continents, were like by way of their mailing list messages, IRC chats, wiki edits, and committed patches. This is a fine way to become acquainted with the hats people wear for the benefit of the projects they're involved with, but isn't a way to observe what they are really like.


About face


Then, a few years ago, I moved to Boston. Well-known for being steeped in history, Boston is also the geographic heart of free software, as the home of the Free Software Foundation. Here, too, is the FSF's annual LibrePlanet conference, a policy conference accompanied by a strong lineup of technical content.

I first attended LibrePlanet in 2012. There, after a decade of forming an idea in my head of what these people were like, I could finally test that idea against real-life examples.


Oddity


Richard Stallman (rms), the founder and leader, both in spirit and in practice, of free software, has long since excised non-free software from his life. If he cannot use a website without using non-free software, he will not use that website. If he can't open a document you send him without using a non-free program to open it, he will ask you to send it in a different format, or otherwise simply not read it. If he cannot use a newer computer with only free software on it, he will use an older computer instead. Because people keep asking him to do these things, this is an ongoing effort. This is well-known about him.

So here was the surprise: my fellow attendees had all followed rms's example, with varying success. They traded tips on the freedom-respecting aspects of this or that hardware, yet admitted those areas where they hadn't yet been able to cut out non-free software.


Little things


There was no grand philosophical reason for this, no essential disagreement with rms's philosophy in play. It was just life. Perhaps they had a spouse who simply would not do without this non-free video streaming service. Perhaps they had friends with whom contact over that non-free messaging service was the foundation of the community. Perhaps they would like to work from home, albeit over some other non-free corporate network connector, in case they get snowed in.

Or maybe they simply haven't found the time. Maybe they tried once, failed, and haven't had the opportunity to try again. There are many demands on people; they deal with them as best as they can.

I should have realized this, and I should have known it from what rms himself had said.
"I hesitate to exaggerate the importance of this little puddle of freedom," he says. "Because the more well-known and conventional areas of working for freedom and a better society are tremendously important. I wouldn't say that free software is as important as they are. It's the responsibility I undertook, because it dropped in my lap and I saw a way I could do something about it…"


Try!


For all of these compromises, though, there was still the sense that these compromises are not the end of the story. Maybe free software isn't (practically) better sometimes. Maybe there are compromises that could ease up as the situation changes. Or maybe some inconvenience will be worth the trouble in the long run; after all, that practically inferior software probably won't get better without users.

People are perfectly capable, on their own, of following Milton Friedman's method of entering gain or loss of freedom on the appropriate side of the pros-and-cons list when making such choices: when you have little, the loss of a little means a lot. Why, then, look to the example of rms, or those small ranks of others who have also cut out all non-free software from their lives?

The people of free software don't necessarily believe that rms's goal is reachable within our lifetimes. I think that what people respond to is his clear, clearly stated, and continually adapting ideas of how the world could be better, never mind the occasional bout of eerie prescience. Maybe we will never get there. Does that mean people shouldn't set a lofty goal for making a better world, and spend a bit of time pushing the real one towards it?

Thursday, August 1, 2013

Rebasing makes collaboration harder

Thanks to certain version control systems making these operations too attractive, history rewriting of published revisions, e.g. rebase and squashed merge, is currently quite popular in free software projects.  What does the git-rebase manpage, otherwise an advocate of the practice, have to say about that?

Rebasing (or any other form of rewriting) a branch that others have based work on is a bad idea: anyone downstream of it is forced to manually fix their history.

The manpage goes on to describe, essentially, cascading rebase.  I will not discuss further here why it is a bad idea.

So, let us suppose you wish to follow git-rebase's advice, and you wish to alter history you have made available to others, perhaps in a branch in a public repository.  The qualifying question becomes: "has anyone based work on this history I am rewriting?"

There are four ways in which you might answer this question.

  1. Someone has based work on your commits; rewriting history is a bad idea.
  2. Someone may have or might yet base work on your commits; rewriting history is a bad idea.
  3. It's unlikely that someone has based work on your commits so you can dismiss the possibility; the manpage's advice does not apply.
  4. It is not possible that someone has based or will yet base work on your commits; the manpage's advice does not apply.

If you have truly met the requirement above and made the revisions available to others, you can only choose #4 if you have some kind of logging of revision fetches, and check this logging beforehand; this almost never applies, so it is not interesting here.  Note: it is not enough to check other public repositories; someone might be writing commits locally to be pushed later as you consider this question.  Perhaps someone is shy about sharing experiments until they're a little further along.

Now that we must accept it is possible someone has based changes on yours, even if you have dismissed it as unlikely, let's look at this from the perspective of another developer who wishes to build further revisions on yours.  The relevant question here is "should I base changes on my fellow developer's work?"  For which these are reasonable answers.

  1. You know someone has built changes on your history and will therefore not rewrite history, wanting to follow the manpage's advice.  It is safe for me to build on it.
  2. You assume someone might build changes on your history, and will not rewrite it for the same reason as with #1.  It is safe for me to build on it.
  3. You've dismissed the possibility of someone like me building on your history, and might rebase or squash, so it is not safe for me to build on it.

I have defined these answers to align with the earlier set, and wish to specifically address #3.  By answering #3 to the prior question, you have reinforced the very circumstances you might think you are only predicting.  In other words, by assuming no one will wish to collaborate on your change, you have created the circumstances by which no one can safely collaborate on your change.  It is a self-fulfilling prophecy that reinforces the tendency to keep collaboration unsafe on your next feature branch.

In this situation, it becomes very hard to break this cycle where each feature branch is "owned" by one person.  I believe this is strongly contrary to the spirits of distributed version control, free software, and public development methodology.

In circumstances with no history rewriting, the very interesting possibility of ad hoc cross-synchronizing via merges between two or more developers on a single feature branch arises.  You work on your parts, others work on other parts, you merge from each other when ready.  Given the above, it is not surprising to me that so many developers have not experienced this very satisfying way of working together, even as our modern tools with sophisticated merge systems enable it.

Sunday, June 23, 2013

Fake Theorems for Free

This article documents an element of Scalaz design that I practice, because I believe it to be an element of Scalaz design principles, and quite a good one at that. It explains why Functor[Set] was removed yet Foldable[Set] remains. More broadly, it explains why a functor may be considered invalid even though “it doesn't break any laws”. It is a useful discipline to apply to your own Scala code.
  1. Do not use runtime type information in an unconstrained way.
  2. Corollary: do not use Object#equals or Object#hashCode in unconstrained contexts, because that would count as #1.
The simplest way to state and remember it is “for all means for all”. Another, if you prefer, might be “if I don't know anything about it, I can't look at it”.

We accept this constraint for the same reason that we accept the constraint of referential transparency: it gives us powerful reasoning tools about our code. Specifically, it gives us our free theorems back.

Madness

Let's consider a basic signature.

     def const[A](a: A, a2: A): A

With the principle intact, there are only two total, referentially transparent functions that we can write with this signature.

     def const[A](a: A, a2: A): A = a
     
     def const2[A](a: A, a2: A): A = a2

That is, we can return one or the other argument. We can't “look at” either A, so we can't do tests on them or combine them in some way.

Much of Scalaz is minimally documented because it is easy enough to apply this approach to more complex functions once you have a bit of practice. Many Scalaz functions are the only function you could write with such a signature.
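
Another tiny example of the same discipline; this one isn’t from Scalaz, but under the principle it too has exactly one total, referentially transparent implementation.

     def andThen[A, B, C](f: A => B, g: B => C): A => C =
       a => g(f(a))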

Now, let us imagine that we permit the unconstrained use of runtime type information. Here are functions that are referentially transparent, which you will find insane anyway.

     def const3[A](a: A, a2: A): A = (a, a2) match {
       case (s: Int, s2: Int) => if (s < s2) a else a2
       case _ => a
     }
     
     def const4[A](a: A, a2: A): A =
       if (a.## < a2.##) a else a2

Now, look at what we have lost! If the lowly const can be driven mad this way, imagine what could happen with fmap. One of our most powerful tools for reasoning about generic code has been lost. No, this kind of thing is not meant for the realm of Scalaz.

Missing theorems

For completeness's sake, let us see the list of theorems from Theorems for free!, figure 1 on page 3, for which I can think of a counterexample, still meeting the stated function's signature, if we violate the principle explained above.

In each case, I have assumed all other functions but the one in question have their standard definitions as explained on the previous page of the paper. I recommend having the paper open to page 3 to follow along. They are restated in fake-Scala because you might like that. Let lift(f) be (_ map f), or f* as written in the paper.

head[X]: List[X] => X
a compose head = head compose lift(a)

tail[X]: List[X] => List[X]
lift(a) compose tail = tail compose lift(a)

++[X]: (List[X], List[X]) => List[X]
lift(a)(xs ++ ys) = lift(a)(xs) ++ lift(a)(ys)

zip[X, Y]: ((List[X], List[Y])) => List[(X, Y)]
lift(a product b) compose zip = zip compose (lift(a) product lift(b))

filter[X]: (X => Boolean) => List[X] => List[X]
lift(a) compose filter(p compose a) = filter(p) compose a

sort[X]: ((X, X) => Boolean) => List[X] => List[X]
wherever for all x, y in A, (x < y) = (a(x) <' a(y)), also lift(a) compose sort(<) = sort(<') compose lift(a)

fold[X, Y]: ((X, Y) => Y, Y) => List[X] => Y
wherever for all x in A, y in B, b(x + y) = a(x) * b(y) and b(u) = u', also b compose fold(+, u) = fold(*, u') compose lift(a)

Object#equals and Object#hashCode are sufficient to break all these free theorems, though many creative obliterations via type tests of the const3 kind also exist.

By contrast, here are the ones which I think are preserved. I hesitate to positively state that they are, just because there are so many possibilities opened up by runtime type information.

fst[X, Y]: ((X, Y)) => X
a compose fst = fst compose (a product b)

snd[X, Y]: ((X, Y)) => Y
b compose snd = snd compose (a product b)

I[X]: X => X
a compose I = I compose a

K[X, Y]: (X, Y) => X
a(K(x, y)) = K(a(x), b(y))

Here is a useful excerpt from the paper itself, section 3.4 “Polymorphic equality”, of which you may consider this entire article a mere expansion.

... polymorphic equality cannot be defined in the pure polymorphic lambda calculus. Polymorphic equality can be added as a constant, but then parametricity will not hold (for terms containing the constant).

This suggests that we need some way to tame the power of the polymorphic equality operator. Exactly such taming is provided by the eqtype variables of Standard ML [Mil87], or more generally by the type classes of Haskell [HW88, WB89].

Compromise

Scalaz has some tools to help deal with things here. The Equal typeclass contains the equalIsNatural method as runtime evidence that Object#equals is expected to work; this evidence is used by other parts of Scalaz, and available to you.
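
For instance, here is a sketch, written from memory of the Scalaz 7 API, of consulting that evidence before reaching for universal equality; fastContains is a made-up name.

     import scalaz.Equal

     def fastContains[A: Equal](xs: List[A], a: A): Boolean =
       if (Equal[A].equalIsNatural) xs contains a       // Object#equals agrees with Equal here
       else xs exists (Equal[A].equal(_, a))            // otherwise, stick to the typeclass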

Scalaz also provides

     implicit def setMonoid[A]: Monoid[Set[A]]

Relative to Functor, this is more or less harmless, because Monoid isn't so powerful; once you have Monoid evidence in hand, it doesn't “carry” any parametric polymorphism the way most Scalaz typeclasses do. It provides no means to actually fill sets, and the semigroup is also symmetric, so it seems unlikely that there is a way to write Monoid-generic code that can use this definition to break things.

More typical are definitions like

     implicit def setOrder[A: Order]: Order[Set[A]]

These may use Object#equals, but are constrained in a way that ensures it is safe to do so, just as implied in the quote above.

Insofar as “compromise” characterizes the above choices, I think Scalaz's position in the space of possibilities is quite good. However, I would be loath to see any further relaxing of the principles I have described here, and I hope you would be too.

About Me


I am S11001001, s11 for short.  Programmer and Free Software enthusiast.

Search for my name to see more stuff about me; no one shares my real name, and no one shares my username, though I can't understand why.