Failed Experiments.

a Mr. Fleming wishes to study bugs in smelly cheese; a Polish woman wishes to sift through tons of Central African ore to find minute quantities of a substance she says will glow in the dark; a Mr. Kepler wants to hear the songs the planets sing.

Tuesday, October 13, 2015

The uninteresting monoids of certain monads

Suppose there is some structure from which arises a monad. Let’s call one `Sem`.

```data Sem a = ... -- doesn't matter
```

In the spirit of defining every typeclass instance you can think of—a spirit that I share, believe me—you discover a monoid, and suggest that it be included with `Sem`.

```instance ??? => Monoid (Sem a) where
-- definition here
```

But then, you are surprised to encounter pessimism and waffling, from me!

I’m so skeptical of your monoid because it is “common”; many monoids simply fall out of numerous monads, to greater or lesser degree, but that doesn’t make them “good” monoids. Having rediscovered a common, uninteresting monoid, you need to provide more justification of why it should be “the” monoid for this data type.

The lifted monoid

Every applicative functor gives rise to a monoid that lifts their arguments’ monoid.

```instance Monoid a => Monoid (Sem a) where
mempty = pure mempty
mappend = liftA2 mappend
```

This is “the” monoid for `(->) r` and `Maybe`. It is decidedly not the monoid for `[]`. For in that universe,

```> [Sum 2] `mappend` [Sum 3, Sum 7]
[Sum 5, Sum 9]
> [Sum 42] `mappend` []
[]
```

Maybe you reaction is “but that’s not a legal monoid!” Sure it is. The `mappend` is based on combination, just as `Applicative []`’s `<*>` is. And, in the example above, the left and right identity is `[Sum 0]`, not `[]`.

It’s just not the monoid you’re used to.

Moreover, it isn’t quite right for `Maybe`! The constraint generalizes to `Semigroup a`. It is an unfortunate accident of history that the constraint on Haskell `Maybe`’s monoid is also `Monoid`.

Even the choice for `(->) r` makes many people unhappy, though we’re not quite ready to explore the reason for that.

So, what makes you think this is a good choice for `Sem`? It’s not enough justification that it can be written; that is always the case. There must be something that makes `Sem` like `(->) r` or `Maybe`, and not like `[]`.

The `MonadPlus` monoid

To be entirely modern, this would be the `Alternative` monoid. Despite the possibilities for equivocation, this monoid is just as good as any other.

Simply: every `Alternative` (a subclass of `Applicative` and a superclass of the more well-known `MonadPlus`) gives rise to a monoid that is universal over the argument, no `Monoid` constraint required.

```-- supposing Alternative Sem,
instance Monoid (Sem a) where
mempty = empty
mappend = (<|>)
```

You would not be surprised at this having prepared by reading the haddock for `Alternative`: “a monoid on applicative functors”, it says.

`[]` is `Alternative`, and indeed this is the monoid of choice for `[]`. But `Maybe` is also `Alternative`. Why is this one good for `[]`, but not `Maybe`? Let’s take a peek through the looking glass.

```> Just 1 `mappend` Just 4
Just 1
> Nothing `mappend` Just 3
Just 3
```

I happen to agree with the monoid of choice for `Maybe`. But I’m sure many have been surprised it’s not “just take the leftmost `Just`, or give `Nothing`”.

Except where phantom `Const`-style functors are involved, the two preceding monoids always have incompatible behavior. One sums the underlying values, the other never touchs them, only rearranging them. So, if both are available to `Sem`, to define a monoid, we must give up at least one of these.

Alternatively, we could put off the decision until someone comes up with a convincing argument for “the” monoid.

The category endomorphism monoid

This monoid hasn’t let the lack of a pithy name handicap it; despite the stunning blow of losing the prized `(->)` to the lifted monoid (the commit), this one probably has even more fans eager for a rematch today than it did back then.

I’m referring to this one, still thought of as “the” monoid for `(->)` by some.

```instance Monoid (a -> a) where
mempty = id
mappend = (.)
```

The elegance of this kind of “summing” of functions is undeniable. Moreover, it applies to every `Category`, not just `(->)`. Even more, it works for anything sufficiently `Category`-ish, such as `ReaderT`.

```instance Monad m => Monoid (ReaderT a m a) where
```

Its fatal flaw is that twin appearance of `a`; it requires `FlexibleInstances`, so can’t be written in portable Haskell 2010. As such, it will probably remain in the minor leagues of newtypes like `Endo`.

Moreover, should you discover it for `Sem`, its applicability to any category-ish thing should still give you pause.

The burden of proof

In Haskell, hacking until it compiles is a great way to work. It is tempting to rely on its conclusions in ever more cases, once you have discovered its effectiveness. However, in the cases above, it is very easy to be led astray by the facile promises of the typechecker.

Introducing one of these monoids is risky. It precludes the later introduction of the “right” monoid for a datatype, for want of compatibility. If you really must offer one of these monoids as “the” monoid for a datatype, the responsibility falls to you: demonstrate that this is a good monoid, not just an easy one.

Saturday, March 1, 2014

Encountering the people of Free Software

Over my time as a programmer, I have grown in the practice mostly by way of contact with the free software community. However, for the first 8 years of this time, that contact was entirely mediated by Internet communication, for my hometown did not feature a free software community to speak of.

So, instead, I learned what these people, mostly distributed among the other continents, were like by way of their mailing list messages, IRC chats, wiki edits, and committed patches. This is a fine way to become acquainted with the hats people wear for the benefit of the projects they're involved with, but isn't really a way to observe what they are really like.

Then, a few years ago, I moved to Boston. Well-known for being steeped in history, Boston is the geographic heart of free software, being also the home of the Free Software Foundation. Here also is the FSF's annual LibrePlanet conference, a policy conference accompanied by a strong lineup of technical content.

I first attended LibrePlanet in 2012. There, after a decade of forming an idea in my head of what these people were like, I could finally test that idea against real-life examples.

Oddity

Richard Stallman (rms), the founder and leader, both in spirit and in practice, of free software has long since excised non-free software from his life. If he cannot use a website without using non-free software, he will not use that website. If he can't open a document you send him without using a non-free program to open it, he will ask you to send it in a different format, or otherwise simply not read it. If he cannot use a newer computer with only free software on it, he will use an older computer instead. Because people keep asking him to do these things, this is an ongoing effort. This is well-known about him.

So here was the surprise: my fellow attendees had all followed rms's example, with varying success. They traded tips on the freedom-respecting aspects of this or that hardware, yet admitted those areas where they hadn't yet been able to cut out non-free software.

Little things

There was no grand philosophical reason for this, no essential disagreement with rms's philosophy in play. It was just life. Perhaps they had a spouse who simply would not do without this non-free video streaming service. Perhaps they had friends with whom contact over that non-free messaging service was the foundation of the community. Perhaps they would like to work from home, albeit over some other non-free corporate network connector, in case they get snowed in.

Or maybe they simply haven't found the time. Maybe they tried once, failed, and haven't had the opportunity to try again. There are many demands on people; they deal with them as best as they can.

I should have realized this, and I should have known it from what rms himself had said.
"I hesitate to exaggerate the importance of this little puddle of freedom," he says. "Because the more well-known and conventional areas of working for freedom and a better society are tremendously important. I wouldn't say that free software is as important as they are. It's the responsibility I undertook, because it dropped in my lap and I saw a way I could do something about it…"

Try!

For all of these compromises, though, there was still the sense that these compromises are not the end of the story. Maybe free software isn't (practically) better sometimes. Maybe there are compromises that could ease up as the situation changes. Or maybe some inconvenience will be worth the trouble in the long run; after all, that practically inferior software probably won't get better without users.

People are perfectly capable, on our own, of following Milton Friedman's method of entering gain or loss of freedom on the appropriate side of the pros-and-cons list when making such choices: when you have little, the loss of a little means a lot. Why, then, look to the example of rms, or those small ranks of others who have also cut out all non-free software from their lives?

The people of free software don't necessarily believe that rms's goal is reachable within our lifetimes. I think that what people respond to is his clear, clearly stated, and continually adapting ideas of how the world could be better, never mind the occasional bout of eerie prescience. Maybe we will never get there. Does that mean people shouldn't set a lofty goal for making a better world, and spend a bit of time pushing the real one towards it?

Thursday, August 1, 2013

Rebasing makes collaboration harder

Thanks to certain version control systems' making these operations too attractive, history rewriting, e.g. rebase and squashed merge, of published revisions is currently quite popular in free software projects.  What does the git-rebase manpage, an otherwise advocate of the practice, have to say about that?

Rebasing (or any other form of rewriting) a branch that others have based work on is a bad idea: anyone downstream of it is forced to manually fix their history.

The manpage goes on to describe, essentially, cascading rebase.  I will not discuss further here why it is a bad idea.

So, let us suppose you wish to follow git-rebase's advice, and you wish to alter history you have made available to others, perhaps in a branch in a public repository.  The qualifying question becomes: "has anyone based work on this history I am rewriting?"

There are four ways in which you might answer this question.

1. Someone has based work on your commits; rewriting history is a bad idea.
2. Someone may have or might yet base work on your commits; rewriting history is a bad idea.
3. It's unlikely that someone has based work on your commits so you can dismiss the possibility; the manpage's advice does not apply.
4. It is not possible that someone has or will yet based work on your commits; the manpage's advice does not apply.

If you have truly met the requirement above and made the revisions available to others, you can only choose #4 if you have some kind of logging of revision fetches, and check this logging beforehand; this almost never applies, so it is not interesting here.  Note: it is not enough to check other public repositories; someone might be writing commits locally to be pushed later as you consider this question.  Perhaps someone is shy about sharing experiments until they're a little further along.

Now that we must accept it is possible someone has based changes on yours, even if you have dismissed it as unlikely, let's look at this from the perspective of another developer who wishes to build further revisions on yours.  The relevant question here is "should I base changes on my fellow developer's work?"  For which these are reasonable answers.

1. You know someone has built changes on your history and will therefore not rewrite history, wanting to follow the manpage's advice.  It is safe for me to build on it.
2. You assume someone might build changes on your history, and will not rewrite it for the same reason as with #1.  It is safe for me to build on it.
3. You've dismissed the possibility of someone like me building on your history, and might rebase or squash, so it is not safe for me to build on it.

I have defined these answers to align with the earlier set, and wish to specifically address #3.  By answering #3 to the prior question, you have reinforced the very circumstances you might think you are only predicting.  In other words, by assuming no one will wish to collaborate on your change, you have created the circumstances by which no one can safely collaborate on your change.  It is a self-fulfilling prophecy that reinforces the tendency to keep collaboration unsafe on your next feature branch.

In this situation, it becomes very hard to break this cycle where each feature branch is "owned" by one person.  I believe this is strongly contrary to the spirits of distributed version control, free software, and public development methodology.

In circumstances with no history rewriting, the very interesting possibility of ad hoc cross-synchronizing via merges between two or more developers on a single feature branch arises.  You work on your parts, others work on other parts, you merge from each other when ready.  Given the above, it is not surprising to me that so many developers have not experienced this very satisfying way of working together, even as our modern tools with sophisticated merge systems enable it.

Sunday, June 23, 2013

This article documents an element of Scalaz design that I practice, because I believe it to be an element of Scalaz design principles, and quite a good one at that. It explains why `Functor[Set]` was removed yet `Foldable[Set]` remains. More broadly, it explains why a functor may be considered invalid even though “it doesn't break any laws”. It is a useful discipline to apply to your own Scala code.
1. Do not use runtime type information in an unconstrained way.
2. Corollary: do not use `Object#equals` or `Object#hashCode` in unconstrained contexts, because that would count as #1.
The simplest way to state and remember it is “for all means for all”. Another, if you prefer, might be “if I don't know anything about it, I can't look at it”.

We accept this constraint for the same reason that we accept the constraint of referential transparency: it gives us powerful reasoning tools about our code. Specifically, it gives us our free theorems back.

Let's consider a basic signature.

```     def const[A](a: A, a2: A): A
```

With the principle intact, there are only two total, referentially transparent functions that we can write with this signature.

```     def const[A](a: A, a2: A): A = a

def const2[A](a: A, a2: A): A = a2
```

That is, we can return one or the other argument. We can't “look at” either `A`, so we can't do tests on them or combine them in some way.

Much of Scalaz is minimally documented because it is easy enough to apply this approach to more complex functions once you have a bit of practice. Many Scalaz functions are the only function you could write with such a signature.

Now, let us imagine that we permit the unconstrained use of runtime type information. Here are functions that are referentially transparent, which you will find insane anyway.

```     def const3[A](a: A, a2: A): A = (a, a2) match {
case (s: Int, s2: Int) => if (s < s2) a else a2
case _ => a
}

def const4[A](a: A, a2: A): A =
if (a.## < a2.##) a else a2
```

Now, look at what we have lost! If the lowly `const` can be driven mad this way, imagine what could happen with `fmap`. One of our most powerful tools for reasoning about generic code has been lost. No, this kind of thing is not meant for the realm of Scalaz.

Missing theorems

For completeness's sake, let us see the list of theorems from Theorems for free!, figure 1 on page 3, for which I can think of a counterexample, still meeting the stated function's signature, if we violate the above explained principle.

In each case, I have assumed all other functions but the one in question have their standard definitions as explained on the previous page of the paper. I recommend having the paper open to page 3 to follow along. They are restated in fake-Scala because you might like that. Let `lift(f)` be `(_ map f)`, or `f*` as written in the paper.

`head[X]: List[X] => X`
`a compose head` = `head compose lift(a)`

`tail[X]: List[X] => List[X]`
`lift(a) compose tail` = `tail compose lift(a)`

`++[X]: (List[X], List[X]) => List[X]`
`lift(a)(xs ++ ys)` = `lift(a)(xs) ++ lift(a)(ys)`

`zip[X, Y]: ((List[X], List[Y])) => List[(X, Y)]`
`lift(a product b) compose zip` = ```zip compose (lift(a) product lift(b))```

`filter[X]: (X => Boolean) => List[X] => List[X]`
`lift(a) compose filter(p compose a)` = `filter(p) compose a`

`sort[X]: ((X, X) => Boolean) => List[X] => List[X]`
wherever for all `x`, `y` in `A` , ```(x < y) = (a(x) <' a(y))```, also `lift(a) compose sort(<)` = ```sort(<') compose lift(a)```

`fold[X, Y]: ((X, Y) => Y, Y) => List[X] => Y`
wherever for all `x` in `A`, `y` in `B`, ```b(x + y) = a(x) * b(y)``` and `b(u) = u'`, also ```b compose fold(+, u)``` = `fold(*, u') compose lift(a)`

`Object#equals` and `Object#hashCode` are sufficient to break all these free theorems, though many creative obliterations via type tests of the `const3` kind also exist.

By contrast, here are the ones which I think are preserved. I hesitate to positively state that they are, just because there are so many possibilities opened up by runtime type information.

`fst[X, Y]: ((X, Y)) => X`
`a compose fst` = `fst compose (a product b)`

`snd[X, Y]: ((X, Y)) => Y`
`b compose snd` = `snd compose (a product b)`

`I[X]: X => X`
`a compose I` = `I compose a`

`K[X, Y]: (X, Y) => X`
`a(K(x, y))` = `K(a(x), a(y))`

Here is a useful excerpt from the paper itself, section 3.4 “Polymorphic equality”, of which you may consider this entire article a mere expansion.

... polymorphic equality cannot be defined in the pure polymorphic lambda calculus. Polymorphic equality can be added as a constant, but then parametricity will not hold (for terms containing the constant).

This suggests that we need some way to tame the power of the polymorphic equality operator. Exactly such taming is provided by the eqtype variables of Standard ML [Mil87], or more generally by the type classes of Haskell [HW88, WB89].

Compromise

Scalaz has some tools to help deal with things here. The `Equal` typeclass contains the `equalIsNatural` method as runtime evidence that `Object#equals` is expected to work; this evidence is used by other parts of Scalaz, and available to you.

Scalaz also provides

```     implicit def setMonoid[A]: Monoid[Set[A]]
```

Relative to `Functor`, this is more or less harmless, because `Monoid` isn't so powerful; once you have `Monoid` evidence in hand, it doesn't “carry” any parametric polymorphism the way most Scalaz typeclasses do. It provides no means to actually fill sets, and the semigroup is also symmetric, so it seems unlikely that there is a way to write Monoid-generic code that can use this definition to break things.

More typical are definitions like

```     implicit def setOrder[A: Order]: Order[Set[A]]
```

Which may use `Object#equals`, but is constrained in a way that they can be sure it's safe to do so, just as implied in the quote above.

Insofar as “compromise” characterizes the above choices, I think Scalaz's position in the space of possibilities is quite good. However, I would be loath to see any further relaxing of the principles I have described here, and I hope you would be too.

Mistakes are part of history

And sometimes, later, they turn out not to be mistakes at all.

Has this never happened to you?  For my part, sometimes I am mistaken, and sometimes I am even mistaken about what I am mistaken about.  So it is worthwhile to keep records of failed experiments.

You can always delete information later, as a log-viewing tool might, but you can never get it back if you just deleted it in the first place.

Please consider this, git lovers, before performing your next rebase or squashed merge.

(My favorite VC quote courtesy ddaa of GNU Arch land, of all places)

Tuesday, April 3, 2012

Some type systems are much better than others.

As a shortcut, I previously complained about C and Java's type systems leaving bad impressions. Let's broaden that to the realm of type system research, avoiding useless discussion of the relative merits of this or that programming language as a whole. Some type systems let you express more of the set of valid programs, or different subsets. More out of the theoretical realm, but because we are human, different inference engines are better at figuring out types in different situations.

There is a nice graphic in GĂ¶del, Escher, Bach that illustrates this issue: it displays the universe split into the true and the false, and elaborate structures covered with appendages illustrating “proof”. Of all valid programs¹, many are “typeable”, meaning that we can prove that they are sound, type-wise, with a particular type system; conversely, most invalid programs are not typeable.

However, you might imagine that the fingers stretch into the invalid space on occasion, and they don't cover the valid space entirely either; for any given type system, there are valid programs that are rejected, and there are invalid programs that are accepted.

Not so simple math

The goal of type system research is to both accept more valid programs and reject more invalid programs. Let's consider these three programs in three different languages, which implement one step of a divide-and-conquer list reduction strategy.
```     def pm(x, y, z):                # Python
return x + y - z

pm x y z = x + y - z            -- Haskell
-- inferred as Num a ⇒ a → a → a → a

int pm(int x, int y, int z) {   /* C */
return x + y - z;
}
```
First, the expressive problem in C is easy to spot: you can only add and subtract `int`s. You have to reimplement this for each number type, even if the code stays the same.

The Haskell expressive problem is a little more obscure, but more obvious given the inferred type: all the numbers must be of the same type. You can see this with some partial application:
```     pm (1::Int)      -- Int → Int → Int
pm (1::Integer)  -- Integer → Integer → Integer
pm (1::Float)    -- Float → Float → Float
```
The Python version works on a numeric tower: once you introduce a float, it sticks. This may be good for you, or it may be bad. If you are reducing some data with `pm` and a float sneaks in there, you won't see it until you get the final output. So with dynamic promotion, `pm` works with everything, even some things you probably don't want.

There are adjustments you can make to get the Haskell version to be more general, but this all depends on the kind of generalization you mean. This is a matter of continued innovation, even in the standard library; many commonly used libraries provide alternative, more generic versions of built-in `Prelude` functions, such as `fmap` for functors, one of many, many generalizations that work with lists, in this case replacing `map`.

Footnotes

[1] Use whatever meaning you like for “valid”, but if you want something more formal, perhaps “it terminates” will suffice.

Sunday, April 1, 2012

“Inference” is “proof assistance”.

As a Lisper, I'm used to just writing out some code that more or less expresses what I mean, then trying it out. I don't want to mess around with proving that what I'm writing makes sense. Moreover, when I first started learning Haskell, I didn't know nearly enough to begin proving what I meant.

Fortunately, Haskell can figure out all the argument types and return types of very complex functions.¹ You know how to write a function that applies a function to each element of a list, and then combines all the resulting lists into one, so just write it:

`    concatMap _ [] = []    concatMap f (x:xs) = f x ++ concatMap f xs    -- inferred type of concatMap is (t → [a]) → [t] → [a] `

That's pretty nice; I didn't have to specify a single type, and Haskell figured out not only the types of the arguments and results, one of which was itself a function type, but figured out the precise level of polymorphism appropriate. A frequent mistake when trying to guess this type is writing `(a → [a]) → [a] → [a]`, which is not as general as the inferred version above. It will compile, but unnecessarily (and often fatally) restrict users of the `concatMap` function.²

So inference helps you prove things, often avoiding or explaining generalizations you didn't think of. It is a “proof assistant”. It greatly aids in refactoring, if you continue to rely on it, as you have to fix your proof in fewer places when you change the rules. It's an absolutely vital tool for entry into the typeful world, when you frequently know how to write a function, but not how to express its true, maximally polymorphic type.

Unfortunately, the “proof assistant” can't figure out absolutely everything. Moreover, the semantics of the language and type system affect how much the assistant can prove.³

Footnotes

[1] Haskell is really figuring out “the types of very complex expressions”, but that doesn't sound quite so good, despite being quite a bit better.

[2] As it happens, we've restricted ourselves to lists, where `concatMap` actually makes sense for all monads, but that's a result of our using the list-only operations for our implementation, not a failure of the type inference. In Haskell terms, `concatMapM f xs = join (liftM f xs)`, which is inferred as a `Monad m ⇒ (t → m a) → m t → m a`. Other generalizations are possible, and you can accidentally lock them down in exactly the same way as `concatMap`, including to our original inferred type.

[3] To be overly general, features like mutability and looking up functions by name on a receiver type make inference harder. These are the rule in Simula-inspired object-oriented environments, making inference harder in OOP, and conversely, easier in functional environments. For example, in the Java expression `x.getBlah(y)`, you can't infer anything about `getBlah` until you know the type of `x`. But in Haskell, `getBlah` has one known type, albeit perhaps polymorphic or constrained by typeclasses, which can be used to infer things about `x` and `y` without necessarily knowing anything else about them.