Showing posts with label proofs.

Sunday, April 1, 2012

“Inference” is “proof assistance”.

As a Lisper, I'm used to just writing out some code that more or less expresses what I mean, then trying it out. I don't want to mess around with proving that what I'm writing makes sense. Moreover, when I first started learning Haskell, I didn't know nearly enough to begin proving what I meant.

Fortunately, Haskell can figure out all the argument types and return types of very complex functions.¹ You know how to write a function that applies a function to each element of a list, and then combines all the resulting lists into one, so just write it:

    concatMap _ []     = []
    concatMap f (x:xs) = f x ++ concatMap f xs
    -- inferred type of concatMap is (t → [a]) → [t] → [a]

That's pretty nice; I didn't have to specify a single type, and Haskell figured out not only the types of the arguments and results, one of which was itself a function type, but also the precise level of polymorphism appropriate. A frequent mistake when trying to guess this type by hand is writing (a → [a]) → [a] → [a], which is not as general as the inferred version above. It will compile, but it unnecessarily (and often fatally) restricts users of the concatMap function.²
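
To see the restriction concretely, here is the same definition with the guessed signature attached; the name concatMapRestricted is mine, to keep it apart from the inferred version:

    -- The overly specific signature from above, attached to the same
    -- definition. It compiles, but callers can now only produce lists
    -- of the element type they started with.
    concatMapRestricted :: (a -> [a]) -> [a] -> [a]
    concatMapRestricted _ []     = []
    concatMapRestricted f (x:xs) = f x ++ concatMapRestricted f xs

    -- concatMap show [1, 2, 3] is fine at the inferred type, yielding
    -- "123", but concatMapRestricted show [1, 2, 3] is a type error:
    -- show turns a number into a String, not back into a list of numbers.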

So inference helps you prove things, often avoiding or explaining generalizations you didn't think of. It is a “proof assistant”. It greatly aids in refactoring, if you continue to rely on it, as you have to fix your proof in fewer places when you change the rules. It's an absolutely vital tool for entry into the typeful world, when you frequently know how to write a function, but not how to express its true, maximally polymorphic type.

Unfortunately, the “proof assistant” can't figure out absolutely everything. Moreover, the semantics of the language and type system affect how much the assistant can prove.³



Footnotes

[1] Haskell is really figuring out “the types of very complex expressions”, but that doesn't sound quite so good, despite being quite a bit better.

[2] As it happens, we've restricted ourselves to lists, though concatMap actually makes sense for all monads; that's a result of our using the list-only operations in our implementation, not a failure of the type inference. In Haskell terms, concatMapM f xs = join (liftM f xs), which is inferred as Monad m ⇒ (t → m a) → m t → m a. Other generalizations are possible, and you can accidentally lock them down in exactly the same way as with concatMap, including down to our original inferred type.
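
For the curious, the footnote's generalization compiles exactly as written; a minimal sketch:

    import Control.Monad (join, liftM)

    -- GHC infers concatMapM :: Monad m ⇒ (t → m a) → m t → m a
    concatMapM f xs = join (liftM f xs)

    -- At the list monad it coincides with concatMap:
    -- concatMapM (\x -> [x, x]) [1, 2]  ==  [1, 1, 2, 2]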

[3] To be overly general, features like mutability and looking up functions by name on a receiver type make inference harder. These are the rule in Simula-inspired object-oriented environments, making inference harder in OOP, and conversely, easier in functional environments. For example, in the Java expression x.getBlah(y), you can't infer anything about getBlah until you know the type of x. But in Haskell, getBlah has one known type, albeit perhaps polymorphic or constrained by typeclasses, which can be used to infer things about x and y without necessarily knowing anything else about them.
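
A small Haskell sketch of that last point, with a Blah class invented to mirror the Java example:

    -- A class invented to mirror x.getBlah(y) from the Java example:
    class Blah b where
      getBlah :: b -> Int -> String

    -- getBlah's single known type drives inference: with no other
    -- knowledge of x or y, GHC infers
    --   example :: Blah b ⇒ b → Int → String
    example x y = getBlah x y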

Thursday, March 29, 2012

With a type system, whether you can write a program depends on whether you can prove its correctness.

The trouble with bad type systems is that you have to use “escape hatches” pretty frequently. In C and many of its derivatives, these take the form of casts, which, in type terms, are like saying “I can't prove this even a little bit, but trust me, it's true. Try it a few times and you'll see.” The traditional cast system is very broad, as it must be to account for the shortcomings of lesser type systems. No one wants to reimplement the collection classes for every element type they might like to use.

After being so impressed by the power of Haskell's inference, many people next discover that they can't put values of different types into lists.¹ Well, that's not quite true; you can always chain 2-tuples together, but that's not what you really mean. Well, what did you mean?

Oh, well, I meant that sometimes the elements of my list are integers, and sometimes they're strings.

Okay, no problem. Put the integer type and string type together as alternatives in a single type:

    data IntOrString = AnInt Int | AString String
    [AnInt 42, AString "hi"] -- has type [IntOrString]

No, I meant alternating integers and strings, one string for each integer.

Well, why didn't you say so!

    data IntAndString = IntAndString Int String
    [IntAndString 42 "hi"] -- has type [IntAndString]

You can't just stick integers and strings together in a list without proving something about what you mean. To write any program typefully, you have to prove that it sort of makes sense. In the former example, you really meant that each element could be either one, and you have to prove that it's one of those, and not, say, a map, before you can put it in the list. In the latter example, you have to prove that you have exactly one string for each integer that you put into the list.

This permits a more analytical approach to programming than can occur in latent-typed systems. Let's say you had the [IntOrString], and you realized it was wrong and changed it to [IntAndString] in one module. You have two other modules that are trying to use the lists, and now they don't work, because you didn't prove that you had one string for each integer in those modules. Now nothing loads, and you have to jump around for a bit fixing your proofs until you can test again. This separates the task into two phases: one where you're only thinking about and testing the proofs, and the other where you're thinking about and testing the behavior.
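
A sketch of the situation, with the module split left implicit:

    -- The defining module, after the change:
    data IntAndString = IntAndString Int String

    items :: [IntAndString]
    items = [IntAndString 42 "hi"]

    -- A consumer still written against [IntOrString] no longer
    -- compiles; its proof has to be redone before anything runs:
    -- describe (AnInt n)   = show n  -- AnInt no longer exists
    -- describe (AString s) = s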

I don't think this is an unqualified improvement over the latent-typed situation. On one hand, breaking tasks down into little bits is the foundation of human software development. Moreover, this example clearly helped us to clarify what we meant about the structure we were building. On the other hand, sometimes I prefer to focus on getting one module right on both type and runtime levels before moving on to the next. This is harder to do with most typeful programming languages, as type errors naturally cascade, and types both described and inferred usually influence runtime behavior.


[1] Haskell also has escape hatches, but using them is viewed rather like gratuitous use of eval is by Lispers. Whereas most C and Java programs use casts, very few Haskell programs use Data.Dynamic, just as very few OCaml programs use Obj.magic.
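
To make the comparison concrete, a minimal sketch of the Data.Dynamic escape hatch:

    import Data.Dynamic (Dynamic, toDyn, fromDynamic)

    -- A heterogeneous list, with the element types erased:
    mixed :: [Dynamic]
    mixed = [toDyn (42 :: Int), toDyn "hi"]

    -- Recovering a value means re-checking its type at runtime;
    -- a wrong guess yields Nothing rather than a compile error.
    firstAsInt :: Maybe Int
    firstAsInt = fromDynamic (head mixed)     -- Just 42

    firstAsString :: Maybe String
    firstAsString = fromDynamic (head mixed)  -- Nothing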

Sunday, March 18, 2012

Type systems prove things about your programs.

Typeful programmers don't seem to see types the way Common Lispers or C programmers do. Instead, they see type systems as a way to partially prove the correctness of a program.

Thinking of types as proof systems offers several useful intuitive analogies. It is not just more rigorous testing; it's in a whole other universe from testing. Scientifically, we don't accept a hypothesis merely because it seemed to work when we tried it with a few different things; we demand proof, which is qualitatively different.

In many cases, proving your program's correctness may take more time than simply writing the program would. (This is a complex issue I will return to later.) Even in those cases, however, the benefit of the rigor of proof may outweigh the cost.

It may be that, when you are writing a program, you can intuitively see that it is somewhat correct, but you don't know how to prove it. Well, if you can't prove it, should you, or anyone else, really trust your intuition? If you do prove it, no trust is necessary: all you need to do is trust the proving framework, the type system, to disallow nonsense. The fact that the type system passed your program is then enough to satisfy everyone that the proof is sound, so far as it goes.

Finally, type inference is “proof assistance”. Typeful compilers come with both proof checkers and proof assistants, but these, again, deserve separate posts.

Accordingly, I will often use the language of “proof” when talking about using type systems.