The Scalaz API has a large number of functions, most of which have several things in common:
- They have type parameters.
- Their implementations are very short, typically one line.
- They do not have Scaladoc.
While they don’t have Scaladoc, their types are printed in the Scaladoc pages. However, with some practice, you can get a quick, accurate understanding of each function by taking into account two things:
- One of the goals of the Scalazzi Safe Scala Subset is that “Types Are Documentation”. For example, one of the rules is “no side-effects”. Side effects, by their nature, do not appear in type signatures of methods that create them; by banishing them, we eliminate that source of untyped program behavior.
- Heavy use of type parameters, and minimization of concrete information in each type signature, amplifies parametricity. For Scalaz API functions, there are far fewer factors to consider when working out what a type “means” than you must consider when reading typical Scala libraries.
Adding Scaladoc may be valuable to many users. However, with these factors, the value is greatly diminished, and it’s difficult to justify the development cost of comprehensive Scaladoc when we are already encouraging users to think in terms of “Types Are Documentation”.
That doesn’t mean there aren’t some simple “tricks” and rules-of-thumb that you can learn to accelerate this process.
Values of a type variable cannot arise from thin air.
Suppose that a function has a type variable B
, and this type
variable is used in the return type, which might be something like
(List[Int], List[B])
. There is no way for the function to make up
its own B
s, so all B
s appearing in that list must have come
from the arguments. If there were no B
s in the arguments, then that
List[B]
must be empty.
This rule, “must come from the arguments”, is significantly more
flexible than it sounds, while allowing the caller to preserve data
integrity in a type-checked way. For example, the signature def m[A,
B](xs: List[A], f: A => B): List[B]
does not require the result list
to be empty, because the arguments supply a “way to get B
s”:
- Take any element from the
xs
list. - Call
f
on that element.
Since the body of m
has no other source of A
s than xs
, it’s a
fact of the return value that “all elements must have come from
calling f
on elements of xs
”.
In this way, you can use a type signature to read the “flow” of data
from its source in the arguments, through other arguments, to its
destination in the result. When you get used to this, all sorts of
useful corollaries naturally arise. For example, the above fact means
also “if xs
is empty, then the result list must be empty”.
If you don’t know anything about it, you can’t look at it.
Within the body of def m[A, B](xs: List[A], f: A => B): List[B]
,
there is exactly one special operation available for values of type
A
—they can be converted to B
via calling f
—and there are none
for values of type B
. You can’t compare them, so you can’t sort
either the input list or the output list. You can’t take their hash
code. You can’t add two A
s together to get a combined A
, and the
same goes for B
s. You could do something polymorphic like put an A
and B
in an (A, B)
tuple, but that tells you nothing about those
values.
In Scalazzi-safe programming, we supply information about types via
typeclasses. For example, if we declared A: Order
, there’s a way to
compare A
s. If we declared B: Semigroup
, there’s a way to append
B
s. When thinking about what a typeclass constraint means for the
“flow of data” in a function, you can think of a typeclass constraint
as supplying the primitives of that typeclass as extra arguments to
the function. (That is, after all, how Scala implements typeclass
constraints.) For example, A: Order
means that there’s an extra
argument (A, A) => Ordering
, where Ordering
is the usual
three-valued result of comparison, and the function is guaranteed to
follow some special properties (the laws of the typeclass). B:
Semigroup
means that there’s an extra argument (B, B) => B
, also
guaranteed to follow its own special properties.
Naturally, if there are no typeclass constraints on a type variable, no such extra arguments are supplied; only the “ordinary” arguments provide capabilities for working with the type. Surprisingly, for all that is made of the importance of Scalaz’s typeclasses, this is by far the most common case.
You can’t “just crash”.
Consider the type signature def lp[L, R](i: L, s: L => Either[L, R]):
R
. The flow of data says that the result must have come from s
returning a Right
, and that s
call’s argument must be either i
,
or a Left
from a prior call to s
. Moreover, you can safely expect
lp
to use the L
produced by s
each time after the first, rather
than trying i
or another previous L
again; in functional
programming, it would be absurd to try s(i)
again and expect that it
might return a Right
, when you already know it previously returned a
Left
.
In particular, there’s no allowance for “timing out” or “duplicate L
detection”. Timing out (in terms of a maximum number of s
calls)
would require a different return type, like Option[R]
. Duplicate
detection would require a constraint like L: Equal
at minimum.
“It’s impossible to implement this signature” is not a reason to “implement” it by crashing; it’s a reason to not have a function with that signature at all. Writing correct type signatures is part of writing a correct type-safe program. When type signatures are only declared for functions that are possible to implement, and reading those type signatures can tell you what they are honestly doing, they start to become true machine-checked documentation.
The utilities of each typeclass are bound by the basics of that typeclass.
Many of the most useful functions in the Scalaz API are defined as
typeclass utilities, so they can be most easily understood by keeping
in mind those basics as you read the utility functions. So utilities
in Functor
must have been gotten by map
ping, utilities in
Foldable
must have been gotten by folding, and so on.
For example, consider the utility under typeclass Functor[F[_]]
,
def void[A](fa: F[A]): F[Unit]
. What does this function do? If you
guess based on its name void
, or a poor analogy like “F
is some
kind of collection” (a common mistake when first approaching
Functor
), you might conclude something like “it must return an empty
F
” or “it must return an F
with a single Unit
in
it”. Unfortunately, these “intuitive” answers are not only wrong, they
don’t make sense.
Instead, think about void
like this: def void[F[_]: Functor, A](fa:
F[A]): F[Unit]
. Just as with the Order
and Semigroup
constraints
described above, that constraint manifests as its primitives; in this
case, “a map
function for F
s”. That leaves only one possibility,
which happens to be the true behavior of void
: the result is gotten
by calling map
on fa
, with _ => ()
as the function argument.
Practice on the simple cases.
The above are a lot of words to describe a thinking process that is very fast in practice. With practice, it’s much faster to understand what a function does by reading its type than by reading a documentation comment. At the very least, a documentation comment should only be considered as secondary advice to the primary source, the type.
The functions that are easiest to understand purely with types are,
unfortunately, the most likely to be fully documented. That makes
relying solely on documentation comments more tempting, but this is a
mistake as a Scalaz newcomer. If you practice reading only the types
for simple functions like void
, you’ll gain important practice for
quickly understanding much more complex functions using the same
techniques.
No comments:
Post a Comment