Failed Experiments
a Mr. Fleming wishes to study bugs in smelly cheese; a Polish woman wishes to sift through tons of Central African ore to find minute quantities of a substance she says will glow in the dark; a Mr. Kepler wants to hear the songs the planets sing.
Stephen Compall (http://www.blogger.com/profile/13346366374080232972)

Reading Scalaz API Functions (Principles 5, Scalaz Files), 2020-06-11
<p>The Scalaz API has a large number of functions, most of which have
several things in common:</p>
<ol>
<li>They have type parameters.</li>
<li>Their implementations are very short, typically one line.</li>
<li>They do not have Scaladoc.</li>
</ol>
<p>While they don’t have Scaladoc, their types are printed in the
Scaladoc pages. With some practice, you can get a quick,
<strong>accurate</strong> understanding of each function from those types alone
by taking into account two things:</p>
<ol>
<li>One of the goals of the Scalazzi Safe Scala Subset is that “<strong>Types
Are Documentation</strong>”. For example, one of the rules is “no
side-effects”. Side effects, by their nature, do not appear in type
signatures of methods that create them; by banishing them, we
eliminate that source of untyped program behavior.</li>
<li>Heavy use of type parameters, and minimization of concrete
information in each type signature, <em>amplifies parametricity</em>. For
Scalaz API functions, there are far fewer factors to consider when
working out what a type “means” than you must consider when reading
typical Scala libraries.</li>
</ol>
<p>Adding Scaladoc may be valuable to many users. However, with these
factors, the value is greatly diminished, and it’s difficult to
justify the development cost of comprehensive Scaladoc when we are
already encouraging users to think in terms of “Types Are
Documentation”.</p>
<p>That doesn’t mean there aren’t some simple “tricks” and rules-of-thumb
that you can learn to accelerate this process.</p>
<h2 id="markdown-header-values-of-a-type-variable-cannot-arise-from-thin-air">Values of a type variable cannot arise from thin air.</h2>
<p>Suppose that a function has a type variable <code>B</code>, and this type
variable is used in the return type, which might be something like
<code>(List[Int], List[B])</code>. There is no way for the function to make up
its own <code>B</code>s, so <em>all</em> <code>B</code>s appearing in that list <em>must have come
from the arguments</em>. If there were no <code>B</code>s in the arguments, then that
<code>List[B]</code> <em>must be empty</em>.</p>
<p>This rule, “must come from the arguments”, is significantly more
flexible than it sounds, while allowing the caller to preserve data
integrity in a type-checked way. For example, the signature <code>def m[A,
B](xs: List[A], f: A => B): List[B]</code> does not require the result list
to be empty, because the arguments supply a “way to get <code>B</code>s”:</p>
<ol>
<li>Take any element from the <code>xs</code> list.</li>
<li>Call <code>f</code> on that element.</li>
</ol>
<p>Since the body of <code>m</code> has no other source of <code>A</code>s than <code>xs</code>, it’s a
<em>fact</em> of the return value that “all elements must have come from
calling <code>f</code> on elements of <code>xs</code>”.</p>
<p>In this way, you can use a type signature to read the “flow” of data
from its source in the arguments, through other arguments, to its
destination in the result. When you get used to this, all sorts of
useful corollaries naturally arise. For example, the above fact means
also “if <code>xs</code> is empty, then the result list must be empty”.</p>
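<p>As a minimal sketch of this reading, here are a few bodies that the
signature permits. The names <code>FlowOfData</code>, <code>mReversed</code>, and
<code>noSource</code> are made up for illustration; only the signature of
<code>m</code> comes from the text above.</p>

```scala
object FlowOfData {
  // The signature from the text. The body has no source of Bs other than
  // applying f to elements of xs.
  def m[A, B](xs: List[A], f: A => B): List[B] = xs.map(f)

  // The type also permits reordering (or dropping, or duplicating) elements,
  // but every B in the result still comes from f applied to some element of xs.
  def mReversed[A, B](xs: List[A], f: A => B): List[B] = xs.reverse.map(f)

  // With no B and no way to make a B among the arguments, the only
  // Scalazzi-safe result is the empty list.
  def noSource[A, B](xs: List[A]): List[B] = Nil
}
```

<p>Whichever body is chosen, an empty <code>xs</code> forces an empty result,
which is the corollary described above.</p>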
<h2 id="markdown-header-if-you-dont-know-anything-about-it-you-cant-look-at-it">If you don’t know anything about it, you can’t look at it.</h2>
<p>Within the body of <code>def m[A, B](xs: List[A], f: A => B): List[B]</code>,
there is exactly one special operation available for values of type
<code>A</code>—they can be converted to <code>B</code> via calling <code>f</code>—and there are <em>none</em>
for values of type <code>B</code>. You can’t compare them, so you can’t sort
either the input list or the output list. You can’t take their hash
code. You can’t add two <code>A</code>s together to get a combined <code>A</code>, and the
same goes for <code>B</code>s. You could do something polymorphic like put an <code>A</code>
and <code>B</code> in an <code>(A, B)</code> tuple, but that tells you nothing about those
values.</p>
<p>In Scalazzi-safe programming, we supply information about types via
typeclasses. For example, if we declared <code>A: Order</code>, there’s a way to
compare <code>A</code>s. If we declared <code>B: Semigroup</code>, there’s a way to append
<code>B</code>s. When thinking about what a typeclass constraint means for the
“flow of data” in a function, you can think of a typeclass constraint
as supplying the primitives of that typeclass as extra arguments to
the function. (That is, after all, how Scala implements typeclass
constraints.) For example, <code>A: Order</code> means that there’s an extra
argument <code>(A, A) => Ordering</code>, where <code>Ordering</code> is the usual
three-valued result of comparison, and the function is guaranteed to
follow some special properties (the laws of the typeclass). <code>B:
Semigroup</code> means that there’s an extra argument <code>(B, B) => B</code>, also
guaranteed to follow its own special properties.</p>
<p>Naturally, if there are no typeclass constraints on a type variable,
no such extra arguments are supplied; only the “ordinary” arguments
provide capabilities for working with the type. Surprisingly, for all
that is made of the importance of Scalaz’s typeclasses, this is by far
the most common case.</p>
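<p>To make the “constraint as extra argument” reading concrete, here is a
small sketch. The <code>Semigroup</code> trait below is a hypothetical
stand-in written for this example, not Scalaz’s definition, and
<code>combineAll</code> is an invented name.</p>

```scala
object ConstraintAsArgument {
  // Hypothetical minimal typeclass; Scalaz's real Semigroup differs in detail.
  trait Semigroup[B] {
    def append(x: B, y: B): B
  }

  object Semigroup {
    // One instance for Int, chosen arbitrarily for the example.
    implicit val intAddition: Semigroup[Int] = new Semigroup[Int] {
      def append(x: Int, y: Int): Int = x + y
    }
  }

  // Written with a context bound.
  def combineAll[B: Semigroup](first: B, rest: List[B]): B =
    rest.foldLeft(first)((x, y) => implicitly[Semigroup[B]].append(x, y))

  // The same function with the constraint spelled out as the extra
  // (implicit) argument that supplies the (B, B) => B primitive.
  def combineAllExplicit[B](first: B, rest: List[B])(implicit sg: Semigroup[B]): B =
    rest.foldLeft(first)((x, y) => sg.append(x, y))
}
```

<p>Reading either signature, the only ways to produce a <code>B</code> are
to return one of the arguments or to append two of them.</p>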
<h2 id="markdown-header-you-cant-just-crash">You can’t “just crash”.</h2>
<p>Consider the type signature <code>def lp[L, R](i: L, s: L => Either[L, R]):
R</code>. The flow of data says that the result must have come from <code>s</code>
returning a <code>Right</code>, and that <code>s</code> call’s argument must be either <code>i</code>,
or a <code>Left</code> from a prior call to <code>s</code>. Moreover, you can safely expect
<code>lp</code> to use the <code>L</code> produced by <code>s</code> each time after the first, rather
than trying <code>i</code> or another previous <code>L</code> again; in functional
programming, it would be absurd to try <code>s(i)</code> again and expect that it
might return a <code>Right</code>, when you already know it previously returned a
<code>Left</code>.</p>
<p>In particular, there’s no allowance for “timing out” or “duplicate <code>L</code>
detection”. Timing out (in terms of a maximum number of <code>s</code> calls)
would require a different return type, like <code>Option[R]</code>. Duplicate
detection would require a constraint like <code>L: Equal</code> at minimum.</p>
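<p>A sketch of the implementation this signature points to might look
like the following. The wrapper object <code>Loop</code> is only there to
make the example self-contained. Note that it never terminates if
<code>s</code> never returns a <code>Right</code>, which is the general
recursion left unbanned in the Scalazzi list.</p>

```scala
import scala.annotation.tailrec

object Loop {
  // Keep feeding the latest L back into s until it yields a Right.
  @tailrec
  def lp[L, R](i: L, s: L => Either[L, R]): R =
    s(i) match {
      case Left(next) => lp(next, s) // the only sensible next argument
      case Right(r)   => r           // the only source of an R
    }
}
```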
<p>“It’s impossible to implement this signature” is not a reason to
“implement” it by crashing; it’s a reason to not have a function with
that signature at all. Writing correct type signatures is part of
writing a correct type-safe program. When type signatures are only
declared for functions that are possible to implement, and reading
those type signatures can tell you what they are honestly doing, they
start to become true machine-checked documentation.</p>
<h2 id="markdown-header-the-utilities-of-each-typeclass-are-bound-by-the-basics-of-that-typeclass">The utilities of each typeclass are bound by the basics of that typeclass.</h2>
<p>Many of the most useful functions in the Scalaz API are defined as
typeclass utilities, so they can be most easily understood by keeping
in mind those basics as you read the utility functions. So utilities
in <code>Functor</code> must have been gotten by <code>map</code>ping, utilities in
<code>Foldable</code> must have been gotten by folding, and so on.</p>
<p>For example, consider the utility under typeclass <code>Functor[F[_]]</code>,
<code>def void[A](fa: F[A]): F[Unit]</code>. What does this function do? If you
guess based on its name <code>void</code>, or a poor analogy like “<code>F</code> is some
kind of collection” (a common mistake when first approaching
<code>Functor</code>), you might conclude something like “it must return an empty
<code>F</code>” or “it must return an <code>F</code> with a single <code>Unit</code> in
it”. Unfortunately, these “intuitive” answers are not only wrong, they
don’t make sense.</p>
<p>Instead, think about <code>void</code> like this: <code>def void[F[_]: Functor, A](fa:
F[A]): F[Unit]</code>. Just as with the <code>Order</code> and <code>Semigroup</code> constraints
described above, that constraint manifests as its primitives; in this
case, “a <code>map</code> function for <code>F</code>s”. That leaves only one possibility,
which happens to be the true behavior of <code>void</code>: the result is gotten
by calling <code>map</code> on <code>fa</code>, with <code>_ => ()</code> as the function argument.</p>
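<p>Here is that reading as a sketch. The <code>Functor</code> trait is a
hypothetical minimal version written for this example, not Scalaz’s own,
and only a <code>List</code> instance is provided.</p>

```scala
object VoidExample {
  // Hypothetical minimal Functor: its single primitive is map.
  trait Functor[F[_]] {
    def map[A, B](fa: F[A])(f: A => B): F[B]
  }

  object Functor {
    implicit val listFunctor: Functor[List] = new Functor[List] {
      def map[A, B](fa: List[A])(f: A => B): List[B] = fa.map(f)
    }
  }

  // With only map available, the result has to be fa with every A replaced.
  def void[F[_]: Functor, A](fa: F[A]): F[Unit] =
    implicitly[Functor[F]].map(fa)(_ => ())
}
```

<p>For a three-element list, the result is three <code>Unit</code>s: neither
an empty <code>F</code> nor an <code>F</code> with a single <code>Unit</code>.</p>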
<h2 id="markdown-header-practice-on-the-simple-cases">Practice on the simple cases.</h2>
<p>The above are a lot of words to describe a thinking process that is
very fast in practice. With practice, it’s much faster to understand
what a function does by reading its type than by reading a
documentation comment. At the very least, a documentation comment
should only be considered as secondary advice to the primary source,
the type.</p>
<p>The functions that are easiest to understand purely with types are,
unfortunately, the most likely to be fully documented. That makes
relying solely on documentation comments more tempting, but this is a
mistake as a Scalaz newcomer. If you practice reading <em>only</em> the types
for simple functions like <code>void</code>, you’ll gain important practice for
quickly understanding much more complex functions using the same
techniques.</p>

Global typeclass coherence (Principles 3, Scalaz Files), 2020-05-16
<p>In Scalaz, we provide at most one typeclass instance per type
throughout your whole program, and we expect users to preserve that
invariant. For that type, there should be no way to get two
incompatible instances. This is <em>global coherence</em>, and is required
to make Scala programs using implicits for typeclasses understandable
and maintainable.</p>
<p>If you want a different instance for the “same” type when working with
Scalaz, the answer is always to start with a different type first.
Then you can define your incompatible instance for the same
<em>structure</em>, but on the different <em>type</em>, preserving coherence.
Scalaz’s <code>@@</code> type tags or the underlying existential newtype
mechanism are convenient, flexible ways to get these “different”
types.</p>
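<p>As a rough illustration of “start with a different type”, the sketch
below uses a plain wrapper class rather than Scalaz’s <code>@@</code> tags
or its newtype mechanism. The <code>Monoid</code> trait,
<code>Multiplied</code>, and <code>combineAll</code> are all names assumed
for this example.</p>

```scala
object NewtypeExample {
  // Hypothetical minimal Monoid, not Scalaz's definition.
  trait Monoid[A] {
    def zero: A
    def append(x: A, y: A): A
  }

  object Monoid {
    // The one and only instance for Int in this program: addition.
    implicit val intSum: Monoid[Int] = new Monoid[Int] {
      def zero: Int = 0
      def append(x: Int, y: Int): Int = x + y
    }
  }

  // Same structure as Int, but a different type, so it can carry its own instance.
  final case class Multiplied(value: Int)

  object Multiplied {
    implicit val multipliedMonoid: Monoid[Multiplied] = new Monoid[Multiplied] {
      def zero: Multiplied = Multiplied(1)
      def append(x: Multiplied, y: Multiplied): Multiplied =
        Multiplied(x.value * y.value)
    }
  }

  // The instance is determined entirely by the element type A.
  def combineAll[A](xs: List[A])(implicit m: Monoid[A]): A =
    xs.foldLeft(m.zero)((x, y) => m.append(x, y))
}
```

<p>The choice between adding and multiplying is made by the type of the
list’s elements, not by which implicit values happen to be in scope at
the call site.</p>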
<p>It’s not surprising that “new type” as a solution to this problem
comes from Haskell, as Haskell too depends on global coherence. While
we can’t get all the benefits from coherence that Haskell does, what
remains is sufficient to justify this seemingly non-modular rule.</p>
<p>The invisible action of implicits in Scala is a serious problem for
understanding and maintenance if used in an undisciplined manner.
Consider, for example, an incoherent implicit-based design:
scala-library’s <code>ExecutionContext</code>.</p>
<p>At any point in the program, the <code>ExecutionContext</code> resolved depends
on what variables are in scope. It’s unsafe to move any code that
depends on <code>ExecutionContext</code> to some other method, because that set
of variables can change, thus changing the behavior of the program.
And you can’t determine at a glance whether some code depends on
<code>ExecutionContext</code>, so moving <em>any</em> code carries some risk.</p>
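<p>The hazard can be sketched without any concurrency at all. In the
example below, <code>Pool</code> is a hypothetical stand-in for
<code>ExecutionContext</code>, and <code>runsOn</code>,
<code>ModuleA</code>, and <code>ModuleB</code> are invented names.</p>

```scala
object ScopeDependent {
  // Hypothetical stand-in for ExecutionContext: there is no single
  // canonical value of this type, so resolution depends on scope.
  final case class Pool(name: String)

  def runsOn()(implicit pool: Pool): String = pool.name

  object ModuleA {
    implicit val pool: Pool = Pool("io")
    def task: String = runsOn()
  }

  object ModuleB {
    implicit val pool: Pool = Pool("cpu")
    // The same expression as in ModuleA, with a different meaning.
    def task: String = runsOn()
  }
}
```

<p>Moving the expression <code>runsOn()</code> from one module to the other
compiles without complaint and silently changes what it does.</p>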
<p>You can’t add an <code>ec: ExecutionContext</code> argument <em>anywhere</em> without
potentially breaking working code, because it changes that set of
variables. It’s only safe to introduce brand-new methods with that
argument.</p>
<p>If you are refactoring and suddenly get an error about multiple
conflicting <code>ExecutionContext</code>s in scope, you have no help to
determine which is the “right” one; you have to figure it out based on
what the probable intent of the code originally was.
Possibly, none of the options in scope is right.</p>
<p>By contrast, consider <code>Monoid[List[N]]</code>, where <code>N</code> might be some type
variable in scope. The correct instance depends on the <em>type</em>, not
what variables are in scope. So you can add a <code>Semigroup[N]</code>
constraint and know the answer won’t change. You can split up the
method, or move any of its code anywhere else, and know the answer
won’t change. You can even add a <code>Monoid[List[N]]</code> argument to your
function, because you know the caller is required to come up with the
same instance you were working with before.</p>
<p>You can add or delete constraints as required, because they’re always
going to be fulfilled by the same instances. For example, Scalaz’s
sorted map <code>K ==>> V</code> doesn’t carry the <code>K</code> comparator around, because
we can assume it always depends on <code>K</code> and only on <code>K</code>.</p>
<p>If you ever get an error about multiple conflicting instances, you
know there’s always a workable solution: <strong>choose one, because they’re
the same</strong>.</p>
<p>Tools try to help with understanding Scala’s implicit resolution, but
they don’t always work. Global coherence is your greatest ally when
trying to understand what an implicit resolution is doing by hand: you
know that any path to the instance you find is a correct one, and you
can then work out why <em>that</em> isn’t being resolved.</p>
<p>Global coherence also lets Scalaz offer a much simpler API. For
example, the efficient union of sorted maps requires that the ordering
of <code>K</code> used for both is equal. With “local” (read: incoherent)
instances, the only safe way to do this is to define a singleton type
depending on the ordering, and treat this singleton type as a sort of
third type parameter to the map type. If you happen to have built two
maps with the same key type where you didn’t use polymorphism to unify
their use of that singleton type, too bad, you can’t safely union
them. With global coherence, because the instance only depends on
<code>K</code>, the simple two-parameter map type is perfectly sufficient, and
those maps are easy to union.</p>
<p>The “flexibility” of local instances is not worth it given the
constraints of Scala, and Scalaz assumes you won’t be using them when
you use its functionality. Define a newtype instead.</p>

Scalazzi safe Scala subset (Principles 2, Scalaz Files), 2019-07-16
<p>Scalaz is designed for programmers building <em>type-safe</em>, <em>functional</em>
programs. If you program like this, you can start to see very deep
properties of your functions by only reading their types; in other
words, <strong>types become documentation</strong>. This also lets you see how you
can combine your functions in more ways, with greater confidence that
the combination will actually make sense.</p>
<p>But Scala, the language, contains many unsafe features that get in the
way of this method of thinking about your programs. “Scalazzi” means
that you are avoiding or banning these features from your Scala
codebase, thus restoring your ability to use types to discover those
properties.</p>
<ol>
<li><s><code>null</code></s></li>
<li><s>exceptions</s></li>
<li><s>Type-casing (<code>isInstanceOf</code>)</s></li>
<li><s>Type-casting (<code>asInstanceOf</code>)</s></li>
<li><s>Side-effects</s></li>
<li><s><code>equals</code>/<code>toString</code>/<code>hashCode</code></s></li>
<li><s><code>notify</code>/<code>wait</code></s></li>
<li><s><code>classOf</code>/<code>.getClass</code></s></li>
<li>General recursion</li>
</ol>
<p>Here’s an example of how you might use these rules to reduce your
testing requirements.</p>
<p>Suppose that you have this very simple function to return the greater
<code>Int</code>.</p>
<div class="codehilite language-scala"><pre><span></span><span class="k">def</span> <span class="n">maximum</span><span class="o">(</span><span class="n">x</span><span class="k">:</span> <span class="kt">Int</span><span class="o">,</span> <span class="n">y</span><span class="k">:</span> <span class="kt">Int</span><span class="o">)</span><span class="k">:</span> <span class="kt">Int</span> <span class="o">=</span> <span class="k">if</span> <span class="o">(</span><span class="n">x</span> <span class="o">>=</span> <span class="n">y</span><span class="o">)</span> <span class="n">x</span> <span class="k">else</span> <span class="n">y</span>
</pre></div>
<p>(I encourage you to imagine that this is harder than this example;
after all, aren’t your own programs more complicated?) This type
signature says that any <code>Int</code> can be returned; we must test to verify
that this isn’t happening.</p>
<p>Instead of writing a test, we can use parametricity to check that
either <code>x</code> or <code>y</code> is returned, but nothing else, at compile time
instead of test time.</p>
<div class="codehilite language-scala"><pre><span></span><span class="k">def</span> <span class="n">maximum</span><span class="o">[</span><span class="kt">N</span> <span class="k"><:</span> <span class="kt">Int</span><span class="o">](</span><span class="n">x</span><span class="k">:</span> <span class="kt">N</span><span class="o">,</span> <span class="n">y</span><span class="k">:</span> <span class="kt">N</span><span class="o">)</span><span class="k">:</span> <span class="kt">N</span> <span class="o">=</span> <span class="c1">// • • •</span>
</pre></div>
<p>I can read from this <em>type</em> that only <code>x</code> or <code>y</code> can be returned. With
some practice, you’ll start to see more complex facts arising from
types as well.</p>
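<p>The body is elided above. As a sketch, assuming the body of the
earlier non-generic version carries over unchanged, the generic
signature could be filled in like this (the wrapper object is only there
to keep the example self-contained):</p>

```scala
object MaximumExample {
  // N is some unknown subtype of Int, so comparison is available,
  // but the only values of type N in hand are x and y.
  def maximum[N <: Int](x: N, y: N): N = if (x >= y) x else y

  // By contrast, a body such as `x + y` has type Int rather than N,
  // so it would be rejected by the compiler.
}
```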
<p>Unfortunately, Scala has many “features” that let you break this
safety. These features aren’t useful for type-safe functional
programs, so we simply declare them verboten.</p>
<p>Scalaz expects you to follow Scalazzi rules, but is also packed with
features to help you follow them. For example, if you are calling
<code>map</code> on a <code>List</code> and feel like your lambda needs to perform some side
effect, it’s time to look into Scalaz’s <code>Traverse</code> typeclass.</p>

A standard library for principled functional programming in Scala (Principles 1, Scalaz Files), 2019-07-09
<p>The best way to think about “what is Scalaz?” is as a <b>standard
library for functional programming</b>. This goes all the way back to
its creation: the Scalaz project started because there are not enough
facilities in Scala's standard library for convenient, everyday
functional programming, without cheating.</p>
<p>How should this affect your approach to the library? Like a standard
library, <b>you learn the bits and pieces you need, not the whole
thing</b>. There is no must-read book, no must-watch tutorial video, no
must-attend course. Scalaz can be used successfully from day 1 as a
new Scala programmer; as you do not learn every part of the standard
library before starting to use a language, so it goes for Scalaz as
well. All that is required of you is the desire to solve programming
problems in type-safe, functional ways, and the curiosity to learn
about what components others have discovered and how they might
be useful. After all, most pieces of Scalaz were added to it because
somebody was solving a problem, and found a solution they thought
others might consider useful and well-thought-out.</p>

Scala FP: how good an idea now?, 2018-02-17
<p>Ed Kmett’s <a href="https://www.reddit.com/r/haskell/comments/1pjjy5/odersky_the_trouble_with_types_strange_loop_2013/cd3bgcu/#thing_t1_cd3bgcu">reddit comment full of biting commentary on the troubles
of attempting functional programming in
Scala</a>
remains the most concise listing of such problems, and remains mostly
up-to-date over four years after it was written. It covers problems
from the now famous to the less well known. It’s still a useful guide
to what needs to change for Scala to be a great functional programming
language, or conversely, why a functional programmer might want to
avoid Scala.</p>
<p>But not everything is the same as when it was written. Some things
have gotten better. Some can even be eliminated from the list
safely. Some haven’t changed at all.</p>
<p>I’d like to go through each of Kmett’s bullet points, one by one, and
elaborate on what has happened in the ensuing four years since he
posted this comment.</p>
<h2 id="markdown-header-types">Types</h2>
<blockquote>
<p>[1:] If you take any two of the random extensions that have been
thrown into scala and try to use them together, they typically don't
play nice. e.g. Implicits and subtyping don't play nice together.</p>
</blockquote>
<p>This hasn’t really changed. Paul Phillips’s age-old
<a href="https://groups.google.com/d/msg/scala-language/ZE83TvSWpT4/YiwJJLZRmlcJ">“contrarivariance”
thread</a>
about the specific example Kmett uses here might as well have been
written yesterday.</p>
<p>On a positive note, what is good hasn’t really changed, either. The
type soundness of new features still cannot be justified merely
because you can’t think of any ways programs would go wrong were your
idea implemented; you still need positive evidence that your idea
preserves soundness. This is more than can be said for, say,
TypeScript.</p>
<p>On the other hand, we’ve seen a lot of attempts to “solve” these kinds
of feature-compositionality problems by claims like “we don’t want you
to write that kind of code in Scala”. New features like <code>AnyVal</code>
subclasses are still made with the concerns of ill-typed, imperative
programming placed above the concerns of well-typed, functional
programming. Proposals like ADT syntax are likely to support <a href="https://github.com/lampepfl/dotty/issues/1970#issuecomment-279356882">only
those GADT features deemed
interesting</a>
for implementing the standard library, rather than what application
programs might find useful.</p>
<blockquote>
<p>[2:] Type inference works right up until you write anything that
needs it. If you go to write any sort of tricky recursive function,
you know, where inference would be useful, then it stops working.</p>
</blockquote>
<p>Still 100% true.</p>
<blockquote>
<p>[3:] Due to type erasure, its easy to refine a type in a case
expression / pattern match to get something that is a lie.</p>
</blockquote>
<p>I’m not sure why Ed wrote “due to type erasure” here, but the
underlying problems are there. This comment came after the
introduction of “virtpatmat”, which improved things in a lot of ways,
not least with the improved support for GADTs. I’ve noticed some
things get better for GADTs in 2.12, too.</p>
<p>But there are numerous unsound things you can do with pattern
matching, some accompanied by compiler warnings, some not. Most of
these are due to its reliance on <code>Object#equals</code>. Paul Phillips wrote
several bug reports a long time ago about these, and one of the major
ones is fixed: the type consequences of pattern matching <a href="https://github.com/scala/scala/pull/3558">used to
think</a> that <code>Object#equals</code>
returning <code>true</code> implied that the two values were perfect substitutes
for each other. For example, you could use an empty <code>Buffer[A]</code> and an
empty <code>Buffer[B]</code> to derive <code>A = B</code>, even when they’re completely
incompatible types.</p>
<p>This has been fixed, but the very <a href="https://github.com/scala/bug/issues/1503">similar problem with matching
constants</a> has not. I
suspect that it will never be fixed unless pattern matching’s use of
<code>equals</code> is removed entirely.</p>
<blockquote>
<p>[4:] Free theorems aren't.</p>
</blockquote>
<p>In the base Scala language, nothing has changed here. But we’ve tried
to account for this shortcoming with practice. I wrote an <a href="https://failex.blogspot.com/2013/06/fake-theorems-for-free.html">article
elaborating on the free theorems problem in
Scala</a>;
surprise surprise, <code>Object#equals</code> makes another villainous
appearance. Tony Morris popularized the <a href="https://imgur.com/a04WoHn">“Scalazzi safe Scala
subset”</a> through his <a href="https://youtu.be/BtEEZa_Q8Vw">“Parametricity: Types
are Documentation” talk</a>, and since then
“Scalazzi” has become the shorthand for this style of Scala
programming. (If you’ve heard “Scalazzi” before, this is what it’s
about: free theorems.) Tools like
<a href="http://www.wartremover.org/">Wartremover</a> have arisen to mechanically
enforce parts of the Scalazzi rules (among other rules), and they’re
well worth using.</p>
<p>So the situation in the Scala language hasn’t changed at all. The
situation in Scala <em>practice</em> has gotten better, as long as you’re
aware of it and compensating in your projects with tools like
Wartremover.</p>
<h2 id="markdown-header-collections-and-covariant-things">Collections and covariant things</h2>
<blockquote>
<p>[5:] Since you can pass any dictionary anywhere to any implicit you
can't rely on the canonicity of anything. If you make a Map or Set
using an ordering, you can't be sure you'll get the same ordering
back when you come to do a lookup later. This means you can't safely
do hedge unions/merges in their containers. It also means that much
of scalaz is lying to itself and hoping you'll pass back the same
dictionary every time.</p>
</blockquote>
<p>I don’t want to cover this in detail, because Ed’s already gone into
it in his talk <a href="https://youtu.be/hIZxTQP1ifo">“Typeclasses vs the
world”</a>. I’ve <a href="https://github.com/scalaz/scalaz/issues/1236#issuecomment-289584935">also
written</a>
about Scalaz’s “lying to itself” approach (a fair characterization),
and why we think it’s the best possible choice for Scalaz users in
Scala as it’s defined today.</p>
<p>You can think of this as the “coherence vs local instances” argument,
too, and Ed is describing here how Scala fails as a substrate for the
coherence approach. But he’s not saying that, as a result, coherence
is the wrong choice. Since we think that, despite the potential for
error, coherence is still the best choice for a Scala library, that
should tell you what we think about the alternative: that with local
instances, the potential for error is <em>still greater</em>.</p>
<p>So for us, the important question is, what has changed in Scala?
There’s been a <a href="https://github.com/lampepfl/dotty/issues/2047">“coherence”
proposal</a>, but its
purpose is not to force you to define only coherent instances, nor
even to detect when you have not; instead, it’s to let you assert to
the compiler that you’ve preserved coherence, whether you have or not;
if you’re wrong, scalac simply makes wrong decisions, silently.</p>
<p>This would be very useful for performance, and I will embrace it for
all typeclasses if implemented. It will make many implicit priority hacks unnecessary.
But it wouldn’t address Ed’s concern at all.</p>
<blockquote>
<p>[6:] The container types they do have have weird ad hoc
overloadings. e.g. Map is treated as an iterable container of pairs,
but this means you can't write code that is parametric in the
Traversable container type that can do anything sensible. It is one
of those solutions that seems like it might be a nice idea unless
you've had experience programming with more principled classes like
<code>Foldable</code>/<code>Traversable</code>.</p>
</blockquote>
<p>The design of the current collections library is the one Kmett was
talking about, so nothing has changed in released code. As for the
future collections library, known as “collections-strawman”? <a href="https://youtu.be/ofbaM7Yz3IM?t=22m52s">The
situation is the same.</a></p>
<blockquote>
<p>[7:] You wind up with code that looks like myMap.map(...).toMap all
over the place due to <code>CanBuildFrom</code> inference woes.</p>
</blockquote>
<p>I’m not sure what Kmett is referring to here, because I’ve been
relying on the correct behavior for a long time, that is, without the
trailing <code>.toMap</code>. The only thing I can think of would be the
function being passed to <code>map</code> returning something <em>implicitly
convertible</em> to two-tuple instead of a proper two-tuple, which would
require an extra step to force that conversion to be applied.</p>
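<p>For reference, this is the behavior I mean, as a small sketch with
made-up data. The type ascription on <code>doubled</code> checks that the
static result type is already a <code>Map</code>.</p>

```scala
object MapMapExample {
  val prices: Map[String, Int] = Map("apple" -> 3, "pear" -> 5)

  // The lambda returns a proper two-tuple, so the result is a Map
  // with no trailing .toMap.
  val doubled: Map[String, Int] =
    prices.map { case (name, price) => (name, price * 2) }
}
```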
<h2 id="markdown-header-monads-and-higher-kinds">Monads and higher kinds</h2>
<blockquote>
<p>[8:] Monads have to pay for an extra map at the end of any
comprehension, because of the way the <code>for { }</code> sugar works.</p>
</blockquote>
<p>This hasn’t changed at all, but is worth some elaboration. This
behavior makes it so you can’t write “tail-recursive” monadic
functions in the obvious way. As <a href="https://plus.google.com/+DanDoel/posts/B1pSqNojj3k">Dan Doel
demonstrated</a>,
this can turn a purely right-associated bind chain, i.e. one that can
be interpreted tail-recursively, into a repeatedly broken chain with
arbitrary left-binds injected into it, thus either crashing the stack
or requiring useless extra frames to be repeatedly shoved onto the
heap.</p>
<p>This is kind of silly, and could be ameliorated if <code>for</code> wasn’t trying
to be non-monadic. But that’s not going to change.</p>
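<p>To show where the extra <code>map</code> comes from, here is a sketch
using <code>Option</code>, with <code>half</code> as an invented helper.
The <code>desugared</code> form is roughly what Scala 2 produces for the
comprehension; <code>direct</code> is what one would write by hand.</p>

```scala
object ForDesugaring {
  // Hypothetical helper, only here to have something to bind on.
  def half(n: Int): Option[Int] = if (n % 2 == 0) Some(n / 2) else None

  // What one writes.
  def sugared(start: Option[Int]): Option[Int] =
    for {
      x <- start
      r <- half(x)
    } yield r

  // Roughly the translation: the final generator becomes a map,
  // even though the function mapped is just the identity.
  def desugared(start: Option[Int]): Option[Int] =
    start.flatMap(x => half(x).map(r => r))

  // Hand-written equivalent, where half(x) is in tail position.
  def direct(start: Option[Int]): Option[Int] =
    start.flatMap(x => half(x))
}
```

<p>For <code>Option</code> the difference is only a wasted call, but for a
monad whose binds are interpreted as a loop, that trailing
<code>map</code> is what breaks the right-associated chain.</p>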
<blockquote>
<p>[9:] You have type lambdas. Yay, right? But now you can't just talk
about <code>Functor (StateT s IO)</code>. Its <code>Functor[({type F[X] =
StateT[S,IO,X]})#F]</code>, and you have to hand plumb it to something
like <code>return</code>, because it basically can't infer any of that, once
you start dealing with transformers ever. The instance isn't
directly in scope. <code>12.pure[({type F[X] = StateT[S,IO,X]})#F]</code> isn't
terribly concise. It can't figure out it should use the inference
rule to define the implicit for <code>StateT[S,M,_]</code> from the one for
<code>M[_]</code> because of the increased flexibility that nobody uses.</p>
</blockquote>
<p>This is probably the best story of the bunch, and possibly the most
well-known of the whole series. This is good for Scala marketing, but
probably not best for the future of Scala FP…</p>
<p>We first got the
<a href="https://github.com/non/kind-projector#overview">kind-projector</a> to
help us write these type lambdas more succinctly. So Kmett’s first
example above can now be written <code>Functor[StateT[S, IO, ?]]</code>. Not as
nice as the curried Haskell form, but much better.</p>
<p>Eventually, though, <a href="https://github.com/scala/scala/pull/5102">Miles Sabin
implemented</a> the
“higher-order unification” feature, often called the “SI-2712 fix”
after <a href="https://github.com/scala/bug/issues/2712">the infamous bug</a>.
This feature performs the inference Kmett describes above, and gets
away with it precisely because it ignores “increased flexibility that
nobody uses”.</p>
<p>The situation is not perfect: you have to flip a nonstandard compiler switch
(<code>-Ypartial-unification</code>), the resulting language isn’t source-compatible with standard Scala,
and warts like <a href="https://github.com/scala/bug/issues/5075">bug 5075</a>
(despite first appearances, this is quite distinct from 2712)
remain. Still, Scala is in great shape with respect to this problem
compared to where we were at the time of Kmett’s original writing.</p>
<blockquote>
<p>[10:] In this mindset and in the same vein as the <code>CanBuildFrom</code>
issue, things like <code>Either</code> don't have the biased <code>flatMap</code> you'd
expect, somehow encouraging you to use other tools, just in case you
wanted to bind on the <code>Left</code>. So you don't write generic monadic
code over the <code>Either</code> monad, but rather are constantly chaining
<code>foo.right.flatMap(... .right.flatMap(....))</code> ensuring you can't use
the sugar without turning to something like <code>scalaz</code> to fill it
in. Basically almost the entire original motivation for all the type
lambda craziness came down to being able to write classes like
Functor have have several instances for different arguments, but
because they are so hard to use nobody does it, making the feature
hardly pay its way, as it makes things like unification, and path
dependent type checking harder and sometimes impossible, but the
language specification requires them to do it!</p>
</blockquote>
<p>I’m not sure the situation was ever as severe as Kmett states, but
that might be down to my personal experience in Scala, with Scalaz as
my permanent companion.</p>
<p>The interspersed <code>.right</code>s never prevented you from using the <code>for</code>
syntax, though they <em>did</em> make it significantly more
obscure. Supposing <code>foo</code> and <code>bar</code> are <code>Either</code>s:</p>
<div class="codehilite language-scala"><pre><span></span><span class="k">for</span> <span class="o">{</span>
<span class="n">x</span> <span class="k"><-</span> <span class="n">foo</span><span class="o">.</span><span class="n">right</span>
<span class="n">y</span> <span class="k"><-</span> <span class="n">bar</span><span class="o">.</span><span class="n">right</span>
<span class="o">...</span>
</pre></div>
<p>That trailing <code>.right</code> looks like it’s missing a dance partner, but
it’s in just the right place for that biased <code>flatMap</code> or <code>map</code> method
to kick in.</p>
<p>But in Scalaz, we never had to worry about it, because we only
supplied the right-biased <code>Monad</code> for <code>Either</code>. When you also bring in
Scalaz’s <code>Monad</code> syntax, suddenly <code>Either</code> acquires the standard
right-biased <code>map</code> and <code>flatMap</code>.</p>
<div class="codehilite language-scala"><pre><span></span><span class="k">import</span> <span class="nn">scalaz.syntax.bind._</span><span class="o">,</span> <span class="n">scalaz</span><span class="o">.</span><span class="n">std</span><span class="o">.</span><span class="n">either</span><span class="o">.</span><span class="k">_</span>
<span class="k">for</span> <span class="o">{</span>
<span class="n">x</span> <span class="k"><-</span> <span class="n">foo</span>
<span class="n">y</span> <span class="k"><-</span> <span class="n">bar</span>
<span class="o">...</span>
</pre></div>
<p>No more lonely dancers.</p>
<p>But now <a href="https://github.com/scala/scala/pull/5135">right-biasing has returned to the standard
library</a> as of Scala 2.12, so even these
extra imports are no longer necessary.</p>
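<p>A minimal sketch of what that looks like, assuming Scala 2.12 or later
and no Scalaz imports at all. The <code>parse</code> and <code>sum</code> functions are made up
for this illustration.</p>
<div class="codehilite language-scala"><pre><span></span>object RightBiasDemo {
  // Hypothetical helper: Right on success, Left with a message on failure.
  def parse(s: String): Either[String, Int] =
    try Right(s.toInt)
    catch { case _: NumberFormatException => Left("not a number: " + s) }

  // No .right projections: map and flatMap act on the Right side directly.
  def sum(a: String, b: String): Either[String, Int] =
    for {
      x <- parse(a)
      y <- parse(b)
    } yield x + y

  // sum("1", "2")   == Right(3)
  // sum("1", "two") == Left("not a number: two")
}
</pre></div>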
<p>Kmett pairs this point with a tangentially related point about
functors over other type parameters. But I think higher-order
unification is going to solve this problem, albeit in a very <em>ad hoc</em>
way, in the long run. Programmers who want to use higher-kinded types
will increasingly want to turn on the feature, or even be forced to by
library designs that depend on it. Types that conform to
right-bias—placing the functor parameter last, not first—will find
happy users with nice inference.</p>
<div class="codehilite language-scala"><pre><span></span><span class="k">class</span> <span class="nc">FA</span><span class="o">[</span><span class="kt">F</span><span class="o">[</span><span class="k">_</span><span class="o">]</span>, <span class="kt">A</span><span class="o">]</span>
<span class="nc">def</span> <span class="n">fa</span><span class="o">[</span><span class="kt">F</span><span class="o">[</span><span class="k">_</span><span class="o">]</span>, <span class="kt">A</span><span class="o">](</span><span class="n">fa</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">A</span><span class="o">])</span><span class="k">:</span> <span class="kt">FA</span><span class="o">[</span><span class="kt">F</span>, <span class="kt">A</span><span class="o">]</span> <span class="k">=</span>
<span class="k">new</span> <span class="nc">FA</span>
<span class="n">scala</span><span class="o">></span> <span class="n">fa</span><span class="o">(</span><span class="nc">Left</span><span class="o">(</span><span class="mi">33</span><span class="o">)</span><span class="k">:</span> <span class="kt">Either</span><span class="o">[</span><span class="kt">Int</span>, <span class="kt">String</span><span class="o">])</span>
<span class="n">res0</span><span class="k">:</span> <span class="kt">FA</span><span class="o">[[</span><span class="kt">+B</span><span class="o">]</span><span class="kt">Either</span><span class="o">[</span><span class="kt">Int</span>,<span class="kt">B</span><span class="o">]</span>,<span class="kt">String</span><span class="o">]</span> <span class="k">=</span> <span class="nc">FA</span><span class="k">@</span><span class="mi">542</span><span class="n">c2bc8</span>
</pre></div>
<p>This works even in more elaborate situations, such as with monad
transformers:</p>
<div class="codehilite language-scala"><pre><span></span><span class="k">trait</span> <span class="nc">EitherT</span><span class="o">[</span><span class="kt">E</span>, <span class="kt">M</span><span class="o">[</span><span class="k">_</span><span class="o">]</span>, <span class="kt">A</span><span class="o">]</span>
<span class="nc">trait</span> <span class="nc">ReaderT</span><span class="o">[</span><span class="kt">R</span>, <span class="kt">F</span><span class="o">[</span><span class="k">_</span><span class="o">]</span>, <span class="kt">A</span><span class="o">]</span>
<span class="k">trait</span> <span class="nc">IO</span><span class="o">[</span><span class="kt">A</span><span class="o">]</span>
<span class="nc">class</span> <span class="nc">Discovery</span><span class="o">[</span><span class="kt">T1</span><span class="o">[</span><span class="k">_</span><span class="o">[</span><span class="k">_</span><span class="o">]</span>, <span class="k">_</span><span class="o">]</span>, <span class="kt">T2</span><span class="o">[</span><span class="k">_</span><span class="o">[</span><span class="k">_</span><span class="o">]</span>, <span class="k">_</span><span class="o">]</span>, <span class="kt">M</span><span class="o">[</span><span class="k">_</span><span class="o">]</span>, <span class="kt">A</span><span class="o">]</span>
<span class="k">def</span> <span class="n">discover</span><span class="o">[</span><span class="kt">T1</span><span class="o">[</span><span class="k">_</span><span class="o">[</span><span class="k">_</span><span class="o">]</span>, <span class="k">_</span><span class="o">]</span>, <span class="kt">T2</span><span class="o">[</span><span class="k">_</span><span class="o">[</span><span class="k">_</span><span class="o">]</span>, <span class="k">_</span><span class="o">]</span>, <span class="kt">M</span><span class="o">[</span><span class="k">_</span><span class="o">]</span>, <span class="kt">A</span><span class="o">](</span><span class="n">a</span><span class="k">:</span> <span class="kt">Option</span><span class="o">[</span><span class="kt">T1</span><span class="o">[</span><span class="kt">T2</span><span class="o">[</span><span class="kt">M</span>, <span class="kt">?</span><span class="o">]</span>, <span class="kt">A</span><span class="o">]])</span>
<span class="k">:</span> <span class="kt">Discovery</span><span class="o">[</span><span class="kt">T1</span>, <span class="kt">T2</span>, <span class="kt">M</span>, <span class="kt">A</span><span class="o">]</span> <span class="k">=</span> <span class="k">new</span> <span class="nc">Discovery</span>
<span class="n">scala</span><span class="o">></span> <span class="n">discover</span><span class="o">(</span><span class="nc">None</span><span class="k">:</span> <span class="kt">Option</span><span class="o">[</span><span class="kt">EitherT</span><span class="o">[</span><span class="kt">String</span>, <span class="kt">ReaderT</span><span class="o">[</span><span class="kt">Int</span>, <span class="kt">IO</span>, <span class="kt">?</span><span class="o">]</span>, <span class="kt">ClassLoader</span><span class="o">]])</span>
<span class="n">res0</span><span class="k">:</span> <span class="kt">Discovery</span><span class="o">[[</span><span class="kt">M</span><span class="o">[</span><span class="k">_</span><span class="o">]</span>, <span class="kt">A</span><span class="o">]</span><span class="kt">EitherT</span><span class="o">[</span><span class="kt">String</span>,<span class="kt">M</span>,<span class="kt">A</span><span class="o">]</span>,
<span class="o">[</span><span class="kt">F</span><span class="o">[</span><span class="k">_</span><span class="o">]</span>, <span class="kt">A</span><span class="o">]</span><span class="kt">ReaderT</span><span class="o">[</span><span class="kt">Int</span>,<span class="kt">F</span>,<span class="kt">A</span><span class="o">]</span>,
<span class="kt">IO</span>,
<span class="kt">ClassLoader</span><span class="o">]</span> <span class="k">=</span> <span class="nc">Discovery</span><span class="k">@</span><span class="mi">4</span><span class="n">f20ea29</span>
</pre></div>
<p>Contrarian types that don’t conform will find themselves rejected for
constantly introducing mysterious type mismatches that must be
corrected with more explicit type lambdas. That is the direction in which
the libraries should develop.</p>
<blockquote>
<p>[11:] You don't have any notion of a kind system and can only talk
about fully saturated types, monad transformers are hell to
write. It is easier for me to use the fact that every <code>Comonad</code>
gives rise to a monad transformer to intuitively describe how to
manually plumb a semimonoidal <code>Comonad</code> through my parser to carry
extra state than to work with a monad transformer!</p>
</blockquote>
<p>This isn’t so much about inference of higher-kinded type parameters,
which I’ve dealt with above, but how convenient it is to write them
down.</p>
<p>As mentioned above, the kind-projector compiler plugin has made
writing these types significantly easier. Yet it remains ugly
compared to the curried version, for sure.</p>
<blockquote>
<p>[12:] I've been able to get the compiler to build classes that it
thinks are fully instantiated, but which still have abstract methods
in them.</p>
</blockquote>
<p>I haven’t seen this kind of thing in quite a while, but it wouldn’t
surprise me if a few such bugs were still outstanding. Let’s give the
compiler the benefit of the doubt and suppose that things have gotten
significantly better in this area.</p>
<blockquote>
<p>[13:] Tail-call optimization is only performed for self-tail calls,
where you do not do polymorphic recursion.</p>
</blockquote>
<p>There are two issues packed in here. The first still holds: only
self-tail calls are supported. Plenty of ink has been expended
elsewhere; I point to <a href="https://plus.google.com/+DanDoel/posts/RmoC2dMikxQ">Dan Doel
again</a> for some of
that.</p>
<p>The second issue has a fix <a href="https://github.com/scala/scala/pull/6065">in Scala
2.12.4</a>!</p>
<div class="codehilite language-scala"><pre><span></span><span class="nd">@annotation</span><span class="o">.</span><span class="n">tailrec</span> <span class="k">def</span> <span class="n">lp</span><span class="o">[</span><span class="kt">A</span><span class="o">](</span><span class="n">n</span><span class="k">:</span> <span class="kt">Int</span><span class="o">)</span><span class="k">:</span> <span class="kt">Int</span> <span class="o">=</span>
<span class="k">if</span> <span class="o">(</span><span class="n">n</span> <span class="o"><=</span> <span class="mi">0</span><span class="o">)</span> <span class="n">n</span> <span class="k">else</span> <span class="n">lp</span><span class="o">[</span><span class="kt">Option</span><span class="o">[</span><span class="kt">A</span><span class="o">]](</span><span class="n">n</span> <span class="o">-</span> <span class="mi">1</span><span class="o">)</span>
<span class="c1">// [in 2.12.3] error:⇑ could not optimize @tailrec annotated method lp:</span>
<span class="c1">// it is called recursively with different type arguments</span>
<span class="n">scala</span><span class="o">></span> <span class="n">lp</span><span class="o">[</span><span class="kt">Unit</span><span class="o">](</span><span class="mi">1000000</span><span class="o">)</span>
<span class="n">res0</span><span class="k">:</span> <span class="kt">Int</span> <span class="o">=</span> <span class="mi">0</span>
</pre></div>
<p>To pour a little cold water on this: it isn’t a 50% fix. It is a nice
improvement, dealing with a particular annoyance in interpreting GADT
action graphs, but the much larger issue is the still-missing general
TCO.</p>
<blockquote>
<p>[14:] Monads are toys due to the aforementioned restriction. <code>(>>=)</code>
is called <code>flatMap</code>. Any chain of monadic binds is going to be a
series of non-self tailcalls. A function calls flatMap which calls a
function, which calls flatMap... This means that non-trivial
operations in even the identity monad, like using a Haskell style
<code>traverse</code> for a monad over an arbitrary container blows the stack
after a few thousand entries.</p>
</blockquote>
<p>This one still holds too, for the same reason: without general TCO,
a chain of non-self tail calls consumes stack. Kmett goes on to discuss
the “solutions” to this.</p>
<blockquote>
<p>[15:] We can fix this, and have in <code>scalaz</code> by adapting <code>apfelmus</code>'
operational monad to get a trampoline that moves us off the stack to
the heap, hiding the problem, but at a 50x slowdown, as the JIT no
longer knows how to help.</p>
</blockquote>
<p>Nothing has changed here. We’ve tweaked the trampoline representation
<a href="https://github.com/scalaz/scalaz/pull/938">repeatedly</a> to get better
averages, but the costs still hold.</p>
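<p>To make “a trampoline that moves us off the stack to the heap”
concrete, here is a minimal sketch. It uses the standard library’s
<code>scala.util.control.TailCalls</code> rather than Scalaz’s own trampoline, and
the <code>isEven</code>/<code>isOdd</code> pair is invented purely for illustration; the idea,
not the representation, is what carries over.</p>
<div class="codehilite language-scala"><pre><span></span>import scala.util.control.TailCalls._

object TrampolineDemo {
  // Each step returns a description of the next call instead of making it,
  // so the non-self tail calls never pile up on the JVM stack.
  def isEven(n: Int): TailRec[Boolean] =
    if (n == 0) done(true) else tailcall(isOdd(n - 1))

  def isOdd(n: Int): TailRec[Boolean] =
    if (n == 0) done(false) else tailcall(isEven(n - 1))

  // .result runs the loop that interprets those descriptions.
  // Written as direct mutual recursion, this depth would risk a
  // StackOverflowError.
  val r: Boolean = isEven(1000000).result
  // r == true
}
</pre></div>
<p>Every step allocates a small object on the heap and is dispatched
through an interpreter loop, which is where the overhead Kmett
describes comes from.</p>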
<blockquote>
<p>[16:] We can also fix it by passing imperative state around, and
maybe getting scala to pass the state for me using implicits and
hoping I don't accidentally use a <code>lazy val</code>. Guess which one is the
only viable solution I know at scale? The code winds up less than
1/2 the size and 3x faster than the identity monad version. If scala
was the only language I had to think in, I'd think functional
programming was a bad idea that didn't scale, too.</p>
</blockquote>
<p>This is still something you have to do sometimes. Just as above,
nothing has really changed here. You just have to hope you don’t run
into it too often.</p>
<h2 id="markdown-header-random-restrictions">Random restrictions</h2>
<blockquote>
<p>[17:] <code>for</code> <code>yield</code> sugar is a very simple expansion, but that means
it has all sorts of rules about what you can't define locally inside
of it, e.g. you can't stop and <code>def</code> a function, <code>lazy val</code>,
etc. without nesting another <code>for</code> <code>yield</code> block.</p>
</blockquote>
<p>One thing has changed in this area! You no longer have to use the
<code>val</code> keyword when defining a val locally in the <code>for</code> block.</p>
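<p>A small sketch of the current form, with made-up values:</p>
<div class="codehilite language-scala"><pre><span></span>object ForValDemo {
  val doubled: List[Int] =
    for {
      x <- List(1, 2, 3)
      y = x * 2        // a local value definition; no leading `val` needed
      if y > 2
    } yield y
  // doubled == List(4, 6)
}
</pre></div>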
<p>Otherwise, situation constant.</p>
<blockquote>
<p>[18:] You wind up with issues like
<a href="https://github.com/scala/bug/issues/3295">SI-3295</a> where out of a
desire to not "confuse the computation model", it was decided that
it was better to you know, just crash when someone folded a
reasonably large list than fix the issue.. until it finally affected
<code>scalac</code> itself. I've been told this has been relatively recently
fixed.</p>
</blockquote>
<p>As Kmett mentions, this was fixed. It remains fixed.</p>
<blockquote>
<p>[19:] No first-class universal quantification means that quantifier
tricks like <code>ST s</code>, or automatic differentiation without
infinitesimal confusion are basically impossible.</p>
<pre>def test = diff(new FF[Id,Id,Double] {
def apply[S[_]](x: AD[S, Double])(implicit mode: Mode[S, Double]): AD[S, Double]
= cos(x)
})</pre>
<p>is a poor substitute for</p>
<p><code>test = diff cos</code></p>
</blockquote>
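<p>For readers who haven’t met the encoding Kmett is complaining about,
here is a minimal self-contained sketch of it. All names are
hypothetical and chosen for illustration; the point is only that a
polymorphic function value has to be smuggled in as an object with a
polymorphic method.</p>
<div class="codehilite language-scala"><pre><span></span>object Rank2Demo {
  // A function that must work for every A, where the *callee* picks A.
  trait ListToOption {
    def apply[A](fa: List[A]): Option[A]
  }

  // The caller supplies one polymorphic value, as an anonymous class...
  val headOpt: ListToOption = new ListToOption {
    def apply[A](fa: List[A]): Option[A] = fa.headOption
  }

  // ...and the callee instantiates it at two different types.
  def both(f: ListToOption): (Option[Int], Option[String]) =
    (f(List(1, 2)), f(List("a")))

  // both(headOpt) == (Some(1), Some("a"))
}
</pre></div>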
<p>For a while now, kind-projector <a href="https://github.com/non/kind-projector#polymorphic-lambda-values">has
provided</a>
less well-known support for some varieties of polymorphic lambdas,
such as <code>FF</code> in this example. The <code>implicit</code> constraint
and the fact that we’re trying to be polymorphic over a higher-kinded type
might make things tricky, but let’s see if we can get it working.</p>
<div class="codehilite language-scala"><pre><span></span><span class="nc">Lambda</span><span class="o">[</span><span class="kt">FF</span><span class="o">[</span><span class="kt">Id</span>, <span class="kt">Id</span>, <span class="kt">Double</span><span class="o">]](</span><span class="n">x</span> <span class="k">=></span> <span class="n">cos</span><span class="o">(</span><span class="n">x</span><span class="o">))</span>
<span class="nc">Lambda</span><span class="o">[</span><span class="kt">FF</span><span class="o">[</span><span class="kt">Id</span>, <span class="kt">Id</span>, <span class="kt">Double</span><span class="o">]](</span><span class="n">x</span> <span class="k">=></span> <span class="k">implicit</span> <span class="n">mode</span> <span class="k">=></span> <span class="n">cos</span><span class="o">(</span><span class="n">x</span><span class="o">))</span>
<span class="c1">// both forms fail with the uninteresting error:</span>
<span class="c1">// not found: value Lambda</span>
</pre></div>
<p>Scalaz 8 contains <a href="https://github.com/scalaz/scalaz/pull/1417">a very clever unboxed encoding of universal
quantification</a> based on
the observation that if side effects and singleton type patterns are
forbidden, as they are under Scalazzi rules, multiple type
applications in Scala are indistinguishable at runtime. (To see why
this is, consider the difference between <code>List.empty[A]</code> and
<code>mutable.Buffer.empty[A]</code>.) The one that comes with Scalaz 8 only
quantifies over a <code>*</code>-kinded type parameter, but we should be able to
use the same technique to quantify over <code>S: * -> *</code>.</p>
<div class="codehilite language-scala"><pre><span></span><span class="k">trait</span> <span class="nc">ForallK1Module</span> <span class="o">{</span>
<span class="k">type</span> <span class="kt">ForallK1</span><span class="o">[</span><span class="kt">F</span><span class="o">[</span><span class="k">_</span><span class="o">[</span><span class="k">_</span><span class="o">]]]</span>
<span class="k">type</span> <span class="kt">∀</span><span class="o">[</span><span class="kt">F</span><span class="o">[</span><span class="k">_</span><span class="o">[</span><span class="k">_</span><span class="o">]]]</span> <span class="k">=</span> <span class="nc">ForallK1</span><span class="o">[</span><span class="kt">F</span><span class="o">]</span>
<span class="k">def</span> <span class="n">specialize</span><span class="o">[</span><span class="kt">F</span><span class="o">[</span><span class="k">_</span><span class="o">[</span><span class="k">_</span><span class="o">]]</span>, <span class="kt">A</span><span class="o">[</span><span class="k">_</span><span class="o">]](</span><span class="n">f</span><span class="k">:</span> <span class="kt">∀</span><span class="o">[</span><span class="kt">F</span><span class="o">])</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">A</span><span class="o">]</span>
<span class="k">def</span> <span class="n">of</span><span class="o">[</span><span class="kt">F</span><span class="o">[</span><span class="k">_</span><span class="o">[</span><span class="k">_</span><span class="o">]]]</span><span class="k">:</span> <span class="kt">MkForallK1</span><span class="o">[</span><span class="kt">F</span><span class="o">]</span>
<span class="k">sealed</span> <span class="k">trait</span> <span class="nc">MkForallK1</span><span class="o">[</span><span class="kt">F</span><span class="o">[</span><span class="k">_</span><span class="o">[</span><span class="k">_</span><span class="o">]]]</span> <span class="nc">extends</span> <span class="nc">Any</span> <span class="o">{</span>
<span class="k">type</span> <span class="kt">T</span><span class="o">[</span><span class="k">_</span><span class="o">]</span>
<span class="k">def</span> <span class="n">apply</span><span class="o">(</span><span class="n">ft</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">T</span><span class="o">])</span><span class="k">:</span> <span class="kt">∀</span><span class="o">[</span><span class="kt">F</span><span class="o">]</span>
<span class="o">}</span>
<span class="o">}</span>
<span class="k">object</span> <span class="nc">ForallK1Module</span> <span class="o">{</span>
<span class="k">val</span> <span class="nc">ForallK1</span><span class="k">:</span> <span class="kt">ForallK1Module</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">ForallK1Module</span> <span class="o">{</span>
<span class="k">type</span> <span class="kt">ForallK1</span><span class="o">[</span><span class="kt">F</span><span class="o">[</span><span class="k">_</span><span class="o">[</span><span class="k">_</span><span class="o">]]]</span> <span class="k">=</span> <span class="n">F</span><span class="o">[</span><span class="kt">λ</span><span class="o">[</span><span class="kt">α</span> <span class="k">=></span> <span class="kt">Any</span><span class="o">]]</span>
<span class="k">def</span> <span class="n">specialize</span><span class="o">[</span><span class="kt">F</span><span class="o">[</span><span class="k">_</span><span class="o">[</span><span class="k">_</span><span class="o">]]</span>, <span class="kt">A</span><span class="o">[</span><span class="k">_</span><span class="o">]](</span><span class="n">f</span><span class="k">:</span> <span class="kt">∀</span><span class="o">[</span><span class="kt">F</span><span class="o">])</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">A</span><span class="o">]</span> <span class="k">=</span> <span class="n">f</span><span class="o">.</span><span class="n">asInstanceOf</span><span class="o">[</span><span class="kt">F</span><span class="o">[</span><span class="kt">A</span><span class="o">]]</span>
<span class="k">def</span> <span class="n">of</span><span class="o">[</span><span class="kt">F</span><span class="o">[</span><span class="k">_</span><span class="o">[</span><span class="k">_</span><span class="o">]]]</span><span class="k">:</span> <span class="kt">MkForallK1</span><span class="o">[</span><span class="kt">F</span><span class="o">]</span> <span class="k">=</span> <span class="k">new</span> <span class="nc">MkForallK1</span><span class="o">[</span><span class="kt">F</span><span class="o">]</span> <span class="o">{</span>
<span class="k">type</span> <span class="kt">T</span><span class="o">[</span><span class="k">_</span><span class="o">]</span> <span class="k">=</span> <span class="nc">Any</span>
<span class="k">def</span> <span class="n">apply</span><span class="o">(</span><span class="n">ft</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">T</span><span class="o">])</span><span class="k">:</span> <span class="kt">∀</span><span class="o">[</span><span class="kt">F</span><span class="o">]</span> <span class="k">=</span> <span class="n">ft</span>
<span class="o">}</span>
<span class="o">}</span>
<span class="o">}</span>
<span class="c1">// we're using an unboxed representation</span>
<span class="k">type</span> <span class="kt">FF</span><span class="o">[</span><span class="kt">F</span><span class="o">[</span><span class="k">_</span><span class="o">]</span>, <span class="kt">G</span><span class="o">[</span><span class="k">_</span><span class="o">]</span>, <span class="kt">T</span>, <span class="kt">S</span><span class="o">[</span><span class="k">_</span><span class="o">]]</span> <span class="k">=</span> <span class="nc">AD</span><span class="o">[</span><span class="kt">S</span>, <span class="kt">T</span><span class="o">]</span> <span class="k">=></span> <span class="nc">Mode</span><span class="o">[</span><span class="kt">S</span>, <span class="kt">T</span><span class="o">]</span> <span class="k">=></span> <span class="nc">AD</span><span class="o">[</span><span class="kt">S</span>, <span class="kt">T</span><span class="o">]</span>
<span class="n">scala</span><span class="o">></span> <span class="nc">ForallK1</span><span class="o">.</span><span class="n">of</span><span class="o">[</span><span class="kt">Lambda</span><span class="o">[</span><span class="kt">S</span><span class="o">[</span><span class="k">_</span><span class="o">]</span> <span class="k">=></span> <span class="kt">FF</span><span class="o">[</span><span class="kt">Id</span>, <span class="kt">Id</span>, <span class="kt">Double</span>, <span class="kt">S</span><span class="o">]]](</span>
<span class="n">x</span> <span class="k">=></span> <span class="k">implicit</span> <span class="n">m</span> <span class="k">=></span> <span class="n">cos</span><span class="o">(</span><span class="n">x</span><span class="o">))</span>
<span class="n">res3</span><span class="k">:</span> <span class="kt">ForallK1Module.ForallK1.ForallK1</span><span class="o">[</span>
<span class="o">[</span><span class="kt">S</span><span class="o">[</span><span class="k">_</span><span class="kt">$1</span><span class="o">]]</span><span class="kt">AD</span><span class="o">[</span><span class="kt">S</span>,<span class="kt">Double</span><span class="o">]</span> <span class="k">=></span> <span class="o">(</span><span class="kt">Mode</span><span class="o">[</span><span class="kt">S</span>,<span class="kt">Double</span><span class="o">]</span> <span class="k">=></span> <span class="kt">AD</span><span class="o">[</span><span class="kt">S</span>,<span class="kt">Double</span><span class="o">])</span>
<span class="o">]</span> <span class="k">=</span> <span class="nc">$$Lambda$2018</span><span class="o">/</span><span class="mi">266706504</span><span class="k">@</span><span class="mi">91</span><span class="n">f8cde</span>
</pre></div>
<p>Upshot? Nothing has changed in core Scala. People in the Scala
community have discovered some clever tricks, and these work even better
with more traditional <code>*</code>-kinded rank-2 idioms like <code>ST</code> than they do
on the slightly complicated test case Kmett supplied.</p>
<div class="codehilite language-scala"><pre><span></span><span class="n">scala</span><span class="o">></span> <span class="nc">Lambda</span><span class="o">[</span><span class="kt">List</span> <span class="kt">~></span> <span class="kt">Option</span><span class="o">](</span><span class="k">_</span><span class="o">.</span><span class="n">headOption</span><span class="o">)</span>
<span class="n">res2</span><span class="k">:</span> <span class="kt">List</span> <span class="kt">~></span> <span class="kt">Option</span> <span class="o">=</span> <span class="nc">$anon$1</span><span class="k">@</span><span class="mi">73</span><span class="n">c4d4b5</span>
<span class="k">trait</span> <span class="nc">ST</span><span class="o">[</span><span class="kt">S</span>, <span class="kt">A</span><span class="o">]</span> <span class="o">{</span>
<span class="k">def</span> <span class="n">flatMap</span><span class="o">[</span><span class="kt">B</span><span class="o">](</span><span class="n">f</span><span class="k">:</span> <span class="kt">A</span> <span class="o">=></span> <span class="nc">ST</span><span class="o">[</span><span class="kt">S</span>, <span class="kt">B</span><span class="o">])</span><span class="k">:</span> <span class="kt">ST</span><span class="o">[</span><span class="kt">S</span>, <span class="kt">B</span><span class="o">]</span>
<span class="o">}</span>
<span class="k">trait</span> <span class="nc">STVar</span><span class="o">[</span><span class="kt">S</span>, <span class="kt">A</span><span class="o">]</span> <span class="o">{</span>
<span class="k">def</span> <span class="n">read</span><span class="k">:</span> <span class="kt">ST</span><span class="o">[</span><span class="kt">S</span>, <span class="kt">A</span><span class="o">]</span>
<span class="o">}</span>
<span class="k">def</span> <span class="n">newVar</span><span class="o">[</span><span class="kt">S</span>, <span class="kt">A</span><span class="o">](</span><span class="n">a</span><span class="k">:</span> <span class="kt">A</span><span class="o">)</span><span class="k">:</span> <span class="kt">ST</span><span class="o">[</span><span class="kt">S</span>, <span class="kt">STVar</span><span class="o">[</span><span class="kt">S</span>, <span class="kt">A</span><span class="o">]]</span> <span class="k">=</span> <span class="o">???</span>
<span class="k">def</span> <span class="n">mkAndRead</span><span class="o">[</span><span class="kt">S</span><span class="o">]</span><span class="k">:</span> <span class="kt">ST</span><span class="o">[</span><span class="kt">S</span>, <span class="kt">Int</span><span class="o">]</span> <span class="k">=</span> <span class="n">newVar</span><span class="o">[</span><span class="kt">S</span>, <span class="kt">Int</span><span class="o">](</span><span class="mi">33</span><span class="o">)</span> <span class="n">flatMap</span> <span class="o">(</span><span class="k">_</span><span class="o">.</span><span class="n">read</span><span class="o">)</span>
<span class="k">def</span> <span class="n">runST</span><span class="o">[</span><span class="kt">A</span><span class="o">](</span><span class="n">st</span><span class="k">:</span> <span class="kt">Forall</span><span class="o">[</span><span class="kt">ST</span><span class="o">[</span><span class="kt">?</span>, <span class="kt">A</span><span class="o">]])</span><span class="k">:</span> <span class="kt">A</span> <span class="o">=</span> <span class="o">???</span>
<span class="n">scala</span><span class="o">></span> <span class="k">:</span><span class="kt">t</span> <span class="kt">Forall.of</span><span class="o">[</span><span class="kt">ST</span><span class="o">[</span><span class="kt">?</span>, <span class="kt">Int</span><span class="o">]](</span><span class="n">mkAndRead</span><span class="o">)</span>
<span class="n">scalaz</span><span class="o">.</span><span class="n">data</span><span class="o">.</span><span class="nc">Forall</span><span class="o">.</span><span class="nc">Forall</span><span class="o">[[</span><span class="kt">α$0$</span><span class="o">]</span><span class="kt">ST</span><span class="o">[</span><span class="kt">α$0$</span>,<span class="kt">Int</span><span class="o">]]</span>
<span class="n">scala</span><span class="o">></span> <span class="k">:</span><span class="kt">t</span> <span class="kt">Forall.of</span><span class="o">[</span><span class="kt">Lambda</span><span class="o">[</span><span class="kt">s</span> <span class="k">=></span> <span class="kt">ST</span><span class="o">[</span><span class="kt">s</span>, <span class="kt">STVar</span><span class="o">[</span><span class="kt">s</span>, <span class="kt">Int</span><span class="o">]]]](</span><span class="n">newVar</span><span class="o">(</span><span class="mi">33</span><span class="o">))</span>
<span class="n">scalaz</span><span class="o">.</span><span class="n">data</span><span class="o">.</span><span class="nc">Forall</span><span class="o">.</span><span class="nc">Forall</span><span class="o">[[</span><span class="kt">s</span><span class="o">]</span><span class="kt">ST</span><span class="o">[</span><span class="kt">s</span>,<span class="kt">STVar</span><span class="o">[</span><span class="kt">s</span>,<span class="kt">Int</span><span class="o">]]]</span>
<span class="n">scala</span><span class="o">></span> <span class="k">:</span><span class="kt">t</span> <span class="kt">runST</span><span class="o">(</span><span class="kt">Forall.of</span><span class="o">[</span><span class="kt">ST</span><span class="o">[</span><span class="kt">?</span>, <span class="kt">Int</span><span class="o">]](</span><span class="kt">mkAndRead</span><span class="o">))</span>
<span class="nc">Int</span>
<span class="n">scala</span><span class="o">></span> <span class="k">:</span><span class="kt">t</span> <span class="kt">runST</span><span class="o">(</span><span class="kt">Forall.of</span><span class="o">[</span><span class="kt">Lambda</span><span class="o">[</span><span class="kt">s</span> <span class="k">=></span> <span class="kt">ST</span><span class="o">[</span><span class="kt">s</span>, <span class="kt">STVar</span><span class="o">[</span><span class="kt">s</span>, <span class="kt">Int</span><span class="o">]]]](</span><span class="kt">newVar</span><span class="o">(</span><span class="err">33</span><span class="o">)))</span>
<span class="o"><</span><span class="n">console</span><span class="k">>:</span><span class="mi">19</span><span class="k">:</span> <span class="kt">error:</span> <span class="k">type</span> <span class="kt">mismatch</span><span class="o">;</span>
<span class="n">found</span> <span class="k">:</span> <span class="kt">Forall</span><span class="o">[[</span><span class="kt">s</span><span class="o">(</span><span class="kt">in</span> <span class="k">type</span> <span class="kt">Λ$</span><span class="o">)]</span>
<span class="kt">ST</span><span class="o">[</span><span class="kt">s</span><span class="o">(</span><span class="kt">in</span> <span class="k">type</span> <span class="kt">Λ$</span><span class="o">)</span>,
<span class="kt">STVar</span><span class="o">[</span><span class="kt">s</span><span class="o">(</span><span class="kt">in</span> <span class="k">type</span> <span class="kt">Λ$</span><span class="o">)</span>,<span class="kt">Int</span><span class="o">]]]</span>
<span class="n">required</span><span class="k">:</span> <span class="kt">Forall</span><span class="o">[[</span><span class="kt">α$0$</span><span class="o">(</span><span class="kt">in</span> <span class="k">type</span> <span class="kt">Λ$</span><span class="o">)]</span>
<span class="kt">ST</span><span class="o">[</span><span class="kt">α$0$</span><span class="o">(</span><span class="kt">in</span> <span class="k">type</span> <span class="kt">Λ$</span><span class="o">)</span>,
<span class="kt">STVar</span><span class="o">[</span><span class="k">_</span> <span class="k">>:</span> <span class="o">(</span><span class="kt">some</span> <span class="kt">other</span><span class="o">)</span><span class="kt">s</span><span class="o">(</span><span class="kt">in</span> <span class="k">type</span> <span class="kt">Λ$</span><span class="o">)</span> <span class="kt">with</span> <span class="o">(</span><span class="kt">some</span> <span class="kt">other</span><span class="o">)</span><span class="kt">α$0$</span><span class="o">(</span><span class="kt">in</span> <span class="k">type</span> <span class="kt">Λ$</span><span class="o">)</span>, <span class="kt">Int</span><span class="o">]]]</span>
</pre></div>
<p>Knowledgeable use of these tricks will give you much better code than
we could produce when Kmett wrote this, but it’s still nowhere near as
elegant or easy-to-use as rank-2 in Haskell.</p>
<blockquote>
<p>... but it runs on the JVM.</p>
</blockquote>
<p>Indeed, Scala still runs on the JVM.</p>
<h2 id="markdown-header-how-good-an-idea-is-it">How good an idea is it?</h2>
<p>So, a few things have gotten better, and a few things have gotten a
lot better. That bodes well, anyway.</p>
<p>Functional programming practice in Scala will continue to encounter
these issues for the foreseeable future. If you are writing Scala, you
<em>should</em> be practicing functional programming; the reliability
benefits are worth the price of entry. While you’re doing so, however,
it’s no thoughtcrime to occasionally feel like it’s a bad idea that
doesn’t scale.</p>
<p><em>This article was tested with Scala 2.12.4 <code>-Ypartial-unification</code>,
Scalaz 8 3011709ba, and kind-projector 0.9.4.</em></p>
<p><em>Portions Copyright © 2013 Edward Kmett, used with permission.</em></p>
<p><em>Copyright © 2017, 2018 Stephen Compall. This work is licensed under a
<a href="http://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International
License</a>.</em></p>
<h1>Writing about subtyping</h1>
<p><em>Stephen Compall, published 2018-02-10.</em></p>
<p>I want programmers to stop using subtyping, yet I keep bringing it up,
in article after article. Partly that is because it is very hard to
avoid subtyping-related issues in Scala, and I find myself concerned
with Scala when I ought to be devoting mental cycles to simpler, more
powerful languages. But that may simply feed into what I suppose is
the greater reason:</p>
<blockquote>
<p>Subtyping is an amusing puzzle for my mind. I enjoy the mental
diversion of the needlessly complex puzzle of practical programming
techniques making use of subtyping.</p>
</blockquote>
<p>I can justify this self-gratification by saying to myself, “the more
they read about what subtyping <em>really</em> means, the more their desire
to avoid this mess will grow”. I think this is more rationalization
than honest motivation for myself, though I do think those who learn
more about subtyping are more likely to avoid it, just as those who
learn more about typing are more likely to advocate it.</p>
<p>Yet, it <em>does</em> have a kind of beauty. So I take it out here and
there, and appreciate its facets for a while. Then I carefully return
it to its display case, lest I am afflicted with the subtyping bug
again.</p>
<h1>Spare me the tedium of “simple” linting, please</h1>
<p><em>Stephen Compall, published 2017-12-09, updated 2018-02-10.</em></p>
<p>In the development environments of singly-typed languages like
JavaScript and Python, one popular build step for improving code
quality is the “linting” or “style” tool. Tools like <a href="http://jshint.com/">jshint</a> and
<a href="https://github.com/PyCQA/pyflakes#pyflakes">pyflakes</a> point out misspellings and references to missing
variables, idioms that frequently lead to bugs like <code>==</code> usage, calls
with apparently the wrong number of arguments, and many other things.</p>
<p>Much of this is meant to mechanize the enforcement of a team’s style
guidelines—at their best, sincere and effective tactics for avoiding
common sources of bugs. Unfortunately, many of these guidelines can
seem excessively pedantic, forcing the programmer to deal with cases
that could not possibly happen.</p>
<p>Normally, it would make sense to tell the programmers to just suck it
up and follow the rules. However, this tactic can lead to a few bad
outcomes.</p>
<ol>
<li>The lint check can lose support among developers, for being more
trouble than it’s worth. If programmers feel that a lint is causing
more code quality issues than it’s solving, they’ll sensibly
support its removal.</li>
<li>Lints without developer support tend to be disabled sooner or
later, or simply mechanically suppressed at each point they would
be triggered, via “ignore” markers and the like. At that point, the
bug-catching benefits of the lint are completely eliminated; in the
worst case, you have a universally ignored warning, and are even
worse off than if the warning were simply disabled.</li>
<li>If the errors spotted by the lint are serious enough, but the lint
warns for too many exceptional cases, the developers might decide
to move the style rule to manual code review, with an opportunity
to argue for exceptional cases. This is labor-intensive, carries a
much longer feedback cycle, and makes it easy to accept an argument
for an exception to the style rule that is actually erroneous, as
it is not machine-checked.</li>
</ol>
<p>Many of these “too many false positives” warnings could be a lot
better if they simply had more information to work with about what
will happen at runtime. That way, they can avoid emitting the warning
where, according to the extra information, a construct that <em>appears</em>
dangerous will not be a problem in practice.</p>
<p>That is one thing that type systems are very good at. So lint users
are on the right track; the solution to their woes of needless ritual
is <em>more</em> static analysis, rather than less.</p>
<p>Let’s consider some common lints in JavaScript and see how the
knowledge derived from types can improve them, reducing their false
positive rate or simply making them more broadly useful.</p>
<h2 id="markdown-header-suspicious-truthiness-tests">Suspicious truthiness tests</h2>
<p>(This example has the benefit of <a href="https://medium.com/flow-type/linting-in-flow-7709d7a7e969#2677">a recent implementation</a> along the
semantic lines I describe, in Flow 0.52. I take no credit for any of
the specific lint suggestions I make in this article.)</p>
<p>In JavaScript, a common idiom for checking whether an “object” is not
null or undefined is to use its own “truthiness”.</p>
<div class="codehilite language-js"><pre><span></span><span class="k">if</span> <span class="p">(</span><span class="nx">o</span><span class="p">)</span> <span class="c1">// checking whether 'o' is defined</span>
<span class="k">if</span> <span class="p">(</span><span class="nx">o</span><span class="p">.</span><span class="nx">magic</span><span class="p">)</span> <span class="c1">// checking whether 'o's 'magic' property is defined</span>
</pre></div>
<p>This is concise and does the trick perfectly—if the value being tested
isn’t possibly a number, string, or boolean. If it can be only an
object, <code>null</code>, or <code>undefined</code>, then this is fine, because even the
empty object <code>{}</code> is truthy, while null and undefined are both
falsey. Unfortunately, in JavaScript, other classes of data have
“falsey” values among them, such as <code>0</code> for number.</p>
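<p>To make the pitfall concrete, here is a minimal sketch. The helper names <code>isPresentByTruthiness</code> and <code>isPresentByNullCheck</code> are invented for this illustration; they come from no library.</p>

```javascript
// Hypothetical helpers, named only for this illustration.
function isPresentByTruthiness(value) {
  return value ? true : false; // the concise idiom
}

function isPresentByNullCheck(value) {
  return value != null; // rejects only null and undefined
}

// For objects (even empty ones), null, and undefined, both tests agree.
// Numbers, strings, and booleans have falsey members, so the tests disagree.
const samples = [{}, null, undefined, 0, "", false];
for (const sample of samples) {
  console.log(
    JSON.stringify(sample),
    isPresentByTruthiness(sample),
    isPresentByNullCheck(sample)
  );
}
```

<p>The two helpers only differ on the last three samples, which is exactly the set of cases a reviewer has to rule out by hand when allowing the short form.</p>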
<p>The “lint” solution to this problem is to always compare to <code>null</code>
directly.</p>
<div class="codehilite language-js"><pre><span></span><span class="k">if</span> <span class="p">(</span><span class="nx">o</span> <span class="o">!=</span> <span class="kc">null</span><span class="p">)</span> <span class="c1">// not null or undefined</span>
<span class="k">if</span> <span class="p">(</span><span class="nx">o</span> <span class="o">!==</span> <span class="kc">null</span> <span class="o">&&</span> <span class="nx">o</span> <span class="o">!==</span> <span class="kc">undefined</span><span class="p">)</span> <span class="c1">// required in more pedantic code styles</span>
</pre></div>
<p>This might be encapsulated in a function, but still doesn’t approach
the gentle idiom afforded by exploiting objects’ universal
truthiness. But the object idiom is simply unsafe if the value could
<em>possibly</em> be something like a number. So the question of whether you
should allow this exception to the “always compare to <code>null</code>
explicitly” lint boils down to “can I be sure that this can only be an
object?” And if in review you decide “yes”, this is a decision that
must constantly be revisited as code changes elsewhere change the
potential types of the expression.</p>
<p>You want to mechanically rule out “possible bugs” that are not really
possible in <em>your</em> use of the idiom, so that the linter will not warn
about benign use of truthiness—it will save the warnings for code
where it could actually be a problem. Ruling out impossible cases so
that you only need cope with the possible is just the sort of
programming job where the type system fits in <em>perfectly</em>.</p>
<p>A type system can say “this expression is definitely an object, null,
or undefined”, and type-aware linting can use that information to
allow use of the truthiness test. If that data comes from an argument,
it can enforce that callers—wherever else in the program they might
be—will not violate the premise of its allowance by passing in
numbers or something else.</p>
<p>A type system can also say “this expression will definitely be an
object, <em>never</em> null”, <em>or vice versa</em>, thus taking the lint even
further—it can now tell the programmer that the <code>if</code> check is useless,
because it will always be truthy or falsey, respectively. This is just
the sort of premise that’s incredibly hard for humans to verify across
the whole program, continuously, but is child’s play for a
type-checker.</p>
<p>A type system such as Flow can even say “this expression could have
been something dangerous for the truthy test, like <code>number | object</code>,
but you ruled out the dangerous possibilities with earlier <code>if</code> tests,
so it’s fine here”. Manual exemption can regress in cases like this
with something as simple as reordering “if-else if-else” chains a
little carelessly—keep in mind here the decades of failure that “be
sufficiently careful” has had as a bug-avoidance tactic in
programming—but type-aware linting will catch this right away, waking
up to declare its previous exemption from the style rule null and
void.</p>
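<p>As a rough illustration of that reordering hazard, consider the two hypothetical functions below. Both assume their argument is a number, an object, <code>null</code>, or <code>undefined</code>; the names and the returned strings are made up for this sketch.</p>

```javascript
// Assumed input for both functions: a number, an object, null, or undefined.
function describeCareful(x) {
  if (typeof x === "number") {
    return "number " + x;
  }
  // Only object, null, and undefined remain, so the truthiness test is safe.
  if (x) {
    return "object";
  }
  return "nothing";
}

// The same branches, reordered a little carelessly.
function describeReordered(x) {
  if (x) {
    return typeof x === "number" ? "number " + x : "object";
  }
  // 0 and NaN land here along with null and undefined.
  return "nothing";
}

console.log(describeCareful(0), "/", describeReordered(0));
```

<p>In the second function the truthiness test runs while <code>x</code> may still be a number, which is the situation a type-aware lint is positioned to notice.</p>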
<p>The more precise your types in your program, the more understanding of
your use of this idiom—in only valid places—the type-aware linter will
be. It will not make the reasoning mistakes that human review would
make when allowing its use, so you can use it with more confidence,
knowing that the lint will only call out cases where there’s a genuine
concern of a non-null falsey value slipping in. And there is no need
to argue in code review about which cases need <code>!= null</code> and which
don’t, nor to revisit those decisions as the program evolves; as
circumstances change, the type checker will point out when the verbose
check becomes unnecessary, or when the succinct check becomes unsafe.</p>
<h2 id="markdown-header-references-to-undeclared-variables">References to undeclared variables</h2>
<p>It’s very common to mistake a variable name that isn’t defined for one
that is. The larger a program gets, the easier this mistake is to
make.</p>
<p>This mistake comes in a few forms. It may be a simple misspelling. It
may be a variable you thought was defined here, but is actually
defined somewhere else. It may be a variable defined later, but not
yet.</p>
<p>Whatever the meaning of the error, <a href="http://jshint.com/docs/options/#undef">linters can catch the problem</a>
via a relatively simple method. As the file is scanned, the linter
keeps track of what variables are currently in scope. When
encountering a variable reference, it checks the variable name against
its current working list of variables. If the name is not in the list, the
linter reports it.</p>
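<p>The scanning method just described can be sketched in a few lines. This is a toy model under loud assumptions: the “file” is a flat list of hypothetical events rather than real JavaScript source, hoisting is ignored, and <code>findUndeclared</code> is an invented name, not any linter’s API.</p>

```javascript
// Toy model: the "file" is a flat list of events, not real JavaScript source,
// and hoisting is ignored. All names here are hypothetical.
function findUndeclared(events) {
  const scopes = [new Set()]; // the innermost scope is last
  const undeclared = [];
  for (const event of events) {
    if (event.kind === "enter") {
      scopes.push(new Set());
    } else if (event.kind === "exit") {
      scopes.pop();
    } else if (event.kind === "declare") {
      scopes[scopes.length - 1].add(event.name);
    } else if (event.kind === "reference") {
      // A reference is fine if any enclosing scope declares the name.
      const known = scopes.some((scope) => scope.has(event.name));
      if (!known) {
        undeclared.push(event.name);
      }
    }
  }
  return undeclared;
}

const report = findUndeclared([
  { kind: "declare", name: "foo" },
  { kind: "enter" },
  { kind: "declare", name: "local" },
  { kind: "reference", name: "foo" },
  { kind: "reference", name: "fop" },
  { kind: "exit" },
  { kind: "reference", name: "local" },
]);
console.log(report);
```

<p>The sketch reports the misspelled <code>fop</code> and the out-of-scope <code>local</code>. Note that it only ever records names, never anything about what a name refers to.</p>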
<p>This is better than nothing. Compared to what you can do with a type
checker, though, it’s not very good at all.</p>
<p>Suppose that you have a few local functions defined at the top level
of your module, <code>foo</code>, <code>bar</code>, and <code>baz</code>. A linter will point out an
undeclared variable if you try to call <code>fop</code>, <code>gar</code>, or <code>bax</code>. So you
don’t have to wait for the browser reload or test cycle; you can
correct these errors right away.</p>
<p>Later on, your module is getting larger, so you decide to group some
functions into objects to clean up the top-level namespace. You decide
that <code>foo</code>, <code>bar</code>, and <code>baz</code> fit under the top-level object <code>q</code>.</p>
<div class="codehilite language-js"><pre><span></span><span class="kr">const</span> <span class="nx">q</span> <span class="o">=</span> <span class="p">{</span>
<span class="nx">foo</span><span class="p">(...)</span> <span class="p">{...}</span><span class="p">,</span>
<span class="nx">bar</span><span class="p">(...)</span> <span class="p">{...}</span><span class="p">,</span>
<span class="nx">baz</span><span class="p">(...)</span> <span class="p">{...}</span>
<span class="p">}</span>
</pre></div>
<p>During your refactoring, references to these three functions are
rewritten to <code>q.foo</code>, <code>q.bar</code>, and <code>q.baz</code>, respectively. This is a
nice way to avoid ad hoc name prefixes as a grouping mechanism; you’re
using a first-class language feature to do the grouping instead.</p>
<p>But let’s give a moment’s consideration to <code>q.fop</code>, <code>q.gar</code>, and
<code>q.bax</code>. The linter will verify that the reference to <code>q</code> is sound;
it’s declared at the module level as <code>const q</code>. However, the linter
will not then verify that the “member” references are sound; that is a
fact about the <em>structure</em> of <code>q</code>, not its mere <em>existence</em>.</p>
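<p>What that gap means at runtime can be seen in a small sketch. The function bodies here are stand-ins chosen for illustration, since the real ones are elided.</p>

```javascript
// Stand-in bodies; the real functions are elided in the text.
const q = {
  foo() { return "foo"; },
  bar() { return "bar"; },
  baz() { return "baz"; },
};

console.log(typeof q.foo); // a real member
console.log(typeof q.fop); // a misspelled member is simply missing

let failure = null;
try {
  q.fop(); // calling the missing member only fails when this line runs
} catch (error) {
  failure = error;
}
console.log(failure instanceof TypeError);
```

<p>The reference to <code>q</code> is perfectly valid, so a scope-based check has nothing to report; the mistake surfaces only when the call executes.</p>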
<p>When all you have is “simple” linting, this becomes a <em>tax on
modularity</em>, so to speak. If a variable is defined locally—<em>very</em>
locally—references to it are checked. If it is defined remotely,
whether in another module or simply “grouped” in a submodule,
references to it are not checked.</p>
<p>A type system cuts the modularity tax by tracking the names that are
beyond the purview of the simple linter. In the case of <code>q</code>,
type-checking tracks more than its existence; its statically-known
structural features are recorded as part of its <em>type</em>.</p>
<ul>
<li>In this module, <code>q</code> is defined.<ul>
<li>It is an object,</li>
<li>and is known to have properties:<ul>
<li>foo,<ul>
<li>a function which is…;</li>
</ul>
</li>
<li>bar,<ul>
<li>a function which is…, and</li>
</ul>
</li>
<li>baz,<ul>
<li>a function which is…</li>
</ul>
</li>
</ul>
</li>
</ul>
</li>
</ul>
<p>This continues to work at whatever depth of recursion you like. It
works across modules, too: if I want to call <code>baz</code> from another
module, I can import this module in that one, perhaps as <code>mm</code>, and
then the reference <code>mm.q.baz</code> will be allowed, but <code>mm.q.bax</code> flagged
as an error.</p>
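<p>One way to picture this bookkeeping, as a deliberately crude sketch: suppose the checker’s knowledge of the imported module is a nested record, with the string <code>"function"</code> standing in for the elided function types. Both <code>mmShape</code> and <code>firstUnknownMember</code> are hypothetical names; real type checkers represent and check types quite differently.</p>

```javascript
// Hypothetical record of what a checker might know about an imported module `mm`.
// The string "function" stands in for the elided function types.
const mmShape = {
  q: { foo: "function", bar: "function", baz: "function" },
};

// Walks a member path through the recorded structure; returns the first
// unknown name, or null when every step is known.
function firstUnknownMember(shape, path) {
  let current = shape;
  for (const name of path) {
    const isRecord = typeof current === "object" && current !== null;
    if (!isRecord || !Object.prototype.hasOwnProperty.call(current, name)) {
      return name;
    }
    current = current[name];
  }
  return null;
}

console.log(firstUnknownMember(mmShape, ["q", "baz"]));
console.log(firstUnknownMember(mmShape, ["q", "bax"]));
```

<p>Because the walk follows the recorded structure one member at a time, the same check applies at any nesting depth.</p>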
<p>An undeclared-variable linter is all well and good, but if you want to
take it to its logical conclusion, you need type checking.</p>
<h2 id="markdown-header-hasownproperty-guards"><code>hasOwnProperty</code> guards</h2>
<p>The best lints focus on elements of code hygiene that we’re reluctant to
practice faithfully while writing code, but that come back to fill us with
regret when we neglect them. One example arises with using <code>for</code> to
iterate over objects in JavaScript; <a href="http://jshint.com/docs/options/#forin">the lint checks</a> that you’re
guarding the loop with <code>hasOwnProperty</code>.</p>
<p>The purpose of these guards is to handle non-plain objects; that is to
say, objects “with class”. The guards are always suggested by the
linter to avoid nasty surprises should you try iterating over the
properties of an object “with class”.</p>
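<p>Here is a minimal sketch of the surprise the guard protects against. <code>Teapot</code>, <code>unguardedKeys</code>, and <code>guardedKeys</code> are hypothetical names. The example uses the older constructor-and-prototype style on purpose, because a method assigned to a prototype this way is enumerable, whereas methods declared with <code>class</code> syntax are not.</p>

```javascript
// A "classy" object in the pre-class style: `pour` lives on the prototype
// and is enumerable. (Methods declared with `class` syntax are not enumerable.)
function Teapot(capacity) {
  this.capacity = capacity;
}
Teapot.prototype.pour = function () {
  return "pouring";
};

function unguardedKeys(obj) {
  const keys = [];
  for (const key in obj) {
    keys.push(key);
  }
  return keys;
}

function guardedKeys(obj) {
  const keys = [];
  for (const key in obj) {
    if (Object.prototype.hasOwnProperty.call(obj, key)) {
      keys.push(key);
    }
  }
  return keys;
}

console.log(unguardedKeys(new Teapot(2)), guardedKeys(new Teapot(2)));
console.log(unguardedKeys({ a: 1, b: 2 }), guardedKeys({ a: 1, b: 2 }));
```

<p>For the plain object the two loops agree; only the object “with class” makes the unguarded loop pick up an inherited name.</p>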
<p>The irony of this check is that the intent of such code is usually to
work with plain objects only, that is, classy objects should not be
present in the first place! The construct is still perfectly usable
with classy objects; it’s just that more caution is called for when
using it in that fashion.</p>
<p>As such, there are two basic scenarios of <code>for</code> iteration over
objects.</p>
<ol>
<li>iteration over plain objects only, and</li>
<li>iteration over potentially classy objects.</li>
</ol>
<p>The focus of the <code>hasOwnProperty</code> guard ought to be #2, but this
concern bleeds over into #1 cases, needlessly or not depending on how
<a href="http://jshint.com/docs/options/#freeze">pessimistic</a> you are about <code>Object.prototype</code> extension. But this
question is moot for the linter, which can’t tell the difference
between the two scenarios in the first place.</p>
<p>By contrast, a type checker can make this distinction. It can decide
whether a variable is definitely a plain object, definitely not one,
or might or might not be one. With that information, depending on your
preferences, it could choose not to warn about a missing
<code>hasOwnProperty</code> guard if it knows it’s looking at scenario #1. So if
a <code>hasOwnProperty</code> is needless, it need not pollute your code for the
sake of the scenarios where it <em>will</em> be needed.</p>
<h2 id="markdown-header-computer-please-try-to-keep-up">Computer, please try to keep up</h2>
<p>Humans are pretty good at coming up with complex contractual
preconditions and postconditions for their programs, and making those
programs’ correct operation depend on the satisfaction of those
contracts. But humans are very bad at verifying those contracts, even
as they make them more complex and depend more on them.</p>
<p>“Simple” linting tools are good at checking for the fulfillment of
very simple rules to help ensure correctness, freeing the human mind
to focus on the more complex cases. What makes them a chore to deal
with—a reason to put solving linter warnings on the “technical debt
backlog” instead of addressing them immediately—is “<em>they don’t know
what we know</em>”; they can handle laborious low-level checks, but lack
the sophisticated analysis humans use to decide the difference between
a correct program and an incorrect one.</p>
<p>Happily, through static analysis, we can get much closer to
human-level understanding’s inferences about a program, while
preserving the tireless calculation of contractual fulfillment that
makes linting such a helpful companion in the development cycle, doing
the parts that humans are terrible at. When your linter is so “simple”
that it becomes a hindrance, it’s time to put it down, and pick up a
type system instead.</p>Stephen Compallhttp://www.blogger.com/profile/13346366374080232972noreply@blogger.com0tag:blogger.com,1999:blog-1184549185438247550.post-9173088284055972672017-11-28T21:53:00.001-05:002018-02-10T15:57:06.359-05:00Or, we could not, and say we don’t have to<p>I previously wrote <a href="https://typelevel.org/blog/2015/07/30/values-never-change-types.html">“Values never change types”</a>, whose central
thesis statement I hope is obvious. And this still holds, but there is
something I left unsaid: values do not have identity, so the notion of
“mutating” them is as nonsensical as “mutating 4”. And the formal
system of Scala types treats objects with identity similarly, by not
permitting them or their variable aliases to change type, even though
they are not quite values. But this is a design decision, and other
choices could have been made.</p>
<p>There are very good reasons <em>not</em> to make other choices, though. Other
type systems come with features that come very close to making the
opposite design choice; by imagining that they went just a little
farther down this garden path, we can see what might have been.</p>
<h2 id="markdown-header-refinement-flow-or-occurrence-typing-by-any-name">Refinement, flow, or occurrence typing, by any name</h2>
<p>In <a href="https://flow.org/en/docs/lang/refinements/">Flow</a> and <a href="http://www.typescriptlang.org/docs/handbook/advanced-types.html#type-guards-and-differentiating-types">TypeScript</a>, when you test properties of a value in
a variable, you can “change the type” of that variable. For example,
you could have a <code>let s: any</code>; if you write an <code>if</code> block that tests
whether <code>s</code> is a string, the type of <code>s</code>—at compile-time, mind
you—“changes” to <code>string</code> <strong>within the <code>if</code> body</strong>. Within the body of
that <code>if</code>, you could perform further tests to <em>refine</em> <code>s</code>’s type
further; you might also have various other <code>if</code> blocks alongside
checking for other types, so that <code>s</code> might variously “change” into
number, function, object with a <code>whatsit</code> property whose type is
another object with a <code>whosit</code> property, and so on.</p>
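<p>The runtime tests that drive this kind of refinement look like the following sketch. It is plain JavaScript, so nothing is checked at compile time here; the comments only note what a checker such as Flow or TypeScript could assume in each branch. The function name <code>describe</code> and its return strings are invented for illustration.</p>

```javascript
// Each comment states what a refining checker could assume about `s` there;
// plain JavaScript only runs the tests.
function describe(s) {
  if (typeof s === "string") {
    return "string of length " + s.length; // s: string
  }
  if (typeof s === "number") {
    return "number " + s.toFixed(1); // s: number
  }
  if (typeof s === "function") {
    return "function of " + s.length + " parameters"; // s: function
  }
  if (typeof s === "object" && s !== null) {
    const whatsit = s.whatsit; // s: some object
    if (typeof whatsit === "object" && whatsit !== null && "whosit" in whatsit) {
      return "object whose whatsit has a whosit";
    }
  }
  return "something else";
}

console.log(describe("tea"));
console.log(describe({ whatsit: { whosit: 1 } }));
```

<p>Every use of <code>s</code> sits inside the block whose test justifies it, which is what lets the refined type follow the lexical structure.</p>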
<p>So, instead of having a single type attached to a lexical variable
over its entire scope, a variable has <em>several</em> types, each tied to a
block of code that <em>uses</em> the variable. It is an order more
sophisticated, but still tied to the lexical structure of the program,
as if the variable has multiplied to honor all the faces it might
have. This is a great way to model <em>how people are writing programs</em>
at the type level without overly complicating the formal system, which
still must always obey a complete set of sound rules.</p>
<h2 id="markdown-header-contradictory-refinement">Contradictory refinement</h2>
<p>In the systems I’ve described, no further refinement can contradict a
prior one. So once you determine that a variable is a <code>string</code>, it’s
not going to turn out to be a number later; at most, it can get more
specific, like being proven to be a member of a known static set of
strings. So this way you know that inner blocks cannot know less than
outer blocks about the nature of a variable; that is what I mean by
“tied to the lexical structure.”</p>
<p>What “real JavaScript code” could be written that would violate this
assumption?</p>
<div class="codehilite language-javascript"><pre><span></span><span class="kd">function</span> <span class="nx">foo</span><span class="p">(</span><span class="nx">arg</span><span class="p">)</span> <span class="p">{</span>
<span class="kd">let</span> <span class="nx">s</span> <span class="o">=</span> <span class="nx">arg</span>
<span class="k">if</span> <span class="p">(</span><span class="k">typeof</span> <span class="nx">s</span> <span class="o">===</span> <span class="s1">'string'</span><span class="p">)</span> <span class="p">{</span> <span class="c1">// refines s to type string</span>
<span class="nx">s</span> <span class="o">=</span> <span class="p">{</span><span class="nx">teaTime</span><span class="o">:</span> <span class="nx">s</span><span class="p">}</span>
<span class="c1">// "point of no return"</span>
<span class="k">if</span> <span class="p">(</span><span class="nx">s</span><span class="p">.</span><span class="nx">teaTime</span> <span class="o">===</span> <span class="s2">"anytime"</span><span class="p">)</span> <span class="p">{</span>
<span class="nx">drinkTea</span><span class="p">(</span><span class="nx">s</span><span class="p">)</span>
<span class="p">...</span>
</pre></div>
<p>The first <code>if</code> test establishes a block in which <code>s</code>’s type is
<code>string</code>. Then we pull the rug out from under the type-checker by
assigning to <code>s</code>; with that assignment, it is no longer true that <code>s</code>
is a string. Why does this make type-checking more complex?</p>
<h2 id="markdown-header-lets-twist-and-tangle-the-program-to-support-our-beloved-mutation">Let’s twist and tangle the program to support our beloved mutation</h2>
<p>The type of the variable <code>s</code> no longer follows the block structure of
the program, in the way we usually perceive blocks in a structured
program. That’s because the fact established by the outer <code>if</code> test is
suddenly invalidated partway through the block. So our first problem
is one of raising the complexity burden on human interpretation of the
program—the reader can no longer assume that the specificity of a
variable’s type only increases as you move inward, reading the block
structure of the program—but it is not fatal in itself, at least for
this example. We can salvage the model via the Swiss Army knife of
semantic analysis, the continuation-passing style (CPS) transform.</p>
<div class="codehilite language-javascript"><pre><span></span><span class="kd">function</span> <span class="nx">foo</span><span class="p">(</span><span class="nx">arg</span><span class="p">,</span> <span class="nx">k0</span><span class="p">)</span> <span class="p">{</span>
<span class="k">return</span> <span class="p">(</span><span class="nx">s</span> <span class="p">=></span> <span class="k">if</span> <span class="p">(</span><span class="k">typeof</span> <span class="nx">s</span> <span class="o">===</span> <span class="s1">'string'</span><span class="p">)</span> <span class="p">{</span>
<span class="p">(</span><span class="nx">_</span> <span class="p">=></span> <span class="k">if</span> <span class="p">(</span><span class="nx">s</span><span class="p">.</span><span class="nx">teaTime</span> <span class="o">===</span> <span class="s2">"anytime"</span><span class="p">)</span> <span class="p">{</span>
<span class="nx">drinkTea</span><span class="p">(</span><span class="nx">s</span><span class="p">,</span> <span class="nx">k0</span><span class="p">)</span>
<span class="p">}</span>
<span class="p">)(</span><span class="nx">s</span> <span class="o">=</span> <span class="p">{</span><span class="nx">teaTime</span><span class="o">:</span> <span class="nx">s</span><span class="p">})</span>
<span class="p">}</span>
<span class="p">)(</span><span class="nx">arg</span><span class="p">)</span>
</pre></div>
<p>Now it is still possible for inner blocks to contradict outer blocks,
but at least this is only possible at the block level. So, by “merely”
revisualizing our programs in terms of the flow of continuations
rather than the visibly apparent block structure, we can sort of still
think of the type of variables as a “block”-level concern, as it was
before.</p>
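<p>For readers who want something to run, here is a variation on the same idea rather than a literal rendering of the transform. It swaps the assignment for a fresh parameter that shadows the outer <code>s</code>, so each binding has one type for its whole block. The <code>drinkTea</code> body and the choice to pass <code>undefined</code> to the continuation on the other paths are assumptions made for this sketch.</p>

```javascript
// Hypothetical stand-in for drinkTea, in continuation-passing style.
function drinkTea(tea, k) {
  return k("drank tea at " + tea.teaTime);
}

function foo(arg, k0) {
  return ((s) => {
    if (typeof s === "string") {
      // Instead of assigning to s, pass the new value to a continuation whose
      // parameter shadows the outer s. Inside it, s is only ever an object.
      return ((s) => {
        if (s.teaTime === "anytime") {
          return drinkTea(s, k0);
        }
        return k0(undefined);
      })({ teaTime: s });
    }
    return k0(undefined);
  })(arg);
}

console.log(foo("anytime", (result) => result));
```

<p>The contradiction between “<code>s</code> is a string” and “<code>s</code> is an object” now falls exactly on a function boundary, which is the block-level property described above.</p>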
<p>Unluckily, performing a CPS transform in your head with all the code
you see is a kind of “<a href="https://www.schoolofhaskell.com/user/edwardk/bound#de-bruijn-indices">reverse Turing Test</a>”, something that an AI
would have to not be able to do very well in order to fool us into
thinking it was human. So no matter what, we are stuck with a
significant new complication in our formal model.</p>
<p>But not a fatal one. Yet.</p>
<h2 id="markdown-header-loop-unrolling-according-to-the-phase-of-the-moon">Loop unrolling according to the phase of the moon</h2>
<p>What will prove fatal to the formal model of type-changing mutation is
delay. Let us see how the seeds of our destruction have already been
sown.</p>
<div class="codehilite language-javascript"><pre><span></span><span class="k">while</span> <span class="p">(</span><span class="nx">forSomeTimeNow</span><span class="p">())</span> <span class="p">{</span>
<span class="nx">log</span><span class="p">(</span><span class="nx">s</span><span class="p">.</span><span class="nx">substring</span><span class="p">(</span><span class="mi">1</span><span class="p">))</span>
<span class="k">if</span> <span class="p">(</span><span class="nx">itsAFullMoon</span><span class="p">())</span> <span class="p">{</span>
<span class="nx">s</span> <span class="o">=</span> <span class="p">{</span><span class="nx">teaTime</span><span class="o">:</span> <span class="nx">s</span><span class="p">}</span>
<span class="p">}</span>
<span class="nx">drinkTea</span><span class="p">(</span><span class="nx">s</span><span class="p">)</span>
<span class="p">}</span>
</pre></div>
<p>The first question, and the last, is “is it safe to <code>drinkTea</code>?”</p>
<p>One necessary prerequisite is that it has been a full moon. I’m using
this boolean expression to inject the Halting Problem—we cannot
determine in a Turing-complete language whether a boolean expression
will evaluate to true or false, generally—but it is probably
sufficient to say that it is nondeterministic, even if not
Turing-complete. (<a href="/2017/07/advanced-type-system-features-are.html">Pragmatists</a> love Turing completeness and
nondeterminism, because “more power is always better”.) So it’s hard
enough—by which I mean generally impossible—to say whether the <code>s</code>
assignment has happened.</p>
<p>The next prerequisite, which should drive us wholly into despair now
if hope yet remains, is that the moon has been full <em>once</em>. Eh? Here’s
where the tie of variable types to any semblance of code structure
breaks down completely, because <code>s</code> takes on a surprisingly large
number of types in this code sample.</p>
<p>To assign a precise type to this program, we have to accurately model
what is happening in it. So, suppose that prior to entering this loop,
the type of <code>s</code> is <code>string</code>. Each assignment to <code>s</code>—made each time the
<code>while</code>’s test is true and the moon is full—takes us one step down
this list of types.</p>
<ol>
<li>string</li>
<li>{teaTime: string}</li>
<li>{teaTime: {teaTime: string}}</li>
<li>{teaTime: {teaTime: {teaTime: string}}}</li>
<li>{teaTime: {teaTime: {teaTime: {teaTime: string}}}}</li>
<li>•••</li>
</ol>
<p>Now if you want to assign (potentially) all of these infinite
possibilities to the program, you have to go even further from the
block structure model. Imagine a third dimension of the program text: at
the surface, you see <code>s</code> having only the first two types above, but as
you look deeper, you see the branching possibilities—oh so many
iterations in, oh so many times the moon has been full—each assigning
different types to what is on the surface the same code. Looking at
this as two-dimensional text, you would only see the infinite
superimposition of all possible types of <code>s</code>, weighted according to
their probability.</p>
<p>Three dimensions might be too few for this code.</p>
<blockquote>
<p>Of course, there’s a well-known, sensible way to type this code,
<em>sans</em> the <code>log</code> call: abandon the folly of modeling mutation and
assign this recursive union type to <code>s</code> for at least the whole scope
of the <code>while</code> loop, if not an even wider scope:</p>
<p><code>type TeaTimeTower = string | {teaTime: TeaTimeTower}</code></p>
<p>And supposing the <code>drinkTea</code> function is so polymorphic, all is
well, and as a neat side bonus, easy to understand. But we aren’t
here to pursue sanity; we gave that up to try to model mutation.</p>
</blockquote>
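<p>For the record, here is roughly what that sane alternative looks like in TypeScript. Treat it as a sketch only: the body of <code>drinkTea</code>, the <code>run</code> helper, and the idea of passing the moon phases in as an array are all assumptions, made so the example is self-contained and deterministic. None of them come from the loop above.</p>

```typescript
// The recursive union type from the note above.
type TeaTimeTower = string | { teaTime: TeaTimeTower };

// An assumed, suitably polymorphic drinkTea: it peels off however many
// {teaTime: ...} layers have accumulated and reports how many there were.
function drinkTea(s: TeaTimeTower): number {
  let depth = 0;
  let cur: TeaTimeTower = s;
  while (typeof cur !== "string") {
    cur = cur.teaTime;
    depth++;
  }
  return depth;
}

// The loop body, minus the log call, with moon phases supplied as data
// (hypothetical stand-in for itsAFullMoon()).
function run(moons: boolean[], start: string): number[] {
  let s: TeaTimeTower = start;
  const depths: number[] = [];
  for (const full of moons) {
    if (full) {
      s = { teaTime: s };
    }
    depths.push(drinkTea(s));
  }
  return depths;
}
```

<p>The declared type of <code>s</code> never changes here; every assignment merely picks a member of the one union it was given up front.</p>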
<h2 id="markdown-header-the-devil-in-the-delay">The devil in the delay</h2>
<p>If fully desugared, <code>while</code> is a two-argument (not counting still
thinking in CPS) higher-order function, taking <code>test</code> and <code>body</code>
functions as arguments. Just like you’re writing Smalltalk.</p>
<div class="codehilite language-javascript"><pre><span></span><span class="k">while</span><span class="p">(()</span> <span class="p">=></span> <span class="nx">forSomeTimeNow</span><span class="p">(),</span>
      <span class="p">()</span> <span class="p">=></span> <span class="p">{</span>
        <span class="nx">log</span><span class="p">(</span><span class="nx">s</span><span class="p">.</span><span class="nx">substring</span><span class="p">(</span><span class="mi">1</span><span class="p">))</span>
        <span class="k">if</span><span class="p">(</span><span class="nx">itsAFullMoon</span><span class="p">(),</span>
           <span class="p">()</span> <span class="p">=></span> <span class="nx">s</span> <span class="o">=</span> <span class="p">{</span><span class="nx">teaTime</span><span class="o">:</span> <span class="nx">s</span><span class="p">})</span>
        <span class="nx">drinkTea</span><span class="p">(</span><span class="nx">s</span><span class="p">)</span>
      <span class="p">})</span>
</pre></div>
<p>The thing that makes so much trouble for flow analysis is this
delay. Type-changing requires us to contradict earlier refinements of
a variable’s type, not simply refine them further. But the ability to
capture a reference to a variable in a lambda means that we need a
<em>deep</em> understanding of how that lambda will be used. It might never
be invoked. It might be invoked later in the function, just when we
thought it was safe to contradict whatever refinement it was
type-checked with. It might be saved off in another variable or data
structure elsewhere in the program, making reasoning about when the
variable might be referenced in the future a futile endeavor.</p>
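<p>A small TypeScript sketch of the “invoked later” case may help. The <code>any</code> annotation is an assumption: it stands in for a hypothetical checker that allows <code>s</code> to change type, since TypeScript itself will not. The names <code>demoDelay</code> and <code>logLater</code> are likewise invented for illustration.</p>

```typescript
function demoDelay(): { early: string; lateFailed: boolean } {
  // `any` plays the part of a checker that permits type-changing mutation.
  let s: any = "oolong";

  // Captured while s is a string; checked against that refinement,
  // the body looks perfectly safe.
  const logLater = (): string => s.substring(1);

  // Invoked right away, the refinement still holds.
  const early = logLater();

  // The type-changing assignment contradicts what logLater was checked with.
  s = { teaTime: s };

  // Invoked after the assignment, s.substring is no longer a function.
  let lateFailed = false;
  try {
    logLater();
  } catch (e) {
    lateFailed = e instanceof TypeError;
  }
  return { early, lateFailed };
}
```

<p>Nothing about the text of <code>logLater</code> changed between the two calls; only the moment of invocation did.</p>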
<p>Doing flow analysis with sources of delayed execution whose behavior
is 100% known, like <code>if</code> and <code>for</code>, is tricky enough. Doing it in the
presence of unknown, novel, potentially <em>nondeterministic</em> sources
of delay is <strong>intractable, if not impossible</strong>.</p>
<p>And that’s for the computer. How many dimensions does the model in
your head have, now? Zero, no, <em>negative</em> points for abandoning this
ivory-tower static analysis and declaring “common sense” the arbiter
of your programs’ correctness.</p>
<h2 id="markdown-header-did-anyone-else-see-you-come-here">Did anyone else see you come here?</h2>
<p>An object with known properties can be thought of as a group of named
variables. This is a longtime, straightforward way to represent
modules of functions, or clean up a global namespace by putting a lot
of related items lying on the floor into the same drawer.</p>
<p>Since we <strong>love</strong> mutation, and we love mutating variables, we should
love <strong>mutating object properties (and their types) even more,
right?</strong></p>
<div class="codehilite language-javascript"><pre><span></span><span class="kd">function</span> <span class="nx">schedule</span><span class="p">(</span><span class="nx">s</span><span class="p">)</span> <span class="p">{</span>
  <span class="nx">s</span><span class="p">.</span><span class="nx">teaTime</span> <span class="o">=</span> <span class="s2">"anytime"</span>
<span class="p">}</span>
</pre></div>
<p>The type of <code>s</code> in the caller after <code>schedule</code> finishes is
straightforward: it’s whatever it was before, except that the type of
the <code>teaTime</code> field (whatever it might have been before, if anything)
is set to string, or perhaps the literal singleton type <code>"anytime"</code>.</p>
<p>But what <code>schedule</code> is so eager to forget will not be so easily
forgotten by the rest of the program.</p>
<p>Namely, the <em>contradicted, earlier type of</em> <code>s</code> is very hard to
reliably eradicate. This is an <strong>aliasing</strong> problem, and it brings
the excitement of data races in shared-mutable-state multithreaded
programs to the seemingly prosaic JavaScript execution model.</p>
<p>To type <code>s</code> in the aftermath of <code>schedule</code>, you must perfectly answer
the question, “who has a reference to <code>s</code>”? Suppose that <code>teaTime</code> was
a now-contradicted function. Any code that calls that function via
property lookup on its reference now takes on another dimension:
before <code>schedule</code> executes, it is safe, but afterwards it no longer
is, so it takes on the prerequisite “can only be called before calling
<code>schedule(s)</code>.” The dimensional multiplication directly results from
the multiplication of possible types for <code>s</code>.</p>
<p>The problem broadens virally when you try to model other variables
that are <em>not</em> <code>s</code>, but whose types will still change due to
<code>schedule</code> being called! Here is an example of such a variable.</p>
<div class="codehilite language-javascript"><pre><span></span><span class="kr">const</span> <span class="nx">drinks</span> <span class="o">=</span> <span class="p">{</span><span class="nx">coffee</span><span class="o">:</span> <span class="nx">coffeeMod</span><span class="p">,</span> <span class="nx">tea</span><span class="o">:</span> <span class="nx">s</span><span class="p">}</span>
<span class="c1">// where s is the value we’re talking about</span>
</pre></div>
<p>So all the analysis of references to <code>s</code> induced by the type mutation
means references to <code>drinks</code> must undergo the same ordeal. And
references to something that refers to <code>drinks</code>, and references to
something that refers to <em>that</em>, and so on, ad infinitum.</p>
<p>And that is assuming we can statically determine what object
identities will be flying around the program. As with so much else in
this article, this is generally impossible.</p>
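<p>To make the aliasing hazard concrete, here is a hedged TypeScript sketch. Again <code>any</code> is an assumption standing in for a checker that lets property types change, and <code>demoAlias</code>, the <code>"drip"</code> placeholder for <code>coffeeMod</code>, and the original function-typed <code>teaTime</code> are all hypothetical.</p>

```typescript
function demoAlias(): { before: string; afterFailed: boolean } {
  // s starts out with teaTime as a function.
  const s: any = { teaTime: () => "four o'clock" };

  // drinks is not s, but drinks.tea is an alias for it.
  const drinks = { coffee: "drip", tea: s };

  // The type-changing mutation: teaTime goes from function to string.
  function schedule(x: any): void {
    x.teaTime = "anytime";
  }

  // Safe only before schedule(s) has run.
  const before: string = drinks.tea.teaTime();

  schedule(s);

  // The same expression, reached through the alias, now fails.
  let afterFailed = false;
  try {
    drinks.tea.teaTime();
  } catch (e) {
    afterFailed = e instanceof TypeError;
  }
  return { before, afterFailed };
}
```

<p>Note that <code>schedule</code> never mentions <code>drinks</code>, yet every use of <code>drinks.tea.teaTime</code> acquires the prerequisite all the same.</p>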
<blockquote>
<p>By the way, the problem with lambdas is just a special case of this
one; it’s exactly that lambdas <em>alias</em> variables that causes so much
grief for our wannabe-mutating ventures.</p>
</blockquote>
<h2 id="markdown-header-a-different-kind-of-power-at-a-much-more-reasonable-price">A different kind of power, at a much more reasonable price</h2>
<p>Since we are only imagining this insanity, not attempting to truly
partake of it, we have something to feel good about, after all: the
grass is really quite brown on the other side, alas, but that on
<em>this</em> side is a little greener than first glance might indicate.</p>
<p>Type systems like those of Haskell, Java, OCaml, Scala, and many other
languages simply don’t permit the types of variables to change. When
you consider the introduction of type equalities in Haskell or Scala
GADTs, or more direct refinements made in <a href="https://docs.racket-lang.org/ts-guide/occurrence-typing.html">Typed Racket</a>, Flow, or
TypeScript, you can include all of these languages in the broader
category of type systems <strong>whose variable types can only complement,
never contradict.</strong></p>
<p>This is a powerful simplifying assumption, because under this
restriction, <em>none of the above problems matter</em>. “Functional types”
are not only powerful enough to model programs, and far easier to
understand for the human programmer, they are the only way out of the
quagmire of complexity wrought by trying to “model mutation”. Even
problems that almost look amenable to mutation analysis, like the <code>while</code>
example above, admit a simpler solution in an immutable type like the
recursive <code>TeaTimeTower</code>.</p>
<p>More power is sometimes worse.</p>
<p>When you forbid unneeded capabilities, you get back capabilities in
other areas. Sometimes this comes in the form of improved
understanding, such as we get for large programs by introducing type
restrictions. It makes sense to give up “power” that is not practical
to get benefits that are.</p>
<p>Take the forbidding of mutation. We take type-level immutability for
granted in the same way that many practitioners take value-level
<em>mutability</em> for granted. Perhaps one reason for resistance to
functional programming might be that we are so accustomed to the
drawbacks of unconstrained mutability that it does not seem quite as
insane at the value level as, seen above, it is at the type level.</p>
<p>But, familiarity cannot make the insane any less so.</p>
<p><em>This article was tested with Flow 0.57.2 and TypeScript 2.5.</em></p>
<h1>Advanced type system features are [usually] a good thing</h1>
<p>Stephen Compall, 2017-07-27</p>
<blockquote>
<p><em>The desire to allow more programs to be typed—by assigning more
accurate types to their parts—is the main force driving research in
the field.</em> – Benjamin Pierce, <em>Types and Programming Languages</em></p>
</blockquote>
<p>Type system design is a major factor in whether you can write
programs, and how easily you can do so. Simplicity is an important
consideration, but that entails a trickier question to start: “what is
simplicity?”</p>
<p>In this article, I want to consider two questions about type system
simplicity by comparing two—relatively good as these things go—type
systems, Haskell’s and TypeScript’s.</p>
<p>First: <strong>what dangers come from an excess of pragmatism in type system
design?</strong> By <em>pragmatism</em>, I mean the elevation of design compromise
as the highest virtue. The pragmatist seeks to compromise even those
“pure” designs that are entirely suitable when considering the
practical constraints in play.</p>
<p>I don’t use this word, ‘pragmatist’, because I think it’s nice and
accurate. Nor do I think that it’s fair to those who don’t fit the
definition I’ve just given, yet still think of themselves as
“pragmatic” in the broader sense. I use this word because the people
I describe have claimed the mantle of “pragmatism”, and reserved it
for themselves, quite successfully in the world of programming.
And first, we must name the enemy.</p>
<p>Second: <strong>what is so compelling about advanced type system features?</strong>
New type systems are often beset with users requesting features like
rank-N types, higher-kinded types, GADTs, existential quantification,
&c. There are good, practical reasons these features are requested;
they result in a kind of “simplicity” that cannot be had simply by
having a small number of features.</p>
<h2 id="markdown-header-an-unsound-feature-in-typescript">An unsound feature in TypeScript</h2>
<p>Function parameters are a contravariant position; contravariance and
invariance are the only sound choices for them. So TypeScript’s
“<a href="http://www.typescriptlang.org/docs/handbook/type-compatibility.html#function-parameter-bivariance">function parameter bivariance</a>” is a deliberately unsound choice;
if you’re unfamiliar with it, I strongly recommend stopping now and
reading the linked documentation, along with the explanation of why
they do it; it’s a good piece of documentation, describing an
eminently practical circumstance in which it might be used.</p>
<p>However, this example is worth examining more closely. Think about it
from the perspective of a type system designer: how would you support
the call to <code>addCallback</code> below?</p>
<div class="codehilite language-typescript"><pre><span></span><span class="kr">enum</span> <span class="nx">EventFlag</span> <span class="p">{</span>
  <span class="nx">MousePress</span><span class="p">,</span>
  <span class="nx">KeyPress</span>
<span class="p">}</span>
<span class="kr">interface</span> <span class="nx">Event</span> <span class="p">{</span><span class="cm">/*...*/</span><span class="p">}</span>
<span class="kr">interface</span> <span class="nx">MouseEvent</span> <span class="kr">extends</span> <span class="nx">Event</span> <span class="p">{</span><span class="cm">/*...*/</span><span class="p">}</span>
<span class="kr">interface</span> <span class="nx">KeyEvent</span> <span class="kr">extends</span> <span class="nx">Event</span> <span class="p">{</span><span class="cm">/*...*/</span><span class="p">}</span>
<span class="kd">function</span> <span class="nx">addCallback</span><span class="p">(</span>
  <span class="nx">flag</span>: <span class="kt">EventFlag</span><span class="p">,</span>
  <span class="nx">callback</span><span class="o">:</span> <span class="p">(</span><span class="nx">e</span>: <span class="kt">Event</span><span class="p">)</span> <span class="o">=></span> <span class="k">void</span><span class="p">)</span><span class="o">:</span> <span class="k">void</span> <span class="p">{</span>
  <span class="c1">// ...</span>
<span class="p">}</span>
<span class="nx">addCallback</span><span class="p">(</span><span class="nx">EventFlag</span><span class="p">.</span><span class="nx">MousePress</span><span class="p">,</span> <span class="p">(</span><span class="nx">e</span>: <span class="kt">MouseEvent</span><span class="p">)</span> <span class="o">=></span>
  <span class="p">{</span> <span class="p">}</span> <span class="c1">// handle e</span>
<span class="p">);</span>
</pre></div>
<h2 id="markdown-header-the-temptation-of-pragmatism">The temptation of pragmatism</h2>
<p>TypeScript’s design choice to support this sort of call is unsound.
This is explained by the documentation; again, please refer to that if
you haven’t yet.</p>
<p>There is always the temptation to poke a hole in the type system when
dealing with the problem, “how do I express this?” That’s because you
can then do something you <em>want</em>, without having gone through the
bother of proving that it’s safe. “You can’t do whatever you feel like
doing” is <em>exactly</em> what a sound type system <em>must</em> say. The benefits
of soundness diffuse across your program, filling in the negative
space of the tests that you no longer need to write; they can seem far
away when confronted with a problem <em>here</em> to be solved <em>now</em>.</p>
<p>In this way, unsound features are the greatest ally of the pragmatist.
They’re an asymmetric weapon, because sound features can never say
“just do what you like, here; don’t worry about the distant
consequences”.</p>
<p>We who have a strong distaste for pragmatism must make do instead with
research.</p>
<h2 id="markdown-header-a-sound-alternative-in-haskell">A sound alternative, in Haskell</h2>
<p>Haskell is a testbed for many advanced type system features,
demarcated by extension flags. One of the joys of working with
Haskell is <a href="https://ocharles.org.uk/blog/pages/2014-12-01-24-days-of-ghc-extensions.html">learning about a new extension</a>, what it means, and
thinking of ways to use it.</p>
<p>Many of these features are guarded by an extension flag; we’re going
to call on one such feature by placing at the top of the Haskell
source file</p>
<div class="codehilite language-haskell"><pre><span></span><span class="cm">{-# LANGUAGE GADTs #-}</span>
</pre></div>
<p>One of the things this enables is that you can attach type parameters
to enum members. <code>EventFlag</code> gets a type parameter indicating the
associated type of event.</p>
<div class="codehilite language-haskell"><pre><span></span><span class="kr">data</span> <span class="kt">EventFlag</span> <span class="n">e</span> <span class="kr">where</span>
  <span class="kt">MousePress</span> <span class="ow">::</span> <span class="kt">EventFlag</span> <span class="kt">MouseEvent</span>
  <span class="kt">KeyPress</span> <span class="ow">::</span> <span class="kt">EventFlag</span> <span class="kt">KeyEvent</span>
<span class="c1">-- MouseEvent and KeyEvent can be</span>
<span class="c1">-- related types, but don't have to be</span>
<span class="kr">data</span> <span class="kt">MouseEvent</span> <span class="ow">=</span> <span class="c1">-- ...</span>
<span class="kr">data</span> <span class="kt">KeyEvent</span> <span class="ow">=</span> <span class="c1">-- ...</span>
<span class="nf">addCallback</span> <span class="ow">::</span> <span class="kt">EventFlag</span> <span class="n">e</span>
            <span class="ow">-></span> <span class="p">(</span><span class="n">e</span> <span class="ow">-></span> <span class="kt">IO</span> <span class="nb">()</span><span class="p">)</span>
            <span class="ow">-></span> <span class="kt">IO</span> <span class="nb">()</span>
</pre></div>
<p><code>e</code> is a type parameter; when you pass an <code>EventFlag</code> to
<code>addCallback</code>, the callback type (<code>e -> IO ()</code> above) changes to
reflect what event type is expected.</p>
<div class="codehilite language-haskell"><pre><span></span><span class="nf">λ</span><span class="o">></span> <span class="kt">:</span><span class="n">t</span> <span class="n">addCallback</span> <span class="kt">MousePress</span>
<span class="p">(</span><span class="kt">MouseEvent</span> <span class="ow">-></span> <span class="kt">IO</span> <span class="nb">()</span><span class="p">)</span> <span class="ow">-></span> <span class="kt">IO</span> <span class="nb">()</span>
<span class="nf">λ</span><span class="o">></span> <span class="kt">:</span><span class="n">t</span> <span class="n">addCallback</span> <span class="kt">KeyPress</span>
<span class="p">(</span><span class="kt">KeyEvent</span> <span class="ow">-></span> <span class="kt">IO</span> <span class="nb">()</span><span class="p">)</span> <span class="ow">-></span> <span class="kt">IO</span> <span class="nb">()</span>
</pre></div>
<p>This is a better design in two ways.</p>
<ol>
<li><em>It is sound</em>; you cannot screw up the relationship between the
<code>EventFlag</code> argument and the event type that will be passed to the
callback.</li>
<li><em>It is more convenient</em>; if you pass a lambda as the callback
argument, it will simply “know” that the argument type is
<code>KeyEvent</code> or <code>MouseEvent</code>; your editor’s coding assistance can act
accordingly, without you having to declare the lambda’s argument
type at all.</li>
</ol>
<p>I would go so far as to say that this makes this <code>addCallback</code>
<strong>simpler</strong>; it’s easier and safer to use, and can even be implemented
safely. By contrast, function parameter bivariance requires you, the
user of the function, to think through in your head whether it’s
<em>really OK</em>, without type-checker assistance, and even then the
library function can’t offer any help to callers if they declare the
lambda argument type wrong.</p>
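<p>For readers who think more readily in TypeScript, the shape of the Haskell signature can be approximated there too. What follows is a rough analogue under assumptions, not a GADT and not anything the TypeScript documentation proposes: the flag becomes a generic value <code>EventFlag&lt;E&gt;</code> that carries its event type, and <code>MouseEv</code>, <code>KeyEv</code>, <code>add</code>, and <code>fire</code> are hypothetical names invented for this sketch.</p>

```typescript
// Hypothetical event types, named to avoid clashing with the DOM's own.
interface MouseEv { x: number; y: number }
interface KeyEv { key: string }

// The flag carries the event type as a type parameter, much as
// `EventFlag e` does in the Haskell version.
class EventFlag<E> {
  private callbacks: Array<(e: E) => void> = [];
  add(callback: (e: E) => void): void {
    this.callbacks.push(callback);
  }
  fire(event: E): void {
    for (const callback of this.callbacks) {
      callback(event);
    }
  }
}

const MousePress = new EventFlag<MouseEv>();
const KeyPress = new EventFlag<KeyEv>();

// E is fixed by the flag argument, so the callback's parameter type
// follows from it without an annotation at the call site.
function addCallback<E>(flag: EventFlag<E>, callback: (e: E) => void): void {
  flag.add(callback);
}

// Usage: `e` is inferred as MouseEv here.
const clicks: number[] = [];
addCallback(MousePress, e => { clicks.push(e.x + e.y); });
MousePress.fire({ x: 1, y: 2 });
```

<p>As in the Haskell version, the relationship between flag and event type is stated once, in the type of the flag, rather than re-asserted by each caller.</p>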
<h2 id="markdown-header-whats-simpler">What’s simpler?</h2>
<p>A type system without powerful features for polymorphism makes it
difficult or impossible to describe many programs and libraries in
fully sound ways. A more powerful type system simplifies the task of
the programmer—its features give you a richer language with which to
describe generic APIs.</p>
<p>When the core of a type system doesn’t give you a way to type an
interface, you might follow the pragmatist’s advice, and poke a hole
in the type system. After that, you won’t be able to generally trust
the conclusions of the type checker throughout the program, anymore.</p>
<p>Instead, you might look at the leading edge of type system research,
for a sound way with which to express the API. This is not so
expedient, but yields APIs that are safer and more convenient to use
and implement.</p>
<p>With an unsound feature, the pragmatist can offer you the world, but
cannot offer the confidence that your programs don’t “go wrong”.
Powerful type system features might bend your mind, but promise to
preserve that confidence which makes type systems, type systems.</p>
<p><em>This article was tested with TypeScript 2.4 and GHC 8.0.2.</em></p>
<h1>Why I didn't sign the Scala CLA</h1>
<p>Stephen Compall, 2017-04-15</p>
<p><em>I wrote this shortly after
<a href="https://github.com/scala/scala/pull/4666#issuecomment-128675239">I opted not to sign the Scala CLA</a> in 2015. Since
Scala still requires a CLA in its contribution process, and even
contributing to Typelevel Scala
<a href="https://github.com/typelevel/scala#relationship-with-lightbend-scala">effectively requires assent to the same unjust mechanism</a>,
I have decided to publish it at last.</em></p>
<p>One of the most important advantages of Free, Open Source Software
(FOSS) is that it returns power to the community of users. With
proprietary software, power is always concentrated in the hands of the
maintainer, i.e. the copyright holder.</p>
<p>The [more] equal status of maintainer and user in FOSS creates a
natural check. It keeps honest, well-intentioned maintainers honest,
and permits the community to reform around new maintainership should a
formerly good situation change. And circumstances can always change.</p>
<p>This equal status does not fall out of the sky; it is mediated by a
legal constitution: the license(s) of the software and documentation
developed by the project. When users accept the license terms—by
redistributing the code or changes thereto—they agree to this
constitution. When maintainers accept contributions under that
license, as in an ordinary CLA-less project,
under
<a href="https://opensource.com/law/11/7/trouble-harmony-part-1">inbound=outbound</a>,
they agree to the very same constitution as the users.</p>
<p>A project with a CLA or ©AA is different. There is one legal
constitution for the users, and one for the maintainers. This
arrangement always privileges the maintainers by</p>
<ol>
<li>removing privileges from the users and reserving them for the
maintainers, and</li>
<li>removing risk from the maintainers and reserving it for the
users.</li>
</ol>
<p>Despite fine words
in <a href="http://www.lightbend.com/contribute/cla/scala">the Scala CLA</a>
about “being for your protection as well as ours” (to paraphrase), the
terms that follow are, with few exceptions, utterly and
unapologetically nonreciprocal.</p>
<p>I believe this situation is acceptable in some cases; the only such
agreements I have signed without regret are with the FSF. But no CLA
or ©AA I have ever seen makes the strong reciprocal promises that the
FSF does, and it is anyway unreasonable to expect any contributor to
so carefully evaluate the likely future behavior of each organization
maintaining some software they might like to contribute to. For
myself, I decided that, given my past regrets, and the degree to which
EPFL’s agreement transfers power to its own hands and risk back to the
contributors’, there was no way I would come to trust EPFL
sufficiently to sign.</p>
<p>This is not to say that EPFL would be an ill-behaved caretaker! But by
what means could I make that determination? Moreover, why is it even
necessary?</p>
<p>The closest thing to an acceptable rationale for the Scala CLA is that
it addresses legal concerns left unmentioned by the license,
e.g. patent grants. These are important concerns, too frequently
unaddressed by projects using minimalist licenses such as
Scala <a href="http://www.scala-lang.org/license.html">uses</a>. But the
appropriate place to address these concerns is the basic
legal constitution for all: the license. If these guarantees are so
important that EPFL must have them, then why should we, as
contributors, not ask them of EPFL, via inbound=outbound? If these
terms would make the license “too complex”, no longer minimal, what
about their placement in a CLA will make them any better understood?</p>
<p>It’s my hope that Scala will abandon the CLA, and switch to a
lightweight option that holds true to the principles of FOSS
projects. A couple of options are</p>
<ol>
<li>A
formal
<a href="https://spreadsheets.google.com/spreadsheet/viewform?hl=en_US&formkey=dFFjXzBzM1VwekFlOWFWMjFFRjJMRFE6MQ#gid=0">license-assent-only</a> mechanism,
like Selenium’s.</li>
<li>A
<a href="https://www.kernel.org/doc/html/latest/process/submitting-patches.html#sign-your-work-the-developer-s-certificate-of-origin">Developer Certificate of Origin</a>,
like the Linux kernel.</li>
</ol>
<p>This may or may not be coupled with the switch to a longer license
that incorporates stronger patent protections,
like
<a href="https://opensource.org/licenses/Apache-2.0">Apache License 2.0</a>. This
should alleviate the concerns that are currently addressed by the CLA,
but in a way that is equitable to the Scala project, all of its
contributors, and all of its users.</p>
<h1>...and the glorious subst to come</h1>
<p>Stephen Compall, 2017-04-09 (updated 2018-02-10)</p>
<p>If you’re interested in design with zero-cost type tagging, or some
cases of <code>AnyVal</code> I didn’t cover in the first article, or you’re
looking for something else I missed, check here. There’s a lot more I
didn’t have room
for
<a href="/2017/04/the-high-cost-of-anyval-subclasses.html">in the first article</a>. Consider
this “bonus content”.</p>
<h2 id="markdown-header-unidirectional-subst">Unidirectional subst</h2>
<p>We saw earlier that though <code>subst</code> <em>appears</em> to substitute in only one
direction, that direction can easily be reversed. This is due to the
symmetry of type equality—if <code>A = B</code>, then surely also <code>B = A</code>.</p>
<p>Suppose that <code>apply</code> implemented some per-<code>String</code> validation
logic. In that case, you wouldn’t want users of the <code>Label</code> API to be
able to circumvent this validation, wholesale; this is easy to do with
the <code>subst</code> I have shown, and we saw it already when we tagged a whole
list and function, both designed only for plain <code>String</code>s!</p>
<p>We can get an idea of how to fix this by comparing <code>Leibniz</code> and
<code>Liskov</code>. Looking at the signature of <code>Liskov.subst</code>, you decide to
introduce <code>widen</code>, replacing <code>subst</code>.</p>
<div class="codehilite language-scala"><pre><span></span><span class="c1">// in LabelImpl</span>
<span class="k">def</span> <span class="n">widen</span><span class="o">[</span><span class="kt">F</span><span class="o">[</span><span class="kt">+</span><span class="k">_</span><span class="o">]](</span><span class="n">ft</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">T</span><span class="o">])</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">String</span><span class="o">]</span>
<span class="c1">// in val Label</span>
<span class="k">override</span> <span class="k">def</span> <span class="n">widen</span><span class="o">[</span><span class="kt">F</span><span class="o">[</span><span class="kt">+</span><span class="k">_</span><span class="o">]](</span><span class="n">ft</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">T</span><span class="o">])</span> <span class="k">=</span> <span class="n">ft</span>
</pre></div>
<p>With this design, you can untag a tagged list.</p>
<div class="codehilite language-scala"><pre><span></span><span class="n">scala</span><span class="o">></span> <span class="nc">Label</span><span class="o">.</span><span class="n">widen</span><span class="o">(</span><span class="n">taggedList</span><span class="o">)</span>
<span class="n">res0</span><span class="k">:</span> <span class="kt">List</span><span class="o">[</span><span class="kt">String</span><span class="o">]</span> <span class="k">=</span> <span class="nc">List</span><span class="o">(</span><span class="n">hello</span><span class="o">,</span> <span class="n">world</span><span class="o">)</span>
</pre></div>
<p>You can <em>tag</em> a function that takes an untagged list as parameter.</p>
<div class="codehilite language-scala"><pre><span></span><span class="n">scala</span><span class="o">></span> <span class="k">def</span> <span class="n">report</span><span class="o">(</span><span class="n">xs</span><span class="k">:</span> <span class="kt">List</span><span class="o">[</span><span class="kt">String</span><span class="o">])</span><span class="k">:</span> <span class="kt">Unit</span> <span class="o">=</span> <span class="o">()</span>
<span class="n">report</span><span class="k">:</span> <span class="o">(</span><span class="kt">xs:</span> <span class="kt">List</span><span class="o">[</span><span class="kt">String</span><span class="o">])</span><span class="nc">Unit</span>
<span class="n">scala</span><span class="o">></span> <span class="k">def</span> <span class="n">cwiden</span><span class="o">[</span><span class="kt">F</span><span class="o">[</span><span class="kt">-</span><span class="k">_</span><span class="o">]](</span><span class="n">fs</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">String</span><span class="o">])</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">Label</span><span class="o">]</span> <span class="k">=</span>
<span class="nc">Label</span><span class="o">.</span><span class="n">widen</span><span class="o">[</span><span class="kt">Lambda</span><span class="o">[</span><span class="kt">`+x`</span> <span class="k">=></span> <span class="kt">F</span><span class="o">[</span><span class="kt">x</span><span class="o">]</span> <span class="k">=></span> <span class="kt">F</span><span class="o">[</span><span class="kt">Label</span><span class="o">]]](</span><span class="n">identity</span><span class="o">)(</span><span class="n">fs</span><span class="o">)</span>
<span class="n">cwiden</span><span class="k">:</span> <span class="err">[</span><span class="kt">F</span><span class="o">[</span><span class="kt">-</span><span class="k">_</span><span class="o">]</span><span class="err">]</span><span class="o">(</span><span class="n">fs</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">String</span><span class="o">])</span><span class="n">F</span><span class="o">[</span><span class="kt">Label</span><span class="o">]</span>
<span class="n">scala</span><span class="o">></span> <span class="n">cwiden</span><span class="o">[</span><span class="kt">Lambda</span><span class="o">[</span><span class="kt">`-x`</span> <span class="k">=></span> <span class="kt">List</span><span class="o">[</span><span class="kt">x</span><span class="o">]</span> <span class="k">=></span> <span class="kt">Unit</span><span class="o">]](</span><span class="n">report</span><span class="o">)</span>
<span class="n">res1</span><span class="k">:</span> <span class="kt">List</span><span class="o">[</span><span class="kt">Label</span><span class="o">]</span> <span class="k">=></span> <span class="nc">Unit</span> <span class="k">=</span> <span class="nc">$$Lambda$3263</span><span class="o">/</span><span class="mi">1163097357</span><span class="k">@</span><span class="mi">7</span><span class="n">e4f65b7</span>
</pre></div>
<p>However, logically, this kind of “tagging” is just a delayed
“untagging” of the <code>T</code>s involved, so your validation rules are
preserved.</p>
<p>What’s happening? With <code>subst</code>, we selectively revealed a type
equality. <code>widen</code> is deliberately less revealing; it selectively
reveals a <em>subtyping relationship</em>, namely, <code>T <: String</code>.</p>
<div class="codehilite language-scala"><pre><span></span><span class="n">scala</span><span class="o">></span> <span class="k">import</span> <span class="nn">scalaz.Liskov</span><span class="o">,</span> <span class="nc">Liskov</span><span class="o">.<~<</span>
<span class="n">scala</span><span class="o">></span> <span class="nc">Label</span><span class="o">.</span><span class="n">widen</span><span class="o">[</span><span class="kt">Lambda</span><span class="o">[</span><span class="kt">`+x`</span> <span class="k">=></span> <span class="o">(</span><span class="kt">Label</span> <span class="kt"><~<</span> <span class="kt">x</span><span class="o">)]](</span><span class="nc">Liskov</span><span class="o">.</span><span class="n">refl</span><span class="o">)</span>
<span class="n">res2</span><span class="k">:</span> <span class="kt">scalaz.Liskov</span><span class="o">[</span><span class="kt">Label.T</span>,<span class="kt">String</span><span class="o">]</span> <span class="k">=</span> <span class="n">scalaz</span><span class="o">.</span><span class="nc">Liskov$$anon$3</span><span class="k">@</span><span class="mi">58</span><span class="n">e8db18</span>
</pre></div>
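<p>To see the same idea without Scalaz or kind-projector, here is a minimal sketch. It assumes <code>LabelImpl</code> has the shape used throughout this article. The wrapper object <code>WidenSketch</code> and the alias <code>FromLabel</code> are names invented for this illustration, and the standard library's subtype evidence stands in for <code>Liskov</code>.</p>

```scala
object WidenSketch {
  sealed abstract class LabelImpl {
    type T
    def apply(s: String): T
    // Only covariant F are accepted, so callers can go from T to String,
    // never from String to T.
    def widen[F[+_]](ft: F[T]): F[String]
  }

  val Label: LabelImpl = new LabelImpl {
    type T = String
    def apply(s: String): T = s
    def widen[F[+_]](ft: F[T]): F[String] = ft
  }
  type Label = Label.T

  // Hypothetical alias replacing Lambda[`+x` => (Label <:< x)].
  type FromLabel[+X] = Label <:< X

  // Start from the reflexive evidence Label <:< Label and widen its right side.
  val evidence: Label <:< String =
    Label.widen[FromLabel](implicitly[Label <:< Label])

  // The evidence is itself a function from Label to String.
  val untagged: String = evidence(Label("hi"))

  // Any covariant container widens the same way.
  val widenedList: List[String] =
    Label.widen[List](List(Label("a"), Label("b")))
}
```

<p>Nothing here lets a caller turn a <code>String</code> into a <code>Label</code>; the variance annotation on <code>F</code> is what keeps the revelation one-directional.</p>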
<h2 id="markdown-header-cheap-tagging-with-validation">Cheap tagging with validation</h2>
<p>You can think of <code>+</code> or <code>-</code> in the signatures of <code>widen</code> and <code>cwiden</code>
above as a kind of constraint on the <code>F</code> that those functions take; by
contrast, <code>subst</code> took any <code>F</code> without bounds on its argument.</p>
<p>There are other interesting choices of constraint, like <code>Foldable</code>.</p>
<div class="codehilite language-scala"><pre><span></span><span class="k">import</span> <span class="nn">scalaz.</span><span class="o">{</span><span class="nc">Failure</span><span class="o">,</span> <span class="nc">Foldable</span><span class="o">,</span> <span class="nc">Success</span><span class="o">,</span> <span class="nc">ValidationNel</span><span class="o">}</span>
<span class="k">import</span> <span class="nn">scalaz.syntax.std.option._</span>
<span class="k">import</span> <span class="nn">scalaz.syntax.foldable._</span>
<span class="c1">// in LabelImpl, alongside def widen:</span>
<span class="k">def</span> <span class="n">narrow</span><span class="o">[</span><span class="kt">F</span><span class="o">[</span><span class="k">_</span><span class="o">]</span><span class="kt">:</span> <span class="kt">Foldable</span><span class="o">](</span><span class="n">fs</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">String</span><span class="o">])</span>
<span class="k">:</span> <span class="kt">ValidationNel</span><span class="o">[</span><span class="kt">Err</span>, <span class="kt">F</span><span class="o">[</span><span class="kt">T</span><span class="o">]]</span>
<span class="c1">// in val Label</span>
<span class="k">override</span> <span class="k">def</span> <span class="n">narrow</span><span class="o">[</span><span class="kt">F</span><span class="o">[</span><span class="k">_</span><span class="o">]</span><span class="kt">:</span> <span class="kt">Foldable</span><span class="o">](</span><span class="n">fs</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">String</span><span class="o">])</span> <span class="k">=</span>
<span class="n">fs</span><span class="o">.</span><span class="n">foldMap</span><span class="o">{</span><span class="n">string</span> <span class="k">=></span>
<span class="c1">// return errors if not OK, INil() if OK</span>
<span class="o">}.</span><span class="n">toNel</span> <span class="n">cata</span> <span class="o">(</span><span class="nc">Failure</span><span class="o">(</span><span class="k">_</span><span class="o">),</span> <span class="nc">Success</span><span class="o">(</span><span class="n">fs</span><span class="o">))</span>
</pre></div>
<p>This is interesting because, whatever you pass in, if you get back a
<code>Success</code>, the succeeding value is just the argument you passed in; no
reallocation is necessary. (To reallocate, we would need <code>Traverse</code>
instead of <code>Foldable</code>.)</p>
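<p>The no-reallocation property can be checked with a dependency-free sketch. Everything in it is an assumption made for illustration: <code>Fold</code> is a hypothetical stand-in for <code>Foldable</code>, <code>Either</code> replaces <code>ValidationNel</code>, and the validation rule (labels must be non-empty) is invented, since the rule itself is left open above.</p>

```scala
object NarrowSketch {
  // Hypothetical stand-in for scalaz.Foldable: just enough to visit every element.
  trait Fold[F[_]] {
    def toList[A](fa: F[A]): List[A]
  }
  implicit val listFold: Fold[List] = new Fold[List] {
    def toList[A](fa: List[A]): List[A] = fa
  }

  sealed abstract class LabelImpl {
    type T
    // Left carries every validation error; Right carries the tagged structure.
    def narrow[F[_]](fs: F[String])(implicit F: Fold[F]): Either[List[String], F[T]]
  }

  val Label: LabelImpl = new LabelImpl {
    type T = String
    def narrow[F[_]](fs: F[String])(implicit F: Fold[F]): Either[List[String], F[T]] = {
      // Assumed rule for this sketch: a label must not be empty.
      val errors = F.toList(fs).collect { case s if s.isEmpty => "empty label" }
      if (errors.isEmpty) Right(fs) else Left(errors)
    }
  }
  type Label = Label.T

  val input: List[String] = List("hello", "world")

  // On success the tagged list is the very same object as the input list.
  val sameInstance: Boolean = Label.narrow(input) match {
    case Right(tagged) => tagged eq input
    case Left(_)       => false
  }

  // One empty string is enough to reject the whole structure.
  val rejected: Either[List[String], List[Label]] = Label.narrow(List("ok", ""))
}
```

<p>The fold only inspects the elements; the structure handed back on success is the caller's own.</p>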
<h2 id="markdown-header-unidirectional-without-subtyping">Unidirectional without subtyping</h2>
<p>If you prefer to avoid subtyping, you can also constrain <code>subst</code>
variants with typeclasses indicating directionality. For Scalaz or
Cats, providing both of these would be a sufficient substitute for the
<code>widen[F[+_]]</code> introduced above.</p>
<div class="codehilite language-scala"><pre><span></span> <span class="k">def</span> <span class="n">widen</span><span class="o">[</span><span class="kt">F</span><span class="o">[</span><span class="k">_</span><span class="o">]</span><span class="kt">:</span> <span class="kt">Functor</span><span class="o">](</span><span class="n">ft</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">T</span><span class="o">])</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">String</span><span class="o">]</span>
<span class="k">def</span> <span class="n">cwiden</span><span class="o">[</span><span class="kt">F</span><span class="o">[</span><span class="k">_</span><span class="o">]</span><span class="kt">:</span> <span class="kt">Contravariant</span><span class="o">](</span><span class="n">fs</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">String</span><span class="o">])</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">T</span><span class="o">]</span>
</pre></div>
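<p>A self-contained sketch of how these two signatures might be used follows. The <code>Functor</code> and <code>Contravariant</code> traits are minimal hypothetical versions rather than the Scalaz or Cats ones, and <code>Measure</code> is an alias invented for this example. In this sketch the implementations simply return their argument, as <code>subst</code> does; the typeclass evidence only restricts which <code>F</code> a caller may choose.</p>

```scala
object DirectionalSketch {
  // Minimal hypothetical typeclasses; Scalaz and Cats provide richer versions.
  trait Functor[F[_]] {
    def map[A, B](fa: F[A])(f: A => B): F[B]
  }
  trait Contravariant[F[_]] {
    def contramap[A, B](fa: F[A])(f: B => A): F[B]
  }

  implicit val listFunctor: Functor[List] = new Functor[List] {
    def map[A, B](fa: List[A])(f: A => B): List[B] = fa.map(f)
  }

  // A consumer of A; the alias name is made up for this example.
  type Measure[A] = A => Int
  implicit val measureContravariant: Contravariant[Measure] =
    new Contravariant[Measure] {
      def contramap[A, B](fa: Measure[A])(f: B => A): Measure[B] = b => fa(f(b))
    }

  sealed abstract class LabelImpl {
    type T
    def apply(s: String): T
    def widen[F[_]](ft: F[T])(implicit F: Functor[F]): F[String]
    def cwiden[F[_]](fs: F[String])(implicit F: Contravariant[F]): F[T]
  }

  val Label: LabelImpl = new LabelImpl {
    type T = String
    def apply(s: String): T = s
    // The instances are never called; requiring them only limits the choice of F.
    def widen[F[_]](ft: F[T])(implicit F: Functor[F]): F[String] = ft
    def cwiden[F[_]](fs: F[String])(implicit F: Contravariant[F]): F[T] = fs
  }
  type Label = Label.T

  // Producers of labels become producers of strings.
  val strings: List[String] = Label.widen[List](List(Label("a"), Label("b")))

  // Consumers of strings become consumers of labels.
  val length: Measure[String] = _.length
  val labelLength: Measure[Label] = Label.cwiden[Measure](length)
  val measured: Int = labelLength(Label("abc"))
}
```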
<h2 id="markdown-header-t-string-translucency"><code>T = String</code> translucency</h2>
<p><code>subst</code> and <code>widen</code> are very powerful, but maybe you’re bothered by the fact
that <code>T</code> erases to <code>Object</code>, and you would rather “untagging” happen
automatically.</p>
<p>Thus far, you’ve been selectively revealing aspects of the type
relationship between <code>T</code> and <code>String</code>. What if you were to <em>globally</em>
reveal part of it?</p>
<p>To be clear, we must not globally reveal <code>T = String</code>; then there
would be no usable distinction. But you can reveal weaker properties.</p>
<div class="codehilite language-scala"><pre><span></span><span class="c1">// in LabelImpl</span>
<span class="k">type</span> <span class="kt">T</span> <span class="k"><:</span> <span class="kt">String</span>
</pre></div>
<p>Now, widening happens automatically.</p>
<div class="codehilite language-scala"><pre><span></span><span class="n">scala</span><span class="o">></span> <span class="n">taggedList</span><span class="k">:</span> <span class="kt">List</span><span class="o">[</span><span class="kt">String</span><span class="o">]</span>
<span class="n">res0</span><span class="k">:</span> <span class="kt">List</span><span class="o">[</span><span class="kt">String</span><span class="o">]</span> <span class="k">=</span> <span class="nc">List</span><span class="o">(</span><span class="n">hello</span><span class="o">,</span> <span class="n">world</span><span class="o">)</span>
<span class="n">scala</span><span class="o">></span> <span class="n">report</span><span class="k">:</span> <span class="o">(</span><span class="kt">List</span><span class="o">[</span><span class="kt">Label</span><span class="o">]</span> <span class="o">=></span> <span class="nc">Unit</span><span class="o">)</span>
<span class="n">res1</span><span class="k">:</span> <span class="kt">List</span><span class="o">[</span><span class="kt">Label</span><span class="o">]</span> <span class="k">=></span> <span class="nc">Unit</span> <span class="k">=</span> <span class="nc">$$Lambda$3348</span><span class="o">/</span><span class="mi">1710049434</span><span class="k">@</span><span class="mi">4320749</span><span class="n">b</span>
</pre></div>
<p>Narrowing is still forbidden; <code>T</code> and <code>String</code> are still separate.</p>
<div class="codehilite language-scala"><pre><span></span><span class="n">scala</span><span class="o">></span> <span class="o">(</span><span class="n">taggedList</span><span class="k">:</span> <span class="kt">List</span><span class="o">[</span><span class="kt">String</span><span class="o">])</span><span class="k">:</span> <span class="kt">List</span><span class="o">[</span><span class="kt">Label</span><span class="o">]</span>
<span class="o"><</span><span class="n">console</span><span class="k">>:</span><span class="mi">23</span><span class="k">:</span> <span class="kt">error:</span> <span class="k">type</span> <span class="kt">mismatch</span><span class="o">;</span>
<span class="n">found</span> <span class="k">:</span> <span class="kt">List</span><span class="o">[</span><span class="kt">String</span><span class="o">]</span>
<span class="n">required</span><span class="k">:</span> <span class="kt">List</span><span class="o">[</span><span class="kt">hcavsc.translucent.Labels.Label</span><span class="o">]</span>
<span class="o">(</span><span class="n">which</span> <span class="n">expands</span> <span class="n">to</span><span class="o">)</span> <span class="nc">List</span><span class="o">[</span><span class="kt">hcavsc.translucent.Labels.Label.T</span><span class="o">]</span>
<span class="o">(</span><span class="n">taggedList</span><span class="k">:</span> <span class="kt">List</span><span class="o">[</span><span class="kt">String</span><span class="o">])</span><span class="k">:</span> <span class="kt">List</span><span class="o">[</span><span class="kt">Label</span><span class="o">]</span>
<span class="o">^</span>
</pre></div>
<p>Moreover, <code>T</code> now erases to <code>String</code>, just as the <code>AnyVal</code> subclass did.</p>
<div class="codehilite language-java"><pre><span></span><span class="c1">// javap -c -cp target/scala-2.12/classes hcavsc.translucent.MyFirstTests</span>
<span class="kd">public</span> <span class="n">java</span><span class="o">.</span><span class="na">lang</span><span class="o">.</span><span class="na">String</span> <span class="nf">combineLabels</span><span class="o">(</span><span class="n">java</span><span class="o">.</span><span class="na">lang</span><span class="o">.</span><span class="na">String</span><span class="o">,</span> <span class="n">java</span><span class="o">.</span><span class="na">lang</span><span class="o">.</span><span class="na">String</span><span class="o">);</span>
</pre></div>
<p>However, this makes it very difficult for typeclass resolution to
reliably distinguish <code>String</code> and <code>T</code>. It’s also easy to accidentally
untag. That’s why we took this out of Scalaz’s <code>Tag</code>s; discriminating
typeclass instances is a very useful feature of tags. If these aren’t
concerns for you, globally revealed tag subtyping may be the most
convenient option.</p>
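<p>To make the "easy to accidentally untag" point concrete, here is a small sketch of the translucent variant. The names <code>TranslucentSketch</code> and <code>describe</code> are hypothetical; only the bound on <code>T</code> comes from the definition above.</p>

```scala
object TranslucentSketch {
  sealed abstract class LabelImpl {
    type T <: String // the globally revealed bound
    def apply(s: String): T
  }

  val Label: LabelImpl = new LabelImpl {
    type T = String
    def apply(s: String): T = s
  }
  type Label = Label.T

  val tagged: List[Label] = List(Label("hello"), Label("world"))

  // Widening needs no helper: List is covariant and Label <: String.
  val widened: List[String] = tagged

  // Hypothetical function that was only ever meant for plain strings.
  def describe(s: String): String = "plain string of length " + s.length

  // Compiles without complaint, so the tag is lost silently.
  val accidental: String = describe(Label("hi"))

  // String methods also hand back untagged results.
  val shouted: String = Label("hi").toUpperCase
}
```

<p>None of these lines needs an explicit untagging call, which is convenient when intended and a hazard when not.</p>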
<h2 id="markdown-header-boxing-ints">Boxing <code>Int</code>s</h2>
<p><code>AnyVal</code> might seem to have better, more justifiable boxing behavior
in the case of primitive types like <code>Int</code>. When putting an <code>AnyVal</code>
wrapper around <code>Int</code>, the custom box replaces the plain <code>Integer</code> box,
rather than adding another layer.</p>
<div class="codehilite language-scala"><pre><span></span><span class="k">final</span> <span class="k">class</span> <span class="nc">MagicInt</span><span class="o">(</span><span class="k">val</span> <span class="n">x</span><span class="k">:</span> <span class="kt">Int</span><span class="o">)</span> <span class="k">extends</span> <span class="nc">AnyVal</span>
<span class="k">val</span> <span class="n">x</span> <span class="k">=</span> <span class="mi">42</span>
<span class="k">val</span> <span class="n">y</span> <span class="k">=</span> <span class="mi">84</span>
<span class="c1">// javap -c -cp target/scala-2.12/classes hcavsc.intsav.BytecodeTests</span>
<span class="nc">List</span><span class="o">(</span><span class="n">x</span><span class="o">,</span> <span class="n">y</span><span class="o">)</span>
<span class="c1">// skipping some setup bytecode</span>
<span class="mi">13</span><span class="k">:</span> <span class="kt">newarray</span> <span class="kt">int</span>
<span class="mi">15</span><span class="k">:</span> <span class="kt">dup</span>
<span class="mi">16</span><span class="k">:</span> <span class="kt">iconst_0</span>
<span class="mi">17</span><span class="k">:</span> <span class="kt">iload_1</span>
<span class="mi">18</span><span class="k">:</span> <span class="kt">iastore</span>
<span class="mi">19</span><span class="k">:</span> <span class="kt">dup</span>
<span class="mi">20</span><span class="k">:</span> <span class="kt">iconst_1</span>
<span class="mi">21</span><span class="k">:</span> <span class="kt">iload_2</span>
<span class="mi">22</span><span class="k">:</span> <span class="kt">iastore</span>
<span class="mi">23</span><span class="k">:</span> <span class="kt">invokevirtual</span> <span class="k">#</span><span class="err">25</span> <span class="c1">// Method scala/Predef$.wrapIntArray:([I)Lscala/collection/mutable/WrappedArray;</span>
<span class="mi">26</span><span class="k">:</span> <span class="kt">invokevirtual</span> <span class="k">#</span><span class="err">29</span> <span class="c1">// Method scala/collection/immutable/List$.apply:(Lscala/collection/Seq;)Lscala/collection/immutable/List;</span>
<span class="nc">List</span><span class="o">(</span><span class="k">new</span> <span class="nc">MagicInt</span><span class="o">(</span><span class="n">x</span><span class="o">),</span> <span class="k">new</span> <span class="nc">MagicInt</span><span class="o">(</span><span class="n">y</span><span class="o">))</span>
<span class="c1">// skipping more setup</span>
<span class="mi">37</span><span class="k">:</span> <span class="kt">anewarray</span> <span class="k">#</span><span class="err">31</span> <span class="c1">// class hcavsc/intsav/MagicInt</span>
<span class="mi">40</span><span class="k">:</span> <span class="kt">dup</span>
<span class="mi">41</span><span class="k">:</span> <span class="kt">iconst_0</span>
<span class="mi">42</span><span class="k">:</span> <span class="kt">new</span> <span class="k">#</span><span class="err">31</span> <span class="c1">// class hcavsc/intsav/MagicInt</span>
<span class="mi">45</span><span class="k">:</span> <span class="kt">dup</span>
<span class="mi">46</span><span class="k">:</span> <span class="kt">iload_1</span>
<span class="mi">47</span><span class="k">:</span> <span class="kt">invokespecial</span> <span class="k">#</span><span class="err">35</span> <span class="c1">// Method hcavsc/intsav/MagicInt."<init>":(I)V</span>
<span class="mi">50</span><span class="k">:</span> <span class="kt">aastore</span>
<span class="mi">51</span><span class="k">:</span> <span class="kt">dup</span>
<span class="mi">52</span><span class="k">:</span> <span class="kt">iconst_1</span>
<span class="mi">53</span><span class="k">:</span> <span class="kt">new</span> <span class="k">#</span><span class="err">31</span> <span class="c1">// class hcavsc/intsav/MagicInt</span>
<span class="mi">56</span><span class="k">:</span> <span class="kt">dup</span>
<span class="mi">57</span><span class="k">:</span> <span class="kt">iload_2</span>
<span class="mi">58</span><span class="k">:</span> <span class="kt">invokespecial</span> <span class="k">#</span><span class="err">35</span> <span class="c1">// Method hcavsc/intsav/MagicInt."<init>":(I)V</span>
<span class="mi">61</span><span class="k">:</span> <span class="kt">aastore</span>
<span class="mi">62</span><span class="k">:</span> <span class="kt">invokevirtual</span> <span class="k">#</span><span class="err">39</span> <span class="c1">// Method scala/Predef$.genericWrapArray:(Ljava/lang/Object;)Lscala/collection/mutable/WrappedArray;</span>
<span class="mi">65</span><span class="k">:</span> <span class="kt">invokevirtual</span> <span class="k">#</span><span class="err">29</span> <span class="c1">// Method scala/collection/immutable/List$.apply:(Lscala/collection/Seq;)Lscala/collection/immutable/List;</span>
</pre></div>
<p>By contrast, the opaque <code>T</code> boxes to <code>Integer</code> when we call <code>apply(i: Int):
T</code>. It then remains in that box until we deliberately get the <code>Int</code>
back.</p>
<div class="codehilite language-scala"><pre><span></span><span class="c1">// MagicInt is defined like Label,</span>
<span class="c1">// but over Int instead of String</span>
<span class="k">val</span> <span class="n">x</span> <span class="k">=</span> <span class="nc">MagicInt</span><span class="o">(</span><span class="mi">42</span><span class="o">)</span>
<span class="c1">// javap -c -cp target/scala-2.12/classes hcavsc.ints.OtherTests</span>
<span class="mi">0</span><span class="k">:</span> <span class="kt">getstatic</span> <span class="k">#</span><span class="err">21</span> <span class="c1">// Field hcavsc/ints/MagicInts$.MODULE$:Lhcavsc/ints/MagicInts$;</span>
<span class="mi">3</span><span class="k">:</span> <span class="kt">invokevirtual</span> <span class="k">#</span><span class="err">25</span> <span class="c1">// Method hcavsc/ints/MagicInts$.MagicInt:()Lhcavsc/ints/MagicInts$MagicIntImpl;</span>
<span class="mi">6</span><span class="k">:</span> <span class="kt">bipush</span> <span class="err">42</span>
<span class="err">8</span><span class="kt">:</span> <span class="kt">invokevirtual</span> <span class="k">#</span><span class="err">29</span> <span class="c1">// Method hcavsc/ints/MagicInts$MagicIntImpl.apply:(I)Ljava/lang/Object;</span>
<span class="c1">// javap -c -cp target/scala-2.12/classes 'hcavsc.ints.MagicInts$$anon$1'</span>
<span class="n">public</span> <span class="n">java</span><span class="o">.</span><span class="n">lang</span><span class="o">.</span><span class="nc">Object</span> <span class="n">apply</span><span class="o">(</span><span class="n">int</span><span class="o">);</span>
<span class="nc">Code</span><span class="k">:</span>
<span class="err">0</span><span class="kt">:</span> <span class="kt">aload_0</span>
<span class="mi">1</span><span class="k">:</span> <span class="kt">iload_1</span>
<span class="mi">2</span><span class="k">:</span> <span class="kt">invokevirtual</span> <span class="k">#</span><span class="err">23</span> <span class="c1">// Method apply:(I)I</span>
<span class="mi">5</span><span class="k">:</span> <span class="kt">invokestatic</span> <span class="k">#</span><span class="err">29</span> <span class="c1">// Method scala/runtime/BoxesRunTime.boxToInteger:(I)Ljava/lang/Integer;</span>
<span class="mi">8</span><span class="k">:</span> <span class="kt">areturn</span>
<span class="nc">List</span><span class="o">(</span><span class="n">x</span><span class="o">,</span> <span class="n">x</span><span class="o">)</span>
<span class="c1">// skipping setup as before</span>
<span class="mi">19</span><span class="k">:</span> <span class="kt">anewarray</span> <span class="k">#</span><span class="err">4</span> <span class="c1">// class java/lang/Object</span>
<span class="mi">22</span><span class="k">:</span> <span class="kt">dup</span>
<span class="mi">23</span><span class="k">:</span> <span class="kt">iconst_0</span>
<span class="mi">24</span><span class="k">:</span> <span class="kt">aload_1</span>
<span class="mi">25</span><span class="k">:</span> <span class="kt">aastore</span>
<span class="mi">26</span><span class="k">:</span> <span class="kt">dup</span>
<span class="mi">27</span><span class="k">:</span> <span class="kt">iconst_1</span>
<span class="mi">28</span><span class="k">:</span> <span class="kt">aload_1</span>
<span class="mi">29</span><span class="k">:</span> <span class="kt">aastore</span>
<span class="mi">30</span><span class="k">:</span> <span class="kt">invokevirtual</span> <span class="k">#</span><span class="err">43</span> <span class="c1">// Method scala/Predef$.genericWrapArray:(Ljava/lang/Object;)Lscala/collection/mutable/WrappedArray;</span>
<span class="mi">33</span><span class="k">:</span> <span class="kt">invokevirtual</span> <span class="k">#</span><span class="err">46</span> <span class="c1">// Method scala/collection/immutable/List$.apply:(Lscala/collection/Seq;)Lscala/collection/immutable/List;</span>
</pre></div>
<p>While the boxing in the above example happened in <code>MagicInt.apply</code>,
there’s nothing special about that function’s boxing; the standard
<code>Int</code> boxing serves just as well.</p>
<div class="codehilite language-scala"><pre><span></span><span class="c1">// javap -c -cp target/scala-2.12/classes hcavsc.ints.OtherTests</span>
<span class="k">val</span> <span class="n">xs</span> <span class="k">=</span> <span class="nc">List</span><span class="o">(</span><span class="mi">42</span><span class="o">)</span>
<span class="mi">44</span><span class="k">:</span> <span class="kt">newarray</span> <span class="kt">int</span>
<span class="mi">46</span><span class="k">:</span> <span class="kt">dup</span>
<span class="mi">47</span><span class="k">:</span> <span class="kt">iconst_0</span>
<span class="mi">48</span><span class="k">:</span> <span class="kt">bipush</span> <span class="err">42</span>
<span class="err">50</span><span class="kt">:</span> <span class="kt">iastore</span>
<span class="mi">51</span><span class="k">:</span> <span class="kt">invokevirtual</span> <span class="k">#</span><span class="err">50</span> <span class="c1">// Method scala/Predef$.wrapIntArray:([I)Lscala/collection/mutable/WrappedArray;</span>
<span class="mi">54</span><span class="k">:</span> <span class="kt">invokevirtual</span> <span class="k">#</span><span class="err">46</span> <span class="c1">// Method scala/collection/immutable/List$.apply:(Lscala/collection/Seq;)Lscala/collection/immutable/List;</span>
<span class="mi">57</span><span class="k">:</span> <span class="kt">astore_2</span>
<span class="k">val</span> <span class="n">mxs</span> <span class="k">=</span> <span class="nc">MagicInt</span><span class="o">.</span><span class="n">subst</span><span class="o">(</span><span class="n">xs</span><span class="o">)</span>
<span class="mi">58</span><span class="k">:</span> <span class="kt">getstatic</span> <span class="k">#</span><span class="err">21</span> <span class="c1">// Field hcavsc/ints/MagicInts$.MODULE$:Lhcavsc/ints/MagicInts$;</span>
<span class="mi">61</span><span class="k">:</span> <span class="kt">invokevirtual</span> <span class="k">#</span><span class="err">25</span> <span class="c1">// Method hcavsc/ints/MagicInts$.MagicInt:()Lhcavsc/ints/MagicInts$MagicIntImpl;</span>
<span class="mi">64</span><span class="k">:</span> <span class="kt">aload_2</span>
<span class="mi">65</span><span class="k">:</span> <span class="kt">invokevirtual</span> <span class="k">#</span><span class="err">54</span> <span class="c1">// Method hcavsc/ints/MagicInts$MagicIntImpl.subst:(Ljava/lang/Object;)Ljava/lang/Object;</span>
<span class="k">val</span> <span class="n">y</span><span class="k">:</span> <span class="kt">MagicInt</span> <span class="o">=</span> <span class="n">mxs</span><span class="o">.</span><span class="n">head</span>
<span class="mi">73</span><span class="k">:</span> <span class="kt">invokevirtual</span> <span class="k">#</span><span class="err">60</span> <span class="c1">// Method scala/collection/immutable/List.head:()Ljava/lang/Object;</span>
<span class="mi">76</span><span class="k">:</span> <span class="kt">astore</span> <span class="err">4</span>
</pre></div>
<p>This is nice for two reasons:</p>
<ol>
<li><code>subst</code> still doesn’t imply any additional boxing beyond what the
underlying primitive type implies.</li>
<li>Where the primitive boxing is optimized, you get to keep those
optimizations; <code>AnyVal</code> subclass boxing effectively turns off these
optimizations. For
example,
<a href="https://docs.oracle.com/javase/8/docs/api/java/lang/Integer.html#valueOf-int-"><code>Integer</code> boxing is optimized</a>,
but <code>MagicInt</code>’s <code>AnyVal</code> class is not.</li>
</ol>
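<p>The second point can be observed at runtime by comparing box identity. This is a sketch, assuming a JVM target and a value small enough to fall in the range that <code>Integer.valueOf</code> is documented to cache (-128 to 127); the object name <code>BoxingSketch</code> is made up for the example.</p>

```scala
object BoxingSketch {
  // Same shape as the AnyVal wrapper defined earlier in this section.
  final class MagicInt(val x: Int) extends AnyVal

  // Upcasting an Int to Any boxes it through Integer.valueOf, which caches
  // small values, so both boxes are the same object.
  val boxedA: Any = 42
  val boxedB: Any = 42
  val intBoxesShared: Boolean =
    boxedA.asInstanceOf[AnyRef] eq boxedB.asInstanceOf[AnyRef]

  // Upcasting the AnyVal wrapper allocates a fresh MagicInt box each time.
  val magicA: Any = new MagicInt(42)
  val magicB: Any = new MagicInt(42)
  val magicBoxesShared: Boolean =
    magicA.asInstanceOf[AnyRef] eq magicB.asInstanceOf[AnyRef]
}
```

<p>Here <code>intBoxesShared</code> is true and <code>magicBoxesShared</code> is false: the plain <code>Integer</code> box is reused, while every <code>MagicInt</code> box is a new allocation.</p>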
<p>The one remaining problem with the tag version of <code>MagicInt</code> is that
its erasure is still <code>Object</code>.</p>
<div class="codehilite language-scala"><pre><span></span><span class="k">def</span> <span class="n">myId</span><span class="o">(</span><span class="n">x</span><span class="k">:</span> <span class="kt">MagicInt</span><span class="o">)</span><span class="k">:</span> <span class="kt">MagicInt</span>
<span class="c1">// javap -c -cp target/scala-2.12/classes hcavsc.ints.OtherTests</span>
<span class="n">public</span> <span class="k">abstract</span> <span class="n">java</span><span class="o">.</span><span class="n">lang</span><span class="o">.</span><span class="nc">Object</span> <span class="n">myId</span><span class="o">(</span><span class="n">java</span><span class="o">.</span><span class="n">lang</span><span class="o">.</span><span class="nc">Object</span><span class="o">);</span>
</pre></div>
<p>However, if you
use
<a href="#markdown-header-t-string-translucency">the “translucent” variant</a>
where it is always known that <code>type T <: Int</code>, the erasure is the same
as <code>Int</code> itself.</p>
<div class="codehilite language-java"><pre><span></span><span class="c1">// javap -c -cp target/scala-2.12/classes hcavsc.translucentints.OtherTests</span>
<span class="kd">public</span> <span class="kd">abstract</span> <span class="kt">int</span> <span class="nf">myId</span><span class="o">(</span><span class="kt">int</span><span class="o">);</span>
</pre></div>
<p>(The boxing/unboxing of <code>MagicInt</code> changes to match.) Unfortunately,
there’s no way to tell Scala what the erasure ought to be without
exposing that extra type information, which may be quite inconvenient.</p>
<h2 id="markdown-header-would-you-box-a-javascript-string">Would you box a JavaScript string?</h2>
<p>Maybe if we weren’t working with types. Since we are working with
types, we don’t have to box our strings in JavaScript in order to keep
track of what sort of strings they are. But Scala might want to,
anyway.</p>
<div class="codehilite language-scala"><pre><span></span><span class="k">val</span> <span class="n">x</span> <span class="k">=</span> <span class="k">new</span> <span class="nc">Label</span><span class="o">(</span><span class="s">"hi"</span><span class="o">)</span>
<span class="n">js</span><span class="o">.</span><span class="nc">Array</span><span class="o">(</span><span class="n">x</span><span class="o">,</span> <span class="n">x</span><span class="o">)</span>
<span class="c1">// sbt fastOptJS output</span>
<span class="o">[</span><span class="kt">new</span> <span class="kt">$c_Lhcavsc_av_Label</span><span class="o">()</span><span class="kt">.init___T</span><span class="o">(</span><span class="err">"</span><span class="kt">hi</span><span class="err">"</span><span class="o">)</span>,
<span class="kt">new</span> <span class="kt">$c_Lhcavsc_av_Label</span><span class="o">()</span><span class="kt">.init___T</span><span class="o">(</span><span class="err">"</span><span class="kt">hi</span><span class="err">"</span><span class="o">)];</span>
</pre></div>
<p>Surely it doesn’t have to for our tag-like <code>Label</code>. And indeed it
doesn’t.</p>
<div class="codehilite language-scala"><pre><span></span><span class="k">val</span> <span class="n">h</span> <span class="k">=</span> <span class="nc">Label</span><span class="o">(</span><span class="s">"hi"</span><span class="o">)</span>
<span class="c1">// compiles to</span>
<span class="k">var</span> <span class="n">h</span> <span class="k">=</span> <span class="s">"hi"</span><span class="o">;</span>
<span class="c1">// fastOptJS is smart enough to know</span>
<span class="c1">// that apply can be elided</span>
<span class="k">val</span> <span class="n">hs</span> <span class="k">=</span> <span class="n">js</span><span class="o">.</span><span class="nc">Array</span><span class="o">(</span><span class="n">h</span><span class="o">,</span> <span class="n">h</span><span class="o">)</span>
<span class="c1">// compiles to</span>
<span class="k">var</span> <span class="n">hs</span> <span class="k">=</span> <span class="o">[</span><span class="kt">h</span>, <span class="kt">h</span><span class="o">];</span>
<span class="k">val</span> <span class="n">strs</span> <span class="k">=</span> <span class="nc">Label</span><span class="o">.</span><span class="n">subst</span><span class="o">[</span><span class="kt">Lambda</span><span class="o">[</span><span class="kt">x</span> <span class="k">=></span> <span class="kt">js.Array</span><span class="o">[</span><span class="kt">x</span><span class="o">]</span> <span class="k">=></span> <span class="kt">js.Array</span><span class="o">[</span><span class="kt">String</span><span class="o">]]](</span><span class="n">identity</span><span class="o">)(</span><span class="n">hs</span><span class="o">)</span>
<span class="n">strs</span><span class="o">(</span><span class="mi">0</span><span class="o">)</span> <span class="o">+</span> <span class="n">strs</span><span class="o">(</span><span class="mi">1</span><span class="o">)</span>
<span class="c1">// compiles to</span>
<span class="o">((</span><span class="s">""</span> <span class="o">+</span> <span class="nc">$as_T</span><span class="o">(</span><span class="n">hs</span><span class="o">[</span><span class="err">0</span><span class="o">]))</span> <span class="o">+</span> <span class="n">hs</span><span class="o">[</span><span class="err">1</span><span class="o">])</span>
<span class="c1">// fastOptJS is smart enough to know</span>
<span class="c1">// that subst, too, can be elided</span>
</pre></div>
<p>The possible existence of <code>subst</code> tells us something about the deeper
meaning of our abstract type definition, <code>type T = String</code>, that holds
true no matter how much of this equality we hide behind existential
layers. It is this: the compiler cannot predict when the fact that <code>T
= String</code> will be visible, and when it will not be. It must therefore
not generate code that would “go wrong” in contexts where this is
revealed.</p>
<p>For example, at one point, we saw that</p>
<div class="codehilite language-scala"><pre><span></span><span class="nc">Label</span><span class="o">.</span><span class="n">subst</span><span class="o">(</span><span class="nc">Monoid</span><span class="o">[</span><span class="kt">String</span><span class="o">])</span>
</pre></div>
<p>would indeed produce a suitable <code>Monoid[Label]</code>. This means not
only is the value’s type reinterpreted, but also, by consequence, its
members.</p>
<div class="codehilite language-scala"><pre><span></span><span class="n">scala</span><span class="o">></span> <span class="k">val</span> <span class="n">labelMonoid</span> <span class="k">=</span> <span class="nc">Label</span><span class="o">.</span><span class="n">subst</span><span class="o">(</span><span class="nc">Monoid</span><span class="o">[</span><span class="kt">String</span><span class="o">])</span>
<span class="n">labelMonoid</span><span class="k">:</span> <span class="kt">scalaz.Monoid</span><span class="o">[</span><span class="kt">Label.T</span><span class="o">]</span> <span class="k">=</span> <span class="n">scalaz</span><span class="o">.</span><span class="n">std</span><span class="o">.</span><span class="nc">StringInstances$stringInstance</span><span class="n">$</span><span class="k">@</span><span class="mi">6</span><span class="n">f612117</span>
<span class="n">scala</span><span class="o">></span> <span class="n">labelMonoid</span><span class="o">.</span><span class="n">zero</span>
<span class="n">res0</span><span class="k">:</span> <span class="kt">hcavsc.subst.Labels.Label.T</span> <span class="o">=</span> <span class="s">""</span>
<span class="n">scala</span><span class="o">></span> <span class="n">labelMonoid</span><span class="o">.</span><span class="n">append</span> <span class="k">_</span>
<span class="n">res1</span><span class="k">:</span> <span class="o">(</span><span class="kt">Label.T</span><span class="o">,</span> <span class="o">=></span> <span class="nc">Label</span><span class="o">.</span><span class="n">T</span><span class="o">)</span> <span class="k">=></span> <span class="nc">Label</span><span class="o">.</span><span class="n">T</span> <span class="k">=</span> <span class="nc">$$Lambda$3184</span><span class="o">/</span><span class="mi">987934553</span><span class="k">@</span><span class="mi">3</span><span class="n">af2619b</span>
</pre></div>
<p>However, in <code>subst</code>, we have charged the compiler with doing this
arbitrarily complex substitution with 100% accuracy and in constant
time. There are no opportunities to generate “wrappers”, not for these
structures that merely employ <code>Label</code> in their types. And, by
consequence, there’s nowhere to put code that would use some means to
treat <code>Label</code> and <code>String</code> differently based on runtime choices.</p>
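<p>To make the constant-time claim concrete, here is a minimal,
self-contained sketch. It assumes the first-order signature
<code>def subst[F[_]](fs: F[String]): F[T]</code> for <code>Label</code>;
the names <code>SubstSketch</code>, <code>strs</code>, <code>lbls</code>,
<code>strLength</code> and <code>lblLength</code> are made up for
illustration and are not part of the code tested for this article.</p>
<div class="codehilite language-scala"><pre><span></span>// Only needed to silence a feature warning on Scala 2.12.
import scala.language.higherKinds

object SubstSketch {
  sealed abstract class LabelImpl {
    type T
    def apply(s: String): T
    def subst[F[_]](fs: F[String]): F[T]
  }

  val Label: LabelImpl = new LabelImpl {
    type T = String
    override def apply(s: String): T = s
    // The whole implementation: hand back the very same reference.
    override def subst[F[_]](fs: F[String]): F[T] = fs
  }
  type Label = Label.T

  val strs: List[String] = List("hello", "world")
  // Retype the whole list at once: no traversal, no per-element wrapper.
  val lbls: List[Label] = Label.subst[List](strs)

  // Functions are retyped the same way, here via a plain type lambda.
  val strLength: String => Int = _.length
  val lblLength: Label => Int =
    Label.subst[({ type L[x] = x => Int })#L](strLength)
}
</pre></div>
<p>In this sketch, <code>lbls eq strs</code> and <code>lblLength eq
strLength</code> both hold: the retyped values are the original
references, so there is no place where a wrapper could have been
slipped in.</p>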
<p>If you wish to automatically add “wrappers”, you have a difficult
problem already with parametric polymorphism. With higher-kinded
types, you have an intractable problem.</p>
<h2 id="markdown-header-speaking-of-higher-kinded-types">Speaking of higher-kinded types…</h2>
<p>Type tagging works perfectly well with parameterized types.</p>
<div class="codehilite language-scala"><pre><span></span><span class="k">type</span> <span class="kt">KWConcrete</span><span class="o">[</span><span class="kt">W</span>, <span class="kt">A</span>, <span class="kt">B</span><span class="o">]</span> <span class="k">=</span> <span class="nc">Kleisli</span><span class="o">[(</span><span class="kt">W</span>, <span class="kt">?</span><span class="o">)</span>, <span class="kt">A</span>, <span class="kt">B</span><span class="o">]</span>
<span class="k">sealed</span> <span class="k">abstract</span> <span class="k">class</span> <span class="nc">KWImpl</span> <span class="o">{</span>
<span class="k">type</span> <span class="kt">T</span><span class="o">[</span><span class="kt">W</span>, <span class="kt">A</span>, <span class="kt">B</span><span class="o">]</span>
<span class="k">def</span> <span class="n">subst</span><span class="o">[</span><span class="kt">F</span><span class="o">[</span><span class="k">_</span><span class="o">[</span><span class="k">_</span>, <span class="k">_</span>, <span class="k">_</span><span class="o">]]](</span><span class="n">fk</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">KWConcrete</span><span class="o">])</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">T</span><span class="o">]</span>
<span class="o">}</span>
<span class="k">val</span> <span class="nc">KW</span><span class="k">:</span> <span class="kt">KWImpl</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">KWImpl</span> <span class="o">{</span>
<span class="k">type</span> <span class="kt">T</span><span class="o">[</span><span class="kt">W</span>, <span class="kt">A</span>, <span class="kt">B</span><span class="o">]</span> <span class="k">=</span> <span class="nc">KWConcrete</span><span class="o">[</span><span class="kt">W</span>, <span class="kt">A</span>, <span class="kt">B</span><span class="o">]</span>
<span class="k">override</span> <span class="k">def</span> <span class="n">subst</span><span class="o">[</span><span class="kt">F</span><span class="o">[</span><span class="k">_</span><span class="o">[</span><span class="k">_</span>, <span class="k">_</span>, <span class="k">_</span><span class="o">]]](</span><span class="n">fk</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">KWConcrete</span><span class="o">])</span> <span class="k">=</span> <span class="n">fk</span>
<span class="o">}</span>
<span class="k">type</span> <span class="kt">KW</span><span class="o">[</span><span class="kt">W</span>, <span class="kt">A</span>, <span class="kt">B</span><span class="o">]</span> <span class="k">=</span> <span class="nc">KW</span><span class="o">.</span><span class="n">T</span><span class="o">[</span><span class="kt">W</span>, <span class="kt">A</span>, <span class="kt">B</span><span class="o">]</span>
</pre></div>
<p>This is nice for a few reasons.</p>
<ol>
<li>You can still “add a type parameter” to do abstraction on your
tagged types.</li>
<li>You can hide much of the complexity of a monad transformer stack,
allowing it to infer more
easily
<a href="http://typelevel.org/blog/2013/09/11/using-scalaz-Unapply.html">with <code>Unapply</code></a> or
<a href="http://www.scala-lang.org/news/2.12.0#partial-unification-for-type-constructor-inference"><code>-Ypartial-unification</code></a>. This
is because, unlike standalone type aliases, <code>scalac</code> can’t dealias
your abstraction away. (Warning: this doesn’t apply if you make the
type <code>T</code> “translucent”; hide your types to keep them safe from
<code>scalac</code>’s prying expander.)</li>
<li>You can use <code>subst</code>
to
<a href="https://downloads.haskell.org/~ghc/8.0.2/docs/html/users_guide/glasgow_exts.html#generalised-derived-instances-for-newtypes">“GND”</a> your
<code>Monad</code> and other typeclass instances.</li>
</ol>
<div class="codehilite language-scala"><pre><span></span><span class="k">implicit</span> <span class="k">def</span> <span class="n">monadKW</span><span class="o">[</span><span class="kt">W:</span> <span class="kt">Monoid</span>, <span class="kt">A</span><span class="o">]</span><span class="k">:</span> <span class="kt">Monad</span><span class="o">[</span><span class="kt">KW</span><span class="o">[</span><span class="kt">W</span>, <span class="kt">A</span>, <span class="kt">?</span><span class="o">]]</span> <span class="k">=</span> <span class="o">{</span>
<span class="k">type</span> <span class="kt">MF</span><span class="o">[</span><span class="kt">KWC</span><span class="o">[</span><span class="k">_</span>, <span class="k">_</span>, <span class="k">_</span><span class="o">]]</span> <span class="k">=</span> <span class="nc">Monad</span><span class="o">[</span><span class="kt">KWC</span><span class="o">[</span><span class="kt">W</span>, <span class="kt">A</span>, <span class="kt">?</span><span class="o">]]</span>
<span class="c1">// KW.subst[MF](implicitly) with better inference</span>
<span class="nc">KW</span><span class="o">.</span><span class="n">subst</span><span class="o">[</span><span class="kt">MF</span><span class="o">](</span><span class="nc">Kleisli</span><span class="o">.</span><span class="n">kleisliMonadReader</span><span class="o">[(</span><span class="kt">W</span>, <span class="kt">?</span><span class="o">)</span>, <span class="kt">A</span><span class="o">])</span>
<span class="o">}</span>
</pre></div>
<p><a href="/2016/12/tagless-final-effects-la-ermine-writers.html">“Tagless final effects à la Ermine Writers”</a> develops
this kind of type abstraction in another direction.</p>
<p>For the derivation of <code>subst</code>’s weird signature above,
see
<a href="http://typelevel.org/blog/2014/09/20/higher_leibniz.html#higher-kinded-leibniz">“Higher Leibniz”</a>.</p>
<h2 id="markdown-header-why-is-the-labelimpl-ascription-so-important">Why is the <code>: LabelImpl</code> ascription so important?</h2>
<p>Suppose that you ignored my comments and defined the concrete
<code>LabelImpl</code> without an ascription.</p>
<div class="codehilite language-scala"><pre><span></span><span class="k">val</span> <span class="nc">Label</span> <span class="k">=</span> <span class="k">new</span> <span class="nc">LabelImpl</span> <span class="o">{</span>
<span class="c1">// ...implementation continues as before</span>
</pre></div>
<p>Then, the abstraction would disappear; you would no longer have a “new
type”.</p>
<div class="codehilite language-scala"><pre><span></span><span class="n">scala</span><span class="o">></span> <span class="k">val</span> <span class="n">lbl</span><span class="k">:</span> <span class="kt">Label</span> <span class="o">=</span> <span class="s">"hi"</span>
<span class="n">lbl</span><span class="k">:</span> <span class="kt">Label</span> <span class="o">=</span> <span class="n">hi</span>
<span class="n">scala</span><span class="o">></span> <span class="n">lbl</span><span class="k">:</span> <span class="kt">String</span>
<span class="n">res0</span><span class="k">:</span> <span class="kt">String</span> <span class="o">=</span> <span class="n">hi</span>
<span class="n">scala</span><span class="o">></span> <span class="n">implicitly</span><span class="o">[</span><span class="kt">Label</span> <span class="kt">=:=</span> <span class="kt">String</span><span class="o">]</span>
<span class="n">res1</span><span class="k">:</span> <span class="o">=:=[</span><span class="kt">Label</span>,<span class="kt">String</span><span class="o">]</span> <span class="k">=</span> <span class="o"><</span><span class="n">function1</span><span class="o">></span>
</pre></div>
<p>Why did it break so hard? Well, the inferred type of <code>val Label</code> is
different from the one you were ascribing.</p>
<div class="codehilite language-scala"><pre><span></span><span class="n">scala</span><span class="o">></span> <span class="nc">Label</span>
<span class="n">res2</span><span class="k">:</span> <span class="kt">LabelImpl</span><span class="o">{</span><span class="k">type</span> <span class="kt">T</span> <span class="o">=</span> <span class="kt">String</span><span class="o">}</span> <span class="k">=</span> <span class="n">hcavsc</span><span class="o">.</span><span class="n">broken</span><span class="o">.</span><span class="nc">Labels$$anon$1</span><span class="k">@</span><span class="mi">48</span><span class="n">cd7b32</span>
</pre></div>
<p>That means that <code>Label.T</code> is no longer <strong>existential</strong>; it’s known,
and known to be <code>String</code>. Accordingly, type <code>Label</code> <em>also</em> expands to
<code>String</code>, <em>and vice versa</em>.</p>
<p>If you want it to be a new type, you must keep it existential.</p>
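<p>The difference can be seen side by side in a small sketch. Rather
than relying on inference, it writes out the refinement
<code>LabelImpl { type T = String }</code> by hand for the leaky
variant; the names <code>AscriptionSketch</code>, <code>Opaque</code>
and <code>Transparent</code> are hypothetical and exist only for this
comparison.</p>
<div class="codehilite language-scala"><pre><span></span>object AscriptionSketch {
  sealed abstract class LabelImpl {
    type T
    def apply(s: String): T
    def unwrap(lbl: T): String
  }

  // Ascribed with plain LabelImpl: T stays abstract to everyone else.
  val Opaque: LabelImpl = new LabelImpl {
    type T = String
    override def apply(s: String): T = s
    override def unwrap(lbl: T): String = lbl
  }

  // Ascribed with the refinement that scalac 2.12 reported for the
  // unascribed val: T is now publicly known to be String.
  val Transparent: LabelImpl { type T = String } = new LabelImpl {
    type T = String
    override def apply(s: String): T = s
    override def unwrap(lbl: T): String = lbl
  }

  // The refinement leaks: a Transparent.T is usable directly as a String.
  val leaked: String = Transparent("hi")

  val hidden: Opaque.T = Opaque("hi")
  // val broken: String = hidden // rejected: Opaque.T is not known to be String
  val recovered: String = Opaque.unwrap(hidden)
}
</pre></div>
<p>Only the ascribed type differs between the two definitions; the
implementations are identical, which is why the ascription alone
decides whether you get a new type.</p>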
<h2 id="markdown-header-some-background">Some background</h2>
<p>The unboxed tagging technique is based
on <a href="https://github.com/scalaz/scalaz/pull/1306">cast-free type tags</a>
in the upcoming Scalaz 7.3.0. That, in turn, was based
on
<a href="https://bitbucket.org/ermine-language/ermine-scala/pull-requests/1/remove-dynamicf-from-f0-formats">use of existential types in Ermine's implementation</a> to
hide expansions from <code>scalac</code>.</p>
<p>This is also a specialization of the type-member based MTL encoding I
used
in
<a href="/2016/12/tagless-final-effects-la-ermine-writers.html">"Tagless final effects à la Ermine Writers"</a>. The
essential difference is that individual program elements were
universally quantified over the expansion of the abstract type, where
here, the entire program is universally quantified over that
expansion, because the existential quantifier is globally bound.</p>
<p>I’m certainly not the first person to explore this technique; for
example,
<a href="https://groups.google.com/d/msg/scala-user/GvzJAUDq1e0/mdk8MdfTAwAJ">Julian Michael wrote about it</a> several
months before this article.</p>
<p>And, of course, if you are an ML (OCaml, SML, &c) fan, you’re probably
thinking “yeah, so what? <a href="https://realworldocaml.org/v1/en/html/files-modules-and-programs.html#nested-modules">I do this all the time.</a>” Sorry. We can be a
little slow on the uptake in Scala world, where we greatly undervalue
the ideas of the functional languages before us.</p>
<p><em>This article was tested with Scala 2.12.1, Scalaz 7.2.10, Scala.js
0.6.13, and Kind Projector 0.9.3. The code
is
<a href="https://code.launchpad.net/~scompall/+junk/high-cost-of-anyval-subclasses">available in compilable form for your own experiments via Bazaar</a>.</em></p>
<p>Posted by <a href="http://www.blogger.com/profile/13346366374080232972">Stephen Compall</a>.</p>
<h1>The High Cost of AnyVal subclasses...</h1>
<p>Published 2017-04-09; updated 2018-02-10.</p>
<p>The claim of a multi-paradigm language is to harmoniously serve
various approaches to programming. The <code>AnyVal</code> subclass feature forms
a strong counterargument to Scala’s multiparadigm claim.</p>
<p><code>AnyVal</code> subclasses penalize parametric-polymorphic, type-safe
programming, in order to better support type-unsafe programming
styles, such as those making use of <code>isInstanceOf</code>. They sneakily
shift the blame for their performance problems onto type safety and
polymorphism. I will provide an existence proof that the blame ought
to land squarely on <code>AnyVal</code> subclasses, but I cannot stop this
blame-shifting from lending further credence
to
<a href="https://www.reddit.com/r/haskell/comments/1pjjy5/odersky_the_trouble_with_types_strange_loop_2013/cd3bgcu/">the witticism “If scala was the only language I had to think in, I’d think functional programming was a bad idea that didn’t scale, too.”</a></p>
<p>Moreover, by creating the false impression that the “newtype problem”
has been solved in Scala, <code>AnyVal</code> subclasses obscure solutions that
better serve polymorphic, type-safe programming. While I describe such
a solution in this article, I have no illusions that I alone can
reverse the upward trend of the <code>AnyVal</code> meme.</p>
<p>Scala, today, has the potential to better support type-safe
programming, and it has since before the advent of <code>AnyVal</code>
subclasses. In this article, we will focus on how the language could
reveal this potential, becoming a better foundation for polymorphic,
type-safe programming than it advertises today.</p>
<h2 id="markdown-header-a-string-reference-must-be-boxed">A <code>String</code> reference must be boxed</h2>
<p>Suppose that you want a “wrapper” around <code>String</code>s with a unique type
so that they can’t be accidentally confused with arbitrary
<code>String</code>s. This is a common use case for a <em>newtype</em>, a wrapper with
an intentionally incompatible type that exists only at compile time. (The
name <a href="https://wiki.haskell.org/Newtype">“newtype”</a> comes from the
Haskell keyword for its version of this feature.)</p>
<p>You decide to use <code>extends AnyVal</code>, since you have heard that this is
a compile-time-only class that doesn’t get allocated on the heap.</p>
<div class="codehilite language-scala"><pre><span></span><span class="k">class</span> <span class="nc">Label</span><span class="o">(</span><span class="k">val</span> <span class="n">str</span><span class="k">:</span> <span class="kt">String</span><span class="o">)</span> <span class="k">extends</span> <span class="nc">AnyVal</span>
<span class="k">object</span> <span class="nc">Label</span> <span class="o">{</span>
<span class="k">def</span> <span class="n">apply</span><span class="o">(</span><span class="n">s</span><span class="k">:</span> <span class="kt">String</span><span class="o">)</span><span class="k">:</span> <span class="kt">Label</span> <span class="o">=</span>
<span class="k">new</span> <span class="nc">Label</span><span class="o">(</span><span class="n">s</span><span class="o">)</span>
<span class="o">}</span>
</pre></div>
<p>This seems to do the trick with your first several tests.</p>
<div class="codehilite language-scala"><pre><span></span><span class="k">class</span> <span class="nc">MyFirstTests</span> <span class="o">{</span>
<span class="k">def</span> <span class="n">combineLabels</span><span class="o">(</span><span class="n">l</span><span class="k">:</span> <span class="kt">Label</span><span class="o">,</span> <span class="n">r</span><span class="k">:</span> <span class="kt">Label</span><span class="o">)</span><span class="k">:</span> <span class="kt">Label</span> <span class="o">=</span>
<span class="nc">Label</span><span class="o">(</span><span class="n">l</span><span class="o">.</span><span class="n">str</span> <span class="o">+</span> <span class="n">r</span><span class="o">.</span><span class="n">str</span><span class="o">)</span>
<span class="k">def</span> <span class="n">printLabels</span><span class="o">()</span><span class="k">:</span> <span class="kt">Unit</span> <span class="o">=</span> <span class="o">{</span>
<span class="k">val</span> <span class="n">fst</span> <span class="k">=</span> <span class="nc">Label</span><span class="o">(</span><span class="s">"hello"</span><span class="o">)</span>
<span class="k">val</span> <span class="n">snd</span> <span class="k">=</span> <span class="nc">Label</span><span class="o">(</span><span class="s">"world"</span><span class="o">)</span>
<span class="n">println</span><span class="o">(</span><span class="n">fst</span><span class="o">.</span><span class="n">str</span><span class="o">)</span>
<span class="n">println</span><span class="o">(</span><span class="n">snd</span><span class="o">.</span><span class="n">str</span><span class="o">)</span>
<span class="o">}</span>
<span class="o">}</span>
</pre></div>
<p>As reported by <code>javap</code>, the <code>new Label</code> goes away for <code>Label.apply</code>.</p>
<div class="codehilite language-java"><pre><span></span><span class="c1">// javap -c -cp target/scala-2.12/classes hcavsc.av.Label$</span>
<span class="kd">public</span> <span class="n">java</span><span class="o">.</span><span class="na">lang</span><span class="o">.</span><span class="na">String</span> <span class="nf">apply</span><span class="o">(</span><span class="n">java</span><span class="o">.</span><span class="na">lang</span><span class="o">.</span><span class="na">String</span><span class="o">);</span>
<span class="n">Code</span><span class="o">:</span>
<span class="mi">0</span><span class="o">:</span> <span class="n">aload_1</span>
<span class="mi">1</span><span class="o">:</span> <span class="n">areturn</span>
</pre></div>
<p>It vanishes for the signature of <code>combineLabels</code> too, meaning that we
can write some functions over <code>Label</code>s without allocating them.</p>
<div class="codehilite language-java"><pre><span></span><span class="c1">// javap -cp target/scala-2.12/classes hcavsc.av.MyFirstTests</span>
<span class="kd">public</span> <span class="n">java</span><span class="o">.</span><span class="na">lang</span><span class="o">.</span><span class="na">String</span> <span class="nf">combineLabels</span><span class="o">(</span><span class="n">java</span><span class="o">.</span><span class="na">lang</span><span class="o">.</span><span class="na">String</span><span class="o">,</span> <span class="n">java</span><span class="o">.</span><span class="na">lang</span><span class="o">.</span><span class="na">String</span><span class="o">);</span>
</pre></div>
<p>You can even use <code>Label</code> in a <code>case class</code>, and it will be <code>String</code> at
runtime.</p>
<div class="codehilite language-scala"><pre><span></span><span class="k">case</span> <span class="k">class</span> <span class="nc">Labelled</span><span class="o">[</span><span class="kt">A</span><span class="o">](</span><span class="n">lbl</span><span class="k">:</span> <span class="kt">Label</span><span class="o">,</span> <span class="n">a</span><span class="k">:</span> <span class="kt">A</span><span class="o">)</span>
<span class="c1">// javap -p -cp target/scala-2.12/classes hcavsc.av.Labelled</span>
<span class="k">private</span> <span class="k">final</span> <span class="n">java</span><span class="o">.</span><span class="n">lang</span><span class="o">.</span><span class="nc">String</span> <span class="n">lbl</span><span class="o">;</span>
<span class="k">private</span> <span class="k">final</span> <span class="n">A</span> <span class="n">a</span><span class="o">;</span>
</pre></div>
<p>But then, you decide that you want a <code>List</code> of <code>Label</code>s.</p>
<div class="codehilite language-scala"><pre><span></span><span class="c1">// add to printLabels</span>
<span class="k">val</span> <span class="n">lbls</span> <span class="k">=</span> <span class="nc">List</span><span class="o">(</span><span class="n">fst</span><span class="o">,</span> <span class="n">snd</span><span class="o">)</span>
<span class="c1">// javap -c -cp target/scala-2.12/classes hcavsc.av.MyFirstTests</span>
<span class="mi">24</span><span class="k">:</span> <span class="kt">iconst_2</span>
<span class="mi">25</span><span class="k">:</span> <span class="kt">anewarray</span> <span class="k">#</span><span class="err">56</span> <span class="c1">// class hcavsc/av/Label</span>
<span class="mi">28</span><span class="k">:</span> <span class="kt">dup</span>
<span class="mi">29</span><span class="k">:</span> <span class="kt">iconst_0</span>
<span class="mi">30</span><span class="k">:</span> <span class="kt">new</span> <span class="k">#</span><span class="err">56</span> <span class="c1">// class hcavsc/av/Label</span>
<span class="mi">33</span><span class="k">:</span> <span class="kt">dup</span>
<span class="mi">34</span><span class="k">:</span> <span class="kt">aload_1</span>
<span class="mi">35</span><span class="k">:</span> <span class="kt">invokespecial</span> <span class="k">#</span><span class="err">59</span> <span class="c1">// Method hcavsc/av/Label."<init>":(Ljava/lang/String;)V</span>
<span class="mi">38</span><span class="k">:</span> <span class="kt">aastore</span>
<span class="mi">39</span><span class="k">:</span> <span class="kt">dup</span>
<span class="mi">40</span><span class="k">:</span> <span class="kt">iconst_1</span>
<span class="mi">41</span><span class="k">:</span> <span class="kt">new</span> <span class="k">#</span><span class="err">56</span> <span class="c1">// class hcavsc/av/Label</span>
<span class="mi">44</span><span class="k">:</span> <span class="kt">dup</span>
<span class="mi">45</span><span class="k">:</span> <span class="kt">aload_2</span>
<span class="mi">46</span><span class="k">:</span> <span class="kt">invokespecial</span> <span class="k">#</span><span class="err">59</span> <span class="c1">// Method hcavsc/av/Label."<init>":(Ljava/lang/String;)V</span>
<span class="mi">49</span><span class="k">:</span> <span class="kt">aastore</span>
<span class="mi">50</span><span class="k">:</span> <span class="kt">invokevirtual</span> <span class="k">#</span><span class="err">63</span> <span class="c1">// Method scala/Predef$.genericWrapArray:(Ljava/lang/Object;)Lscala/collection/mutable/WrappedArray;</span>
<span class="mi">53</span><span class="k">:</span> <span class="kt">invokevirtual</span> <span class="k">#</span><span class="err">66</span> <span class="c1">// Method scala/collection/immutable/List$.apply:(Lscala/collection/Seq;)Lscala/collection/immutable/List;</span>
</pre></div>
<p>Huh. Didn’t expect those two <code>new</code>s to be there. Ah well, maybe now
that they’re in the list,</p>
<div class="codehilite language-scala"><pre><span></span><span class="n">lbls</span><span class="o">.</span><span class="n">map</span><span class="o">{</span><span class="n">x</span> <span class="k">=></span> <span class="nc">Label</span><span class="o">(</span><span class="n">x</span><span class="o">.</span><span class="n">str</span> <span class="o">+</span> <span class="s">"Aux"</span><span class="o">)}</span>
<span class="c1">// javap -c -cp target/scala-2.12/classes hcavsc.av.MyFirstTests</span>
<span class="n">public</span> <span class="n">static</span> <span class="k">final</span> <span class="n">java</span><span class="o">.</span><span class="n">lang</span><span class="o">.</span><span class="nc">Object</span> <span class="nc">$anonfun$printLabels$1$adapted</span><span class="o">(</span><span class="n">java</span><span class="o">.</span><span class="n">lang</span><span class="o">.</span><span class="nc">Object</span><span class="o">);</span>
<span class="nc">Code</span><span class="k">:</span>
<span class="err">0</span><span class="kt">:</span> <span class="kt">new</span> <span class="k">#</span><span class="err">61</span> <span class="c1">// class hcavsc/av/Label</span>
<span class="mi">3</span><span class="k">:</span> <span class="kt">dup</span>
<span class="mi">4</span><span class="k">:</span> <span class="kt">aload_0</span>
<span class="mi">5</span><span class="k">:</span> <span class="kt">checkcast</span> <span class="k">#</span><span class="err">61</span> <span class="c1">// class hcavsc/av/Label</span>
<span class="mi">8</span><span class="k">:</span> <span class="kt">invokevirtual</span> <span class="k">#</span><span class="err">117</span> <span class="c1">// Method hcavsc/av/Label.str:()Ljava/lang/String;</span>
<span class="mi">11</span><span class="k">:</span> <span class="kt">invokestatic</span> <span class="k">#</span><span class="err">119</span> <span class="c1">// Method $anonfun$printLabels$1:(Ljava/lang/String;)Ljava/lang/String;</span>
<span class="mi">14</span><span class="k">:</span> <span class="kt">invokespecial</span> <span class="k">#</span><span class="err">64</span> <span class="c1">// Method hcavsc/av/Label."<init>":(Ljava/lang/String;)V</span>
<span class="mi">17</span><span class="k">:</span> <span class="kt">areturn</span>
</pre></div>
<p>OK, sure, so you took it out and put it back, so it unboxed and then
boxed again. How about a tuple, instead?</p>
<div class="codehilite language-scala"><pre><span></span><span class="c1">// add to printLabels</span>
<span class="o">(</span><span class="n">fst</span><span class="o">,</span> <span class="n">snd</span><span class="o">)</span>
<span class="c1">// javap -c -cp target/scala-2.12/classes hcavsc.av.MyFirstTests</span>
<span class="mi">73</span><span class="k">:</span> <span class="kt">new</span> <span class="k">#</span><span class="err">103</span> <span class="c1">// class scala/Tuple2</span>
<span class="mi">76</span><span class="k">:</span> <span class="kt">dup</span>
<span class="mi">77</span><span class="k">:</span> <span class="kt">new</span> <span class="k">#</span><span class="err">61</span> <span class="c1">// class hcavsc/av/Label</span>
<span class="mi">80</span><span class="k">:</span> <span class="kt">dup</span>
<span class="mi">81</span><span class="k">:</span> <span class="kt">aload_1</span>
<span class="mi">82</span><span class="k">:</span> <span class="kt">invokespecial</span> <span class="k">#</span><span class="err">64</span> <span class="c1">// Method hcavsc/av/Label."<init>":(Ljava/lang/String;)V</span>
<span class="mi">85</span><span class="k">:</span> <span class="kt">new</span> <span class="k">#</span><span class="err">61</span> <span class="c1">// class hcavsc/av/Label</span>
<span class="mi">88</span><span class="k">:</span> <span class="kt">dup</span>
<span class="mi">89</span><span class="k">:</span> <span class="kt">aload_2</span>
<span class="mi">90</span><span class="k">:</span> <span class="kt">invokespecial</span> <span class="k">#</span><span class="err">64</span> <span class="c1">// Method hcavsc/av/Label."<init>":(Ljava/lang/String;)V</span>
<span class="mi">93</span><span class="k">:</span> <span class="kt">invokespecial</span> <span class="k">#</span><span class="err">106</span> <span class="c1">// Method scala/Tuple2."<init>":(Ljava/lang/Object;Ljava/lang/Object;)V</span>
</pre></div>
<p>Two more <code>new</code>s. Fine. How about the <code>identity</code> method?</p>
<div class="codehilite language-scala"><pre><span></span><span class="c1">// add to printLabels</span>
<span class="n">identity</span><span class="o">(</span><span class="n">fst</span><span class="o">)</span>
<span class="c1">// javap -c -cp target/scala-2.12/classes hcavsc.av.MyFirstTests</span>
<span class="mi">97</span><span class="k">:</span> <span class="kt">getstatic</span> <span class="k">#</span><span class="err">59</span> <span class="c1">// Field scala/Predef$.MODULE$:Lscala/Predef$;</span>
<span class="mi">100</span><span class="k">:</span> <span class="kt">new</span> <span class="k">#</span><span class="err">61</span> <span class="c1">// class hcavsc/av/Label</span>
<span class="mi">103</span><span class="k">:</span> <span class="kt">dup</span>
<span class="mi">104</span><span class="k">:</span> <span class="kt">aload_1</span>
<span class="mi">105</span><span class="k">:</span> <span class="kt">invokespecial</span> <span class="k">#</span><span class="err">64</span> <span class="c1">// Method hcavsc/av/Label."<init>":(Ljava/lang/String;)V</span>
<span class="mi">108</span><span class="k">:</span> <span class="kt">invokevirtual</span> <span class="k">#</span><span class="err">109</span> <span class="c1">// Method scala/Predef$.identity:(Ljava/lang/Object;)Ljava/lang/Object;</span>
</pre></div>
<p>So there seems to be an impressive collection of things that will
cause an <code>AnyVal</code> subclass to box. You assume there’s a good reason
they implemented it this way;
we’ll
<a href="#markdown-header-what-can-you-do-with-a-box-what-can-you-do-without-a-box">get into that later</a>.</p>
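<p>If you would rather not read bytecode, the same boxing can be
observed at runtime. The following is a minimal sketch, not part of the
code tested for this article; <code>BoxingSketch</code> and
<code>runtimeClassOf</code> are hypothetical names. It relies only on
the fact that a value passed at an erased type parameter has to arrive
as an object reference.</p>
<div class="codehilite language-scala"><pre><span></span>object BoxingSketch {
  class Label(val str: String) extends AnyVal

  // A is erased, so whatever is passed here must be a real object.
  def runtimeClassOf[A](a: A): Class[_] =
    a.asInstanceOf[AnyRef].getClass

  // Passing a Label at type A forces allocation of the Label box...
  val viaLabel: Class[_] = runtimeClassOf(new Label("hello"))
  // ...whereas a bare String is passed as itself.
  val viaString: Class[_] = runtimeClassOf("hello")
}
</pre></div>
<p>Here <code>viaLabel</code> is <code>classOf[BoxingSketch.Label]</code>
rather than <code>classOf[String]</code>, the runtime counterpart of the
<code>new</code> instructions in the listings above.</p>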
<h2 id="markdown-header-no-boxing-with-type-tags">No boxing with type tags</h2>
<p>However, you decide to look for an alternative <code>newtype</code> mechanism
that doesn’t box, under the theory that <code>scalac</code>’s reasons for boxing
<code>AnyVal</code> subclasses don’t apply to the use cases you have in mind for
<code>Label</code> and similar things in your codebase.</p>
<p>You have heard that Scalaz’s “type tags” are a kind of newtype with no
boxing. You could just pull in <code>scalaz-core</code> and see if you can get
them to work, but decide to implement <code>Label</code> directly using the same
technique as Scalaz tags, instead.</p>
<div class="codehilite language-scala"><pre><span></span><span class="k">object</span> <span class="nc">Labels</span> <span class="o">{</span>
<span class="k">sealed</span> <span class="k">abstract</span> <span class="k">class</span> <span class="nc">LabelImpl</span> <span class="o">{</span>
<span class="k">type</span> <span class="kt">T</span>
<span class="k">def</span> <span class="n">apply</span><span class="o">(</span><span class="n">s</span><span class="k">:</span> <span class="kt">String</span><span class="o">)</span><span class="k">:</span> <span class="kt">T</span>
<span class="k">def</span> <span class="n">unwrap</span><span class="o">(</span><span class="n">lbl</span><span class="k">:</span> <span class="kt">T</span><span class="o">)</span><span class="k">:</span> <span class="kt">String</span>
<span class="o">}</span>
<span class="c1">// do not forget `: LabelImpl`; it is key</span>
<span class="k">val</span> <span class="nc">Label</span><span class="k">:</span> <span class="kt">LabelImpl</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">LabelImpl</span> <span class="o">{</span>
<span class="k">type</span> <span class="kt">T</span> <span class="o">=</span> <span class="nc">String</span>
<span class="k">override</span> <span class="k">def</span> <span class="n">apply</span><span class="o">(</span><span class="n">s</span><span class="k">:</span> <span class="kt">String</span><span class="o">)</span> <span class="k">=</span> <span class="n">s</span>
<span class="k">override</span> <span class="k">def</span> <span class="n">unwrap</span><span class="o">(</span><span class="n">lbl</span><span class="k">:</span> <span class="kt">T</span><span class="o">)</span> <span class="k">=</span> <span class="n">lbl</span>
<span class="o">}</span>
<span class="k">type</span> <span class="kt">Label</span> <span class="o">=</span> <span class="nc">Label</span><span class="o">.</span><span class="n">T</span>
<span class="o">}</span>
<span class="k">import</span> <span class="nn">Labels._</span>
</pre></div>
<p>While regretting that the compiler no longer makes your <code>Label</code> type
very convenient to define, you press on. First, to confirm, you can’t
treat an arbitrary <code>String</code> as a <code>Label</code>:</p>
<div class="codehilite language-scala"><pre><span></span><span class="n">scala</span><span class="o">></span> <span class="s">"hi there"</span><span class="k">:</span> <span class="kt">Label</span>
<span class="o"><</span><span class="n">console</span><span class="k">>:</span><span class="mi">15</span><span class="k">:</span> <span class="kt">error:</span> <span class="k">type</span> <span class="kt">mismatch</span><span class="o">;</span>
<span class="n">found</span> <span class="k">:</span> <span class="kt">String</span><span class="o">(</span><span class="err">"</span><span class="kt">hi</span> <span class="kt">there</span><span class="err">"</span><span class="o">)</span>
<span class="kt">required:</span> <span class="kt">hcavsc.subst.Labels.Label</span>
<span class="o">(</span><span class="n">which</span> <span class="n">expands</span> <span class="n">to</span><span class="o">)</span> <span class="n">hcavsc</span><span class="o">.</span><span class="n">subst</span><span class="o">.</span><span class="nc">Labels</span><span class="o">.</span><span class="nc">Label</span><span class="o">.</span><span class="n">T</span>
<span class="s">"hi there"</span><span class="k">:</span> <span class="kt">Label</span>
<span class="o">^</span>
</pre></div>
<p>So far, so good. Then, why not retry some of the earlier experiments
that caused the <code>AnyVal</code>-based label to box?</p>
<div class="codehilite language-scala"><pre><span></span><span class="c1">// javap -c -cp target/scala-2.12/classes hcavsc.subst.MyFirstTests</span>
<span class="k">val</span> <span class="n">fst</span> <span class="k">=</span> <span class="nc">Label</span><span class="o">(</span><span class="s">"hello"</span><span class="o">)</span>
<span class="k">val</span> <span class="n">snd</span> <span class="k">=</span> <span class="nc">Label</span><span class="o">(</span><span class="s">"world"</span><span class="o">)</span>
<span class="n">identity</span><span class="o">(</span><span class="n">fst</span><span class="o">)</span>
<span class="mi">24</span><span class="k">:</span> <span class="kt">getstatic</span> <span class="k">#</span><span class="err">43</span> <span class="c1">// Field scala/Predef$.MODULE$:Lscala/Predef$;</span>
<span class="mi">27</span><span class="k">:</span> <span class="kt">aload_1</span>
<span class="mi">28</span><span class="k">:</span> <span class="kt">invokevirtual</span> <span class="k">#</span><span class="err">47</span> <span class="c1">// Method scala/Predef$.identity:(Ljava/lang/Object;)Ljava/lang/Object;</span>
<span class="o">(</span><span class="n">fst</span><span class="o">,</span> <span class="n">snd</span><span class="o">)</span>
<span class="mi">32</span><span class="k">:</span> <span class="kt">new</span> <span class="k">#</span><span class="err">49</span> <span class="c1">// class scala/Tuple2</span>
<span class="mi">35</span><span class="k">:</span> <span class="kt">dup</span>
<span class="mi">36</span><span class="k">:</span> <span class="kt">aload_1</span>
<span class="mi">37</span><span class="k">:</span> <span class="kt">aload_2</span>
<span class="mi">38</span><span class="k">:</span> <span class="kt">invokespecial</span> <span class="k">#</span><span class="err">53</span> <span class="c1">// Method scala/Tuple2."<init>":(Ljava/lang/Object;Ljava/lang/Object;)V</span>
<span class="k">val</span> <span class="n">lbls</span> <span class="k">=</span> <span class="nc">List</span><span class="o">(</span><span class="n">fst</span><span class="o">,</span> <span class="n">snd</span><span class="o">)</span>
<span class="mi">48</span><span class="k">:</span> <span class="kt">iconst_2</span>
<span class="mi">49</span><span class="k">:</span> <span class="kt">anewarray</span> <span class="k">#</span><span class="err">4</span> <span class="c1">// class java/lang/Object</span>
<span class="mi">52</span><span class="k">:</span> <span class="kt">dup</span>
<span class="mi">53</span><span class="k">:</span> <span class="kt">iconst_0</span>
<span class="mi">54</span><span class="k">:</span> <span class="kt">aload_1</span>
<span class="mi">55</span><span class="k">:</span> <span class="kt">aastore</span>
<span class="mi">56</span><span class="k">:</span> <span class="kt">dup</span>
<span class="mi">57</span><span class="k">:</span> <span class="kt">iconst_1</span>
<span class="mi">58</span><span class="k">:</span> <span class="kt">aload_2</span>
<span class="mi">59</span><span class="k">:</span> <span class="kt">aastore</span>
<span class="mi">60</span><span class="k">:</span> <span class="kt">invokevirtual</span> <span class="k">#</span><span class="err">62</span> <span class="c1">// Method scala/Predef$.genericWrapArray:(Ljava/lang/Object;)Lscala/collection/mutable/WrappedArray;</span>
<span class="mi">63</span><span class="k">:</span> <span class="kt">invokevirtual</span> <span class="k">#</span><span class="err">65</span> <span class="c1">// Method scala/collection/immutable/List$.apply:(Lscala/collection/Seq;)Lscala/collection/immutable/List;</span>
<span class="n">lbls</span><span class="o">.</span><span class="n">map</span><span class="o">{</span><span class="n">x</span> <span class="k">=></span> <span class="nc">Label</span><span class="o">(</span><span class="nc">Label</span><span class="o">.</span><span class="n">unwrap</span><span class="o">(</span><span class="n">x</span><span class="o">)</span> <span class="o">+</span> <span class="s">"Aux"</span><span class="o">)}</span>
<span class="n">public</span> <span class="n">static</span> <span class="k">final</span> <span class="n">java</span><span class="o">.</span><span class="n">lang</span><span class="o">.</span><span class="nc">Object</span> <span class="nc">$anonfun$printLabels$1</span><span class="o">(</span><span class="n">java</span><span class="o">.</span><span class="n">lang</span><span class="o">.</span><span class="nc">Object</span><span class="o">);</span>
<span class="nc">Code</span><span class="k">:</span>
<span class="err">0</span><span class="kt">:</span> <span class="kt">getstatic</span> <span class="k">#</span><span class="err">26</span> <span class="c1">// Field hcavsc/subst/Labels$.MODULE$:Lhcavsc/subst/Labels$;</span>
<span class="mi">3</span><span class="k">:</span> <span class="kt">invokevirtual</span> <span class="k">#</span><span class="err">30</span> <span class="c1">// Method hcavsc/subst/Labels$.Label:()Lhcavsc/subst/Labels$LabelImpl;</span>
<span class="mi">6</span><span class="k">:</span> <span class="kt">new</span> <span class="k">#</span><span class="err">104</span> <span class="c1">// class java/lang/StringBuilder</span>
<span class="mi">9</span><span class="k">:</span> <span class="kt">dup</span>
<span class="mi">10</span><span class="k">:</span> <span class="kt">invokespecial</span> <span class="k">#</span><span class="err">106</span> <span class="c1">// Method java/lang/StringBuilder."<init>":()V</span>
<span class="mi">13</span><span class="k">:</span> <span class="kt">getstatic</span> <span class="k">#</span><span class="err">26</span> <span class="c1">// Field hcavsc/subst/Labels$.MODULE$:Lhcavsc/subst/Labels$;</span>
<span class="mi">16</span><span class="k">:</span> <span class="kt">invokevirtual</span> <span class="k">#</span><span class="err">30</span> <span class="c1">// Method hcavsc/subst/Labels$.Label:()Lhcavsc/subst/Labels$LabelImpl;</span>
<span class="mi">19</span><span class="k">:</span> <span class="kt">aload_0</span>
<span class="mi">20</span><span class="k">:</span> <span class="kt">invokevirtual</span> <span class="k">#</span><span class="err">110</span> <span class="c1">// Method hcavsc/subst/Labels$LabelImpl.unwrap:(Ljava/lang/Object;)Ljava/lang/String;</span>
<span class="mi">23</span><span class="k">:</span> <span class="kt">invokevirtual</span> <span class="k">#</span><span class="err">114</span> <span class="c1">// Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;</span>
<span class="mi">26</span><span class="k">:</span> <span class="kt">ldc</span> <span class="k">#</span><span class="err">116</span> <span class="c1">// String Aux</span>
<span class="mi">28</span><span class="k">:</span> <span class="kt">invokevirtual</span> <span class="k">#</span><span class="err">114</span> <span class="c1">// Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;</span>
<span class="mi">31</span><span class="k">:</span> <span class="kt">invokevirtual</span> <span class="k">#</span><span class="err">120</span> <span class="c1">// Method java/lang/StringBuilder.toString:()Ljava/lang/String;</span>
<span class="mi">34</span><span class="k">:</span> <span class="kt">invokevirtual</span> <span class="k">#</span><span class="err">36</span> <span class="c1">// Method hcavsc/subst/Labels$LabelImpl.apply:(Ljava/lang/String;)Ljava/lang/Object;</span>
<span class="mi">37</span><span class="k">:</span> <span class="kt">areturn</span>
</pre></div>
<p>No allocation there. Hmm. Well, maybe our concrete <code>LabelImpl</code>
instance is doing some secret boxing?</p>
<div class="codehilite language-java"><pre><span></span><span class="c1">// javap -c -cp target/scala-2.12/classes 'hcavsc.subst.Labels$$anon$1'</span>
<span class="kd">public</span> <span class="n">java</span><span class="o">.</span><span class="na">lang</span><span class="o">.</span><span class="na">String</span> <span class="nf">apply</span><span class="o">(</span><span class="n">java</span><span class="o">.</span><span class="na">lang</span><span class="o">.</span><span class="na">String</span><span class="o">);</span>
<span class="n">Code</span><span class="o">:</span>
<span class="mi">0</span><span class="o">:</span> <span class="n">aload_1</span>
<span class="mi">1</span><span class="o">:</span> <span class="n">areturn</span>
<span class="kd">public</span> <span class="n">java</span><span class="o">.</span><span class="na">lang</span><span class="o">.</span><span class="na">Object</span> <span class="nf">apply</span><span class="o">(</span><span class="n">java</span><span class="o">.</span><span class="na">lang</span><span class="o">.</span><span class="na">String</span><span class="o">);</span>
<span class="n">Code</span><span class="o">:</span>
<span class="mi">0</span><span class="o">:</span> <span class="n">aload_0</span>
<span class="mi">1</span><span class="o">:</span> <span class="n">aload_1</span>
<span class="mi">2</span><span class="o">:</span> <span class="n">invokevirtual</span> <span class="err">#</span><span class="mi">27</span> <span class="c1">// Method apply:(Ljava/lang/String;)Ljava/lang/String;</span>
<span class="mi">5</span><span class="o">:</span> <span class="n">areturn</span>
<span class="kd">public</span> <span class="n">java</span><span class="o">.</span><span class="na">lang</span><span class="o">.</span><span class="na">String</span> <span class="nf">unwrap</span><span class="o">(</span><span class="n">java</span><span class="o">.</span><span class="na">lang</span><span class="o">.</span><span class="na">String</span><span class="o">);</span>
<span class="n">Code</span><span class="o">:</span>
<span class="mi">0</span><span class="o">:</span> <span class="n">aload_1</span>
<span class="mi">1</span><span class="o">:</span> <span class="n">areturn</span>
<span class="kd">public</span> <span class="n">java</span><span class="o">.</span><span class="na">lang</span><span class="o">.</span><span class="na">String</span> <span class="nf">unwrap</span><span class="o">(</span><span class="n">java</span><span class="o">.</span><span class="na">lang</span><span class="o">.</span><span class="na">Object</span><span class="o">);</span>
<span class="n">Code</span><span class="o">:</span>
<span class="mi">0</span><span class="o">:</span> <span class="n">aload_0</span>
<span class="mi">1</span><span class="o">:</span> <span class="n">aload_1</span>
<span class="mi">2</span><span class="o">:</span> <span class="n">checkcast</span> <span class="err">#</span><span class="mi">21</span> <span class="c1">// class java/lang/String</span>
<span class="mi">5</span><span class="o">:</span> <span class="n">invokevirtual</span> <span class="err">#</span><span class="mi">23</span> <span class="c1">// Method unwrap:(Ljava/lang/String;)Ljava/lang/String;</span>
<span class="mi">8</span><span class="o">:</span> <span class="n">areturn</span>
</pre></div>
<p>No boxing there. That makes sense; in that context, <code>Label</code> <em>is</em>
<code>String</code>; the fact that our <code>Label</code>-using code doesn’t know that is
irrelevant, because we hid that information using existential types.</p>
<p>So, it <em>is</em> possible to have a <code>newtype</code> mechanism that doesn’t
box. You don’t have to wait for the JVM to deliver
<a href="http://openjdk.java.net/jeps/169">its own brand of value types</a>; you
can even implement it yourself, in Scala, today. The <code>scalac</code> designers
must have had another reason for all this boxing, because “we have to
because JVM” is contradicted by the behavior of Scala-JVM itself.</p>
<p>You aren’t sure what those reasons are, but you decide to port the
rest of your code to use the existential <code>Label</code>. Befitting an unboxed
newtype, the runtime representation of <code>List[Label]</code> is exactly the
same as the underlying <code>List[String]</code>, as well as every <code>Option</code>,
<code>Either</code>, and whatever else you can think up.</p>
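<p>One way to spot-check that claim without reaching for <code>javap</code> is
to compare references at runtime. The following is only a sketch: it
repeats the <code>Labels</code> definition from above so that it stands alone,
and <code>RuntimeRepr</code> and its member names are invented for illustration.</p>
<div class="codehilite language-scala"><pre><span></span>object Labels {
  sealed abstract class LabelImpl {
    type T
    def apply(s: String): T
    def unwrap(lbl: T): String
  }
  // the `: LabelImpl` ascription hides `T = String` from callers
  val Label: LabelImpl = new LabelImpl {
    type T = String
    override def apply(s: String) = s
    override def unwrap(lbl: T) = lbl
  }
  type Label = Label.T
}
import Labels._

// Hypothetical helper object, not part of the article's code.
object RuntimeRepr {
  val raw = "hello"
  val lbl: Label = Label(raw)
  // No wrapper is allocated: the Label is the same reference as the String.
  val sameReference: Boolean = lbl.asInstanceOf[AnyRef] eq raw
  // The runtime class of a Label is plain java.lang.String.
  val runtimeClass: Class[_] = (lbl: Any).getClass
}
</pre></div>
<p>Here <code>RuntimeRepr.sameReference</code> should be <code>true</code> and
<code>RuntimeRepr.runtimeClass</code> should equal <code>classOf[String]</code>,
which matches the <code>javap</code> output shown earlier.</p>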
<p>You notice that the erasure for <code>Label</code> is different, but this seems
significantly less serious than the boxing problem, so you leave it for
now. (We will
<a href="/2017/04/and-glorious-subst-to-come.html#markdown-header-t-string-translucency">dig into related design decisions later</a>.)</p>
<h2 id="markdown-header-what-can-you-do-with-a-box-what-can-you-do-without-a-box">What can you do with a box? What can you do without a box?</h2>
<p>Let’s start with a quick comparison of boxing <code>AnyVal</code> and the “type
tagging” mechanism we’ve just seen.</p>
<table>
<thead>
<tr>
<th>Capability</th>
<th><code>AnyVal</code> subclass</th>
<th>Type tag</th>
</tr>
</thead>
<tbody>
<tr>
<td>Defining methods</td>
<td>normal <code>override</code>; virtual method dispatch available</td>
<td><code>implicit class</code> enrichment only</td>
</tr>
<tr>
<td><code>lbl.getClass</code></td>
<td><code>Label</code></td>
<td><code>String</code></td>
</tr>
<tr>
<td>Cast <code>Any</code> to <code>Label</code></td>
<td>checked at runtime</td>
<td>unchecked; no wrapper left at runtime</td>
</tr>
<tr>
<td><code>isInstanceOf</code></td>
<td>checked at runtime</td>
<td>unchecked; same reason casting doesn’t work</td>
</tr>
<tr>
<td>Adding type parameters to methods</td>
<td>boxing/unboxing penalty</td>
<td>no boxing penalty</td>
</tr>
<tr>
<td>Wrapping a <code>List</code></td>
<td>O(n): box every element and reallocate list itself</td>
<td>O(1), with <code>subst</code>: no allocation, output list <code>eq</code> to input list</td>
</tr>
<tr>
<td>Unwrapping a list</td>
<td>O(n): reallocate list, unbox each element</td>
<td>O(1): <code>eq</code> output with <code>subst</code>. Also possible to make unwrapping a <code><:</code> (free liftable automatic upcast)</td>
</tr>
<tr>
<td>Coinductive type class instances</td>
<td>works; boxing penalty applies</td>
<td>works; no boxing penalty</td>
</tr>
<tr>
<td>Wrapping whole program parts</td>
<td>each function must be wrapped to add per-value wrapping/unwrapping</td>
<td>O(1): just works with <code>subst</code></td>
</tr>
</tbody>
</table>
<p>I detect from this matrix a particular theme: <code>AnyVal</code> subclasses give
up a lot of capability in the type-safe arena. Consider rewriting a
loop that uses <code>Label</code> as state as a <code>foldLeft</code>: you must contend with
a new boxing/unboxing penalty, since the state parameter in a
<code>foldLeft</code> is type-parametric. It’s more fodder for the persistent
higher-order function skeptics among us.</p>
<p>While we know that adding type parameters to our functions improves
type-safety, the skeptic will note the boxing penalty, and attribute
it to parametric polymorphism. But we know the true culprit.</p>
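<p>To make the <code>foldLeft</code> scenario concrete, here is a small sketch
using the type-tagged <code>Label</code>. The state parameter of
<code>foldLeft</code> is instantiated to <code>Label</code>, yet the value that
comes out is the very <code>String</code> reference that went in. The
<code>FoldDemo</code> object and the "longest label" task are made up for this
example; the <code>Labels</code> definition is repeated so the snippet is
self-contained.</p>
<div class="codehilite language-scala"><pre><span></span>object Labels {
  sealed abstract class LabelImpl {
    type T
    def apply(s: String): T
    def unwrap(lbl: T): String
  }
  val Label: LabelImpl = new LabelImpl {
    type T = String
    override def apply(s: String) = s
    override def unwrap(lbl: T) = lbl
  }
  type Label = Label.T
}
import Labels._

// Hypothetical example, not from the article.
object FoldDemo {
  val a = "a"
  val bb = "bb"
  val labels: List[Label] = List(Label(a), Label(bb))
  // Label is the fold's state type; keep whichever label is longer.
  val longest: Label = labels.foldLeft(Label("")) { (acc, l) =>
    if (Label.unwrap(l).length > Label.unwrap(acc).length) l else acc
  }
  // The winner is still the original String object, not a boxed copy.
  val sameReference: Boolean = longest.asInstanceOf[AnyRef] eq bb
}
</pre></div>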
<p>If <code>AnyVal</code> subclassing taxes type-safe programming in these ways,
what is it spending the money on? Simple: support for <code>isInstanceOf</code>,
“safe” casting, implementing interfaces, overriding <code>AnyRef</code> methods
like <code>toString</code>, and the like.</p>
<p>As type-safe, parametrically-polymorphic programmers, we <em>avoid</em> these
features, as a matter of principle and of practice. Some, like checked
casting, are simply not type-safe. Some ruin free theorems, like
<code>toString</code>, and we would prefer <em>safe</em> mechanisms, like the <code>Show</code>
typeclass, to actually tell us <em>at compile time</em> if our programs make
sense. Yet, if we use <code>AnyVal</code> subclasses, we have to pay the price
for all the programmers that wish to write type-unsafe code, like
<code>List[Any] => List[Label]</code>. All is not well in Multiparadigmatic Land.</p>
<h2 id="markdown-header-when-will-our-methods-be-resolved">When will our methods be resolved?</h2>
<p>To showcase the relationship of the two approaches to
runtime-reflective programming versus statically-proven programming,
let’s consider stringification.</p>
<p>Scala provides the <code>toString</code> virtual method on <code>Any</code>. Calling this
method is dynamically resolved on the value itself; it is as if every
value must carry around a pointer to a function that, given itself,
returns a <code>String</code>. We can define this for our original <code>AnyVal</code>-based
<code>Label</code>, and so <code>toString</code> on <code>List</code> et al will also work.</p>
<div class="codehilite language-scala"><pre><span></span><span class="c1">// add to class Label</span>
<span class="k">override</span> <span class="k">def</span> <span class="n">toString</span> <span class="k">=</span> <span class="s">s"Label(</span><span class="si">$str</span><span class="s">)"</span>
<span class="n">scala</span><span class="o">></span> <span class="nc">List</span><span class="o">(</span><span class="n">fst</span><span class="o">,</span> <span class="n">snd</span><span class="o">).</span><span class="n">toString</span>
<span class="n">res0</span><span class="k">:</span> <span class="kt">String</span> <span class="o">=</span> <span class="nc">List</span><span class="o">(</span><span class="nc">Label</span><span class="o">(</span><span class="n">hello</span><span class="o">),</span> <span class="nc">Label</span><span class="o">(</span><span class="n">world</span><span class="o">))</span>
<span class="n">scala</span><span class="o">></span> <span class="nc">Some</span><span class="o">(</span><span class="n">fst</span><span class="o">).</span><span class="n">toString</span>
<span class="n">res1</span><span class="k">:</span> <span class="kt">String</span> <span class="o">=</span> <span class="nc">Some</span><span class="o">(</span><span class="nc">Label</span><span class="o">(</span><span class="n">hello</span><span class="o">))</span>
</pre></div>
<p>Moreover, this “works” even for the type <code>List[Any]</code>.</p>
<div class="codehilite language-scala"><pre><span></span><span class="n">scala</span><span class="o">></span> <span class="nc">List</span><span class="o">[</span><span class="kt">Any</span><span class="o">](</span><span class="n">fst</span><span class="o">,</span> <span class="s">"hi"</span><span class="o">).</span><span class="n">toString</span>
<span class="n">res2</span><span class="k">:</span> <span class="kt">String</span> <span class="o">=</span> <span class="nc">List</span><span class="o">(</span><span class="nc">Label</span><span class="o">(</span><span class="n">hello</span><span class="o">),</span> <span class="n">hi</span><span class="o">)</span>
</pre></div>
<p>You cannot override <code>toString</code> for our fully-erased <code>Label</code>. After
all, every <code>Label</code> is just a <code>String</code> at runtime!
(<a href="http://typelevel.org/blog/2017/02/13/more-types-than-classes.html">Different types, same class.</a>)</p>
<p>However, the type-safe programmer will recognize <code>List[Any]</code> as a type
that, if it occurs in her program, means “something has gone wrong
with this program”. Moreover, because <code>toString</code> doesn’t make sense
for all types, we use a <em>static</em> mechanism,
like
<a href="https://github.com/scalaz/scalaz/blob/v7.2.10/core/src/main/scala/scalaz/Show.scala">the <code>scalaz.Show</code> typeclass</a>. And
this works fine for <code>Label</code>, because it is statically resolved by
type, not dependent on an implicit runtime member of every <code>Label</code>; in
fact, it can only work <em>because</em> it is static!</p>
<div class="codehilite language-scala"><pre><span></span><span class="c1">// add to object Labels</span>
<span class="k">import</span> <span class="nn">scalaz.Show</span>
<span class="k">implicit</span> <span class="k">val</span> <span class="n">showLabel</span><span class="k">:</span> <span class="kt">Show</span><span class="o">[</span><span class="kt">Label</span><span class="o">]</span> <span class="k">=</span>
<span class="nc">Show</span> <span class="n">shows</span> <span class="o">{</span><span class="n">lbl</span> <span class="k">=></span>
<span class="s">s"Label(</span><span class="si">${</span><span class="nc">Label</span><span class="o">.</span><span class="n">unwrap</span><span class="o">(</span><span class="n">lbl</span><span class="o">)</span><span class="si">}</span><span class="s">)"</span><span class="o">}</span>
<span class="n">scala</span><span class="o">></span> <span class="k">import</span> <span class="nn">scalaz.syntax.show._</span><span class="o">,</span> <span class="n">scalaz</span><span class="o">.</span><span class="n">std</span><span class="o">.</span><span class="n">list</span><span class="o">.</span><span class="k">_</span><span class="o">,</span>
<span class="n">scalaz</span><span class="o">.</span><span class="n">std</span><span class="o">.</span><span class="n">option</span><span class="o">.</span><span class="k">_</span>
<span class="n">scala</span><span class="o">></span> <span class="nc">List</span><span class="o">(</span><span class="n">fst</span><span class="o">,</span> <span class="n">snd</span><span class="o">).</span><span class="n">shows</span>
<span class="n">res1</span><span class="k">:</span> <span class="kt">String</span> <span class="o">=</span> <span class="o">[</span><span class="kt">Label</span><span class="o">(</span><span class="kt">hello</span><span class="o">)</span>,<span class="kt">Label</span><span class="o">(</span><span class="kt">world</span><span class="o">)]</span>
<span class="n">scala</span><span class="o">></span> <span class="n">some</span><span class="o">(</span><span class="n">fst</span><span class="o">).</span><span class="n">shows</span>
<span class="n">res2</span><span class="k">:</span> <span class="kt">String</span> <span class="o">=</span> <span class="nc">Some</span><span class="o">(</span><span class="nc">Label</span><span class="o">(</span><span class="n">hello</span><span class="o">))</span>
</pre></div>
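<p>If you would rather not pull in Scalaz just to see the idea, the same
static resolution can be sketched with a hand-rolled type class. The
<code>Show</code> trait below is a hypothetical, stripped-down stand-in and not
<code>scalaz.Show</code> itself; only the list output format is borrowed from
the session above.</p>
<div class="codehilite language-scala"><pre><span></span>// Hypothetical minimal type class standing in for scalaz.Show.
trait Show[A] { def shows(a: A): String }
object Show {
  def apply[A](implicit ev: Show[A]): Show[A] = ev
  // Derived instance: selected at compile time from the element type alone.
  implicit def showList[A](implicit ev: Show[A]): Show[List[A]] =
    new Show[List[A]] {
      def shows(as: List[A]): String =
        as.map(a => ev.shows(a)).mkString("[", ",", "]")
    }
}

object Labels {
  sealed abstract class LabelImpl {
    type T
    def apply(s: String): T
    def unwrap(lbl: T): String
  }
  val Label: LabelImpl = new LabelImpl {
    type T = String
    override def apply(s: String) = s
    override def unwrap(lbl: T) = lbl
  }
  type Label = Label.T
  // Chosen by static type; nothing is stored on the value itself.
  implicit val showLabel: Show[Label] = new Show[Label] {
    def shows(lbl: Label): String = s"Label(${Label.unwrap(lbl)})"
  }
}
import Labels._

object ShowDemo {
  val rendered: String =
    Show[List[Label]].shows(List(Label("hello"), Label("world")))
}
</pre></div>
<p><code>ShowDemo.rendered</code> comes out as
<code>[Label(hello),Label(world)]</code>, even though each element is a bare
<code>String</code> at runtime.</p>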
<p>So if you are doing this kind of programming, it doesn’t matter
that you can’t override <code>toString</code>, type test, etc.; you weren’t
doing it anyway. But, aside from a little performance bump, what do
you gain from unboxed type-tagging?</p>
<h2 id="markdown-header-when-is-a-label-a-string-when-is-it-not">When is a <code>Label</code> a <code>String</code>? When is it not?</h2>
<p>You notice that <code>subst</code> is at the foundation of several Scalaz
constructs like <code>Leibniz</code> and <code>Liskov</code>, and plays a prominent role in
the <code>Tag</code> API as well. You decide to add this to your <code>LabelImpl</code> as
well.</p>
<div class="codehilite language-scala"><pre><span></span><span class="c1">// in LabelImpl</span>
<span class="k">def</span> <span class="n">subst</span><span class="o">[</span><span class="kt">F</span><span class="o">[</span><span class="k">_</span><span class="o">]](</span><span class="n">fs</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">String</span><span class="o">])</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">T</span><span class="o">]</span>
<span class="c1">// and in val Label</span>
<span class="k">override</span> <span class="k">def</span> <span class="n">subst</span><span class="o">[</span><span class="kt">F</span><span class="o">[</span><span class="k">_</span><span class="o">]](</span><span class="n">fs</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">String</span><span class="o">])</span> <span class="k">=</span> <span class="n">fs</span>
</pre></div>
<p>It’s interesting that you can use this to tag a whole <code>List[String]</code>
in constant time:</p>
<div class="codehilite language-scala"><pre><span></span><span class="n">scala</span><span class="o">></span> <span class="k">val</span> <span class="n">taggedList</span> <span class="k">=</span> <span class="nc">Label</span><span class="o">.</span><span class="n">subst</span><span class="o">(</span><span class="nc">List</span><span class="o">(</span><span class="s">"hello"</span><span class="o">,</span> <span class="s">"world"</span><span class="o">))</span>
<span class="n">taggedList</span><span class="k">:</span> <span class="kt">List</span><span class="o">[</span><span class="kt">Label.T</span><span class="o">]</span> <span class="k">=</span> <span class="nc">List</span><span class="o">(</span><span class="n">hello</span><span class="o">,</span> <span class="n">world</span><span class="o">)</span>
</pre></div>
<p>It’s <em>also</em> interesting that you can use this to <em>untag</em> a whole list
in constant time.</p>
<div class="codehilite language-scala"><pre><span></span><span class="n">scala</span><span class="o">></span> <span class="nc">Label</span><span class="o">.</span><span class="n">subst</span><span class="o">[</span><span class="kt">Lambda</span><span class="o">[</span><span class="kt">x</span> <span class="k">=></span> <span class="kt">List</span><span class="o">[</span><span class="kt">x</span><span class="o">]</span> <span class="k">=></span> <span class="kt">List</span><span class="o">[</span><span class="kt">String</span><span class="o">]]](</span><span class="n">identity</span><span class="o">)(</span><span class="n">taggedList</span><span class="o">)</span>
<span class="n">res0</span><span class="k">:</span> <span class="kt">List</span><span class="o">[</span><span class="kt">String</span><span class="o">]</span> <span class="k">=</span> <span class="nc">List</span><span class="o">(</span><span class="n">hello</span><span class="o">,</span> <span class="n">world</span><span class="o">)</span>
</pre></div>
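<p>The <code>Lambda[x => ...]</code> syntax in that session appears to come from
the kind-projector compiler plugin. As a sketch of the same round trip
without the plugin, a named type alias can play the role of the type
lambda. <code>SubstLists</code> and <code>Untag</code> are names invented here, and
the <code>Labels</code> definition (now including <code>subst</code>) is repeated so
the snippet stands alone.</p>
<div class="codehilite language-scala"><pre><span></span>object Labels {
  sealed abstract class LabelImpl {
    type T
    def apply(s: String): T
    def unwrap(lbl: T): String
    def subst[F[_]](fs: F[String]): F[T]
  }
  val Label: LabelImpl = new LabelImpl {
    type T = String
    override def apply(s: String) = s
    override def unwrap(lbl: T) = lbl
    override def subst[F[_]](fs: F[String]) = fs
  }
  type Label = Label.T
}
import Labels._

// Hypothetical example object.
object SubstLists {
  // Plain alias in place of Lambda[x => List[x] => List[String]].
  type Untag[X] = List[X] => List[String]

  val raw: List[String] = List("hello", "world")
  // Tag the whole list at once; no element is touched.
  val tagged: List[Label] = Label.subst[List](raw)
  // Turn the identity function on List[String] into an untagging function.
  val untag: List[Label] => List[String] = Label.subst[Untag](xs => xs)
  val back: List[String] = untag(tagged)
}
</pre></div>
<p>Both <code>SubstLists.tagged eq SubstLists.raw</code> and
<code>SubstLists.back eq SubstLists.raw</code> hold, so neither direction
allocates a new list.</p>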
<p>Functions and typeclass instances can be tagged or untagged, too.</p>
<div class="codehilite language-scala"><pre><span></span><span class="n">scala</span><span class="o">></span> <span class="nc">Label</span><span class="o">.</span><span class="n">subst</span><span class="o">[</span><span class="kt">Lambda</span><span class="o">[</span><span class="kt">x</span> <span class="k">=></span> <span class="o">(</span><span class="kt">x</span>, <span class="kt">Int</span><span class="o">)</span> <span class="k">=></span> <span class="kt">x</span><span class="o">]](</span><span class="k">_</span> <span class="n">substring</span> <span class="k">_</span><span class="o">)</span>
<span class="n">res1</span><span class="k">:</span> <span class="o">(</span><span class="kt">Label.T</span><span class="o">,</span> <span class="kt">Int</span><span class="o">)</span> <span class="k">=></span> <span class="nc">Label</span><span class="o">.</span><span class="n">T</span> <span class="k">=</span> <span class="nc">$$Lambda$3194</span><span class="o">/</span><span class="mi">964109489</span><span class="k">@</span><span class="mi">72557</span><span class="n">d64</span>
<span class="n">scala</span><span class="o">></span> <span class="k">import</span> <span class="nn">scalaz.Monoid</span><span class="o">,</span> <span class="n">scalaz</span><span class="o">.</span><span class="n">std</span><span class="o">.</span><span class="n">string</span><span class="o">.</span><span class="k">_</span>
<span class="n">scala</span><span class="o">></span> <span class="nc">Label</span><span class="o">.</span><span class="n">subst</span><span class="o">(</span><span class="nc">Monoid</span><span class="o">[</span><span class="kt">String</span><span class="o">])</span>
<span class="n">res3</span><span class="k">:</span> <span class="kt">scalaz.Monoid</span><span class="o">[</span><span class="kt">Label.T</span><span class="o">]</span> <span class="k">=</span> <span class="n">scalaz</span><span class="o">.</span><span class="n">std</span><span class="o">.</span><span class="nc">StringInstances$stringInstance</span><span class="n">$</span><span class="k">@</span><span class="mi">252798</span><span class="n">fe</span>
</pre></div>
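<p>A plugin-free, Scalaz-free sketch of the same two moves follows. The
<code>Monoid</code> trait here is a hypothetical two-method stand-in for
<code>scalaz.Monoid</code>, and <code>TagFunctions</code> and <code>Trunc</code> are
illustrative names only.</p>
<div class="codehilite language-scala"><pre><span></span>object Labels {
  sealed abstract class LabelImpl {
    type T
    def apply(s: String): T
    def unwrap(lbl: T): String
    def subst[F[_]](fs: F[String]): F[T]
  }
  val Label: LabelImpl = new LabelImpl {
    type T = String
    override def apply(s: String) = s
    override def unwrap(lbl: T) = lbl
    override def subst[F[_]](fs: F[String]) = fs
  }
  type Label = Label.T
}
import Labels._

// Hypothetical stand-in for scalaz.Monoid.
trait Monoid[A] {
  def zero: A
  def append(x: A, y: A): A
}

object TagFunctions {
  // Alias in place of Lambda[x => (x, Int) => x].
  type Trunc[X] = (X, Int) => X

  val truncString: (String, Int) => String = (s, n) => s.substring(n)
  // The same function object, now typed as working on Labels.
  val truncLabel: (Label, Int) => Label = Label.subst[Trunc](truncString)

  val stringMonoid: Monoid[String] = new Monoid[String] {
    def zero = ""
    def append(x: String, y: String) = x + y
  }
  // The same instance, reused as a Monoid[Label].
  val labelMonoid: Monoid[Label] = Label.subst[Monoid](stringMonoid)
}
</pre></div>
<p>Neither call wraps anything: <code>truncLabel eq truncString</code> and
<code>labelMonoid eq stringMonoid</code> are both <code>true</code>.</p>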
<p>All of this works because <code>subst</code> is <em>really</em> evidence
that,
<a href="http://typelevel.org/blog/2014/07/02/type_equality_to_leibniz.html#leib-power">deep down, <code>String</code> and <code>Label</code> are the same</a>.</p>
<div class="codehilite language-scala"><pre><span></span><span class="n">scala</span><span class="o">></span> <span class="k">import</span> <span class="nn">scalaz.Leibniz</span><span class="o">,</span> <span class="nc">Leibniz</span><span class="o">.{===,</span> <span class="n">refl</span><span class="o">}</span>
<span class="n">scala</span><span class="o">></span> <span class="nc">Label</span><span class="o">.</span><span class="n">subst</span><span class="o">[</span><span class="kt">String</span> <span class="kt">===</span> <span class="kt">?</span><span class="o">](</span><span class="n">refl</span><span class="o">)</span>
<span class="n">res4</span><span class="k">:</span> <span class="kt">Leibniz</span><span class="o">[</span><span class="kt">Nothing</span>,<span class="kt">Any</span>,<span class="kt">String</span>,<span class="kt">Label.T</span><span class="o">]</span> <span class="k">=</span> <span class="n">scalaz</span><span class="o">.</span><span class="nc">Leibniz$$anon$2</span><span class="k">@</span><span class="mi">702</span><span class="n">af12c</span>
</pre></div>
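<p>To see why <code>subst</code> amounts to equality evidence, it may help to
build the evidence type by hand. The <code>Is</code> class below is a
hypothetical, much-reduced stand-in for <code>scalaz.Leibniz</code> (no variance
bounds, one derived operation); <code>Evidence</code> and
<code>FromString</code> are likewise invented names.</p>
<div class="codehilite language-scala"><pre><span></span>object Labels {
  sealed abstract class LabelImpl {
    type T
    def apply(s: String): T
    def unwrap(lbl: T): String
    def subst[F[_]](fs: F[String]): F[T]
  }
  val Label: LabelImpl = new LabelImpl {
    type T = String
    override def apply(s: String) = s
    override def unwrap(lbl: T) = lbl
    override def subst[F[_]](fs: F[String]) = fs
  }
  type Label = Label.T
}
import Labels._

// Hypothetical stand-in for scalaz.Leibniz: evidence that A and B coincide.
sealed abstract class Is[A, B] {
  def subst[F[_]](fa: F[A]): F[B]
  // With F chosen as the identity type constructor, subst converts one value.
  def coerce(a: A): B = subst[Is.Id](a)
}
object Is {
  type Id[X] = X
  // Every type is trivially equal to itself.
  def refl[A]: Is[A, A] = new Is[A, A] {
    def subst[F[_]](fa: F[A]): F[A] = fa
  }
}

object Evidence {
  // Alias in place of the kind-projector form `String === ?`.
  type FromString[X] = Is[String, X]
  // Start from "String is String" and substitute Label on the right.
  val stringIsLabel: Is[String, Label] =
    Label.subst[FromString](Is.refl[String])
  val lbl: Label = stringIsLabel.coerce("hi")
}
</pre></div>
<p>Once you hold <code>stringIsLabel</code>, you can rebuild everything
<code>Label.subst</code> does from it, which is the sense in which the two are
interchangeable.</p>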
<p>Yet, you ran an experiment earlier to prove that you can’t confuse
<code>String</code> and <code>Label</code>; indeed, this still holds true, despite the
presence of <code>subst</code>!</p>
<div class="codehilite language-scala"><pre><span></span><span class="n">scala</span><span class="o">></span> <span class="s">"still a string"</span><span class="k">:</span> <span class="kt">Label</span>
<span class="o"><</span><span class="n">console</span><span class="k">>:</span><span class="mi">21</span><span class="k">:</span> <span class="kt">error:</span> <span class="k">type</span> <span class="kt">mismatch</span><span class="o">;</span>
<span class="n">found</span> <span class="k">:</span> <span class="kt">String</span><span class="o">(</span><span class="err">"</span><span class="kt">still</span> <span class="kt">a</span> <span class="kt">string</span><span class="err">"</span><span class="o">)</span>
<span class="kt">required:</span> <span class="kt">hcavsc.subst.Labels.Label</span>
<span class="o">(</span><span class="n">which</span> <span class="n">expands</span> <span class="n">to</span><span class="o">)</span> <span class="n">hcavsc</span><span class="o">.</span><span class="n">subst</span><span class="o">.</span><span class="nc">Labels</span><span class="o">.</span><span class="nc">Label</span><span class="o">.</span><span class="n">T</span>
<span class="s">"still a string"</span><span class="k">:</span> <span class="kt">Label</span>
<span class="o">^</span>
<span class="n">scala</span><span class="o">></span> <span class="nc">Label</span><span class="o">(</span><span class="s">"still a label"</span><span class="o">)</span><span class="k">:</span> <span class="kt">String</span>
<span class="o"><</span><span class="n">console</span><span class="k">>:</span><span class="mi">21</span><span class="k">:</span> <span class="kt">error:</span> <span class="k">type</span> <span class="kt">mismatch</span><span class="o">;</span>
<span class="n">found</span> <span class="k">:</span> <span class="kt">hcavsc.subst.Labels.Label.T</span>
<span class="n">required</span><span class="k">:</span> <span class="kt">String</span>
<span class="nc">Label</span><span class="o">(</span><span class="s">"still a label"</span><span class="o">)</span><span class="k">:</span> <span class="kt">String</span>
<span class="o">^</span>
</pre></div>
<p>Here’s what’s happening: in a sense, <code>(new Label(_)): (String =>
Label)</code> and <code>(_.str): (Label => String)</code> witness that there’s a
<em>conversion</em> between the two types. <code>subst</code> witnesses that there’s
<em>identical runtime representation</em> between its own two types. You get
to <strong>selectively reveal</strong> this evidence when it makes writing your
program more convenient; the rest of the time, it is hidden.</p>
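<p>A minimal sketch of that arrangement, assuming the <code>LabelImpl</code> encoding used earlier in this article; the <code>unwrap</code> name is an assumption made for illustration. <code>apply</code> and <code>unwrap</code> play the role of the conversion, while <code>subst</code> is the only member that lets a caller exploit the shared representation:</p>

```scala
import scala.language.higherKinds

object Labels {
  sealed abstract class LabelImpl {
    type T
    def apply(s: String): T
    def unwrap(lbl: T): String // hypothetical name for the reverse conversion
    def subst[F[_]](fs: F[String]): F[T]
  }

  // Inside this block T = String is known; outside, it is abstract.
  val Label: LabelImpl = new LabelImpl {
    type T = String
    def apply(s: String): T = s
    def unwrap(lbl: T): String = lbl
    def subst[F[_]](fs: F[String]): F[T] = fs
  }

  type Label = Label.T
}

import Labels._

// The whole list is relabeled at once, with no traversal and no wrapping.
val labels: List[Label] = Label.subst[List](List("a", "b"))
val backAgain: List[String] = labels.map(Label.unwrap)
```

<p>Nothing here makes <code>"a": Label</code> compile; the equality is available only to code that explicitly calls <code>subst</code>.</p>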
<p>But I would like to step one level up: this is a <em>design space</em>, and
<code>subst</code> as we have seen it isn’t appropriate for all designs. As the
author of your own abstract newtypes, you get to choose how much, if
any, of this underlying type equality to reveal.</p>
<h2 id="markdown-header-if-subst-is-the-right-choice">If <code>subst</code> is the right choice</h2>
<p>For various reasons, the above is how Scalaz <code>Tag</code> (<code>@@</code>) is
defined. If you wish these semantics, you might as well throw
everything else away and write</p>
<div class="codehilite language-scala"><pre><span></span><span class="k">sealed</span> <span class="k">trait</span> <span class="nc">LabelTag</span> <span class="c1">// no instances</span>
<span class="k">type</span> <span class="kt">Label</span> <span class="o">=</span> <span class="nc">String</span> <span class="o">@@</span> <span class="nc">LabelTag</span>
<span class="k">val</span> <span class="nc">Label</span> <span class="k">=</span> <span class="nc">Tag</span><span class="o">.</span><span class="n">of</span><span class="o">[</span><span class="kt">LabelTag</span><span class="o">]</span>
</pre></div>
<p>and take advantage of the convenient tools around <code>subst</code> defined in
<code>Tag.Of</code>. But it’s not the only choice! It’s one point in the design
space. To do right by your API users, it’s
worth
<a href="/2017/04/and-glorious-subst-to-come.html">exploring that design space a little more</a>.</p>
<h2 id="markdown-header-type-unsafe-code-isnt-type-safe">Type-unsafe code isn’t type-safe</h2>
<p>Unboxed existential tagging spreads through your codebase. You feel
free to apply it liberally, because you know you aren’t paying the
wrapping costs of <code>AnyVal</code> subclasses; all these new abstraction
layers are pure type-level, and fully erased.</p>
<p>You receive a “bug report” from a fellow developer that this
expression never seems to filter out the non-label <code>String</code>s.</p>
<div class="codehilite language-scala"><pre><span></span><span class="o">(</span><span class="n">xs</span><span class="k">:</span> <span class="kt">List</span><span class="o">[</span><span class="kt">Any</span><span class="o">]).</span><span class="n">collect</span><span class="o">{</span><span class="k">case</span> <span class="n">t</span><span class="k">:</span> <span class="kt">Label</span> <span class="o">=></span> <span class="n">t</span><span class="o">}</span>
<span class="o"><</span><span class="n">console</span><span class="k">>:</span><span class="mi">16</span><span class="k">:</span> <span class="kt">warning:</span> <span class="kt">abstract</span> <span class="k">type</span> <span class="kt">pattern</span>
<span class="n">hcavsc</span><span class="o">.</span><span class="n">translucent</span><span class="o">.</span><span class="nc">Labels</span><span class="o">.</span><span class="nc">Label</span><span class="o">.</span><span class="n">T</span>
<span class="o">(</span><span class="n">the</span> <span class="n">underlying</span> <span class="n">of</span> <span class="n">hcavsc</span><span class="o">.</span><span class="n">translucent</span><span class="o">.</span><span class="nc">Labels</span><span class="o">.</span><span class="nc">Label</span><span class="o">)</span>
<span class="n">is</span> <span class="n">unchecked</span> <span class="n">since</span> <span class="n">it</span> <span class="n">is</span> <span class="n">eliminated</span> <span class="n">by</span> <span class="n">erasure</span>
<span class="o">(</span><span class="n">xs</span><span class="k">:</span> <span class="kt">List</span><span class="o">[</span><span class="kt">Any</span><span class="o">]).</span><span class="n">collect</span><span class="o">{</span><span class="k">case</span> <span class="n">t</span><span class="k">:</span> <span class="kt">Label</span> <span class="o">=></span> <span class="n">t</span><span class="o">}</span>
<span class="o">^</span>
<span class="o"><</span><span class="n">console</span><span class="k">>:</span><span class="mi">16</span><span class="k">:</span> <span class="kt">warning:</span> <span class="kt">The</span> <span class="kt">outer</span> <span class="kt">reference</span>
<span class="n">in</span> <span class="k">this</span> <span class="k">type</span> <span class="kt">test</span> <span class="kt">cannot</span> <span class="kt">be</span> <span class="kt">checked</span> <span class="kt">at</span> <span class="kt">run</span> <span class="kt">time.</span>
<span class="o">(</span><span class="kt">xs:</span> <span class="kt">List</span><span class="o">[</span><span class="kt">Any</span><span class="o">]).</span><span class="n">collect</span><span class="o">{</span><span class="k">case</span> <span class="n">t</span><span class="k">:</span> <span class="kt">Label</span> <span class="o">=></span> <span class="n">t</span><span class="o">}</span>
<span class="o">^</span>
</pre></div>
<p>Your mind
on
<a href="http://typelevel.org/blog/2014/11/10/why_is_adt_pattern_matching_allowed.html">safe pattern matching practice</a>,
you add <code>def unapply(s: String): Option[T]</code> to <code>LabelImpl</code>, counsel
preference for the form <code>case Label(t) => ...</code>, and advise against
ignoring <code>-unchecked</code> warnings.</p>
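<p>As a rough sketch of that extractor, assuming the same <code>LabelImpl</code> shape as before: the validation rule below (a non-empty run of letters) is purely hypothetical, since what counts as a label depends on your own design.</p>

```scala
object Labels {
  sealed abstract class LabelImpl {
    type T
    def apply(s: String): T
    def unapply(s: String): Option[T]
  }

  val Label: LabelImpl = new LabelImpl {
    type T = String
    def apply(s: String): T = s
    // Hypothetical rule: a label is a non-empty run of letters.
    def unapply(s: String): Option[T] =
      if (s.nonEmpty && s.forall(_.isLetter)) Some(s) else None
  }

  type Label = Label.T
}

import Labels._

// The extractor runs real validation code instead of an erased type test.
def labelsOnly(xs: List[String]): List[Label] =
  xs.collect { case Label(t) => t }
```

<p>Unlike <code>case t: Label</code>, this pattern compiles without an unchecked warning, because no runtime type test on <code>T</code> is involved.</p>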
<p>You get another bug report that this always seems to succeed.</p>
<div class="codehilite language-scala"><pre><span></span><span class="o">(</span><span class="n">s</span><span class="k">:</span> <span class="kt">String</span><span class="o">).</span><span class="n">asInstanceOf</span><span class="o">[</span><span class="kt">Label</span><span class="o">]</span>
</pre></div>
<p>Repeating your advice about warnings, you start to wonder, “where is
this kind of code coming from?”</p>
<p>Someone else complains that they want to make <code>T extends Ordered[T]</code>,
and can’t fathom where the code should go. You advise the static
approach of implementing the <code>Ordering</code> typeclass instance instead for
<code>T</code>, wonder how deep the object-orientation hole goes, and forward the
link about the typeclass pattern again, too.</p>
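<p>One way the static approach might look, sketched under the same <code>LabelImpl</code> assumptions: the <code>Ordering</code> instance sits beside the type rather than inside it, and <code>subst</code> derives it from the standard library’s <code>Ordering.String</code>. The name <code>labelOrdering</code> is made up for this example.</p>

```scala
import scala.language.higherKinds

object Labels {
  sealed abstract class LabelImpl {
    type T
    def apply(s: String): T
    def subst[F[_]](fs: F[String]): F[T]
  }

  val Label: LabelImpl = new LabelImpl {
    type T = String
    def apply(s: String): T = s
    def subst[F[_]](fs: F[String]): F[T] = fs
  }

  type Label = Label.T

  // The instance is String's own ordering, relabeled; nothing is wrapped.
  implicit val labelOrdering: Ordering[Label] =
    Label.subst[Ordering](Ordering.String)
}

import Labels._

val sortedLabels: List[Label] = List(Label("pear"), Label("apple")).sorted
```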
<h2 id="markdown-header-suppose-you-went-back-to-anyval">Suppose you went back to <code>AnyVal</code></h2>
<p>We’ve seen that <code>AnyVal</code> subclasses <em>could</em> have been incredibly
cheap, but aren’t, so as to support “features” like checked
casting. Who’s going to foot the bill?</p>
<ol>
<li>Oh, this allocates when passing through polymorphic contexts, but
not monomorphic ones? Avoid polymorphic code.</li>
<li>Oh, this extra type-safety adds all this allocation? Type safety is
expensive at runtime; we need to stick to <code>String</code>.</li>
<li>We can’t do any better; the JVM limits the possibilities. You have
to pay for runtime <em>class</em> wrapping if you want a wrapper <em>type</em>.</li>
</ol>
<p>In this article, I have demonstrated that none of these conclusions
are correct. However, only a tiny minority of Scala practitioners will
ever read this article, and I will not blame the rest for drawing
these seemingly straightforward inferences, ultimately faulty as they
are.</p>
<p>The real cost of <code>AnyVal</code> subclasses is not all the needless memory
allocation. The real cost is the damage to the practice of type-safe
programming in Scala. It’s in all the curious developers who sought to
add a little more type safety to their programs, only to find
themselves penalized by the runtime, once bitten. It’s in the
reinforcement of this attitude towards abstraction that they’ll
continue to carry with them, the next time an opportunity presents
itself. It’s a missed opportunity for pure type-level thinking, all so
that <code>asInstanceOf</code> “works”.</p>
<p><em>See
<a href="/2017/04/and-glorious-subst-to-come.html">“…and the glorious <code>subst</code> to come”</a> for
further development of the ideas in this article.</em></p>
<p><em>This article was tested with Scala 2.12.1, Scalaz 7.2.10, and Kind
Projector 0.9.3. The code
is
<a href="https://code.launchpad.net/~scompall/+junk/high-cost-of-anyval-subclasses">available in compilable form for your own experiments via Bazaar</a>.</em></p>Stephen Compallhttp://www.blogger.com/profile/13346366374080232972noreply@blogger.com19tag:blogger.com,1999:blog-1184549185438247550.post-730584572110145362016-12-03T10:30:00.001-05:002016-12-03T10:32:13.594-05:00Part 3: Working with the abstract F<p><em>This is the third of a four-part series on tagless-final effects.</em></p>
<h2 id="markdown-header-previously">Previously</h2>
<ol>
<li><a href="/2016/12/tagless-final-effects-la-ermine-writers.html">Introduction, motivation, and the core techniques</a>;</li>
<li><a href="/2016/12/part-2-role-of-monad.html">The role of <code>Monad</code></a>.</li>
</ol>
<h2 id="markdown-header-the-freedom-of-erased-abstraction">The freedom of erased abstraction</h2>
<p>There is something supremely elegant about the way values of the <code>F</code>-type flow between effectful programs and their interpreters.</p>
<p>Consider the pair of <code>copy</code> and the <code>IOFulFSAlg</code>.</p>
<ol>
<li>The first <code>F</code> is created by <code>IOFulFSAlg</code>; in fact, the <code>copy</code> method cannot create any on its own. The interpreter knows that <code>F</code> = <code>Function0</code>, so it can use <code>() => ...</code> to create its result.</li>
<li>This value flows to <code>copy</code>. But <code>copy</code> doesn’t know that <code>F</code> = <code>Function0</code>; the value is “actually” callable <code>likeSo()</code>, but that will not compile!</li>
<li><code>copy</code> passes it back to the interpreter’s <code>bind</code>. All of a sudden, callability is back, so it can be implemented!</li>
</ol>
<p>Each time <code>F</code> values cross the boundary between effectful program and interpreter, this knowledge appears and disappears in exactly the way that guides us to keep effectful programs properly abstract, that is, agnostic to the representation of the effects. </p>
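<p>The round trip can be sketched in a few lines. This assumes <code>FSAlg</code> has roughly the shape built up in the earlier parts; <code>DelayedFSAlg</code> is a hypothetical stand-in for <code>IOFulFSAlg</code> that mutates an in-memory map instead of touching the file system, so the example stays self-contained.</p>

```scala
import scala.language.higherKinds
import scala.collection.mutable

trait FSAlg {
  type F[A]
  def readFile(name: String): F[String]
  def writeFile(name: String, contents: String): F[Unit]
  def bind[A, B](first: F[A])(next: A => F[B]): F[B]
}

// Hypothetical interpreter: F = Function0, side effects hit a mutable map.
object DelayedFSAlg extends FSAlg {
  type F[A] = () => A
  val files = mutable.Map("a.txt" -> "hello")
  def readFile(name: String): F[String] = () => files(name)
  def writeFile(name: String, contents: String): F[Unit] =
    () => { files(name) = contents }
  // Here F is known to be callable, so bind can be written.
  def bind[A, B](first: F[A])(next: A => F[B]): F[B] =
    () => next(first())()
}

def copy(source: String, dest: String, alg: FSAlg): alg.F[Unit] = {
  // alg.readFile(source)() // would not compile: alg.F is abstract here
  alg.bind(alg.readFile(source))(contents => alg.writeFile(dest, contents))
}

val program = copy("a.txt", "b.txt", DelayedFSAlg)
val writtenBeforeRun = DelayedFSAlg.files.contains("b.txt")
program() // callable again, because the caller picked the interpreter
```

<p>Only the code that chose <code>DelayedFSAlg</code> sees a <code>Function0</code>; inside <code>copy</code> the same value is an opaque <code>alg.F[Unit]</code>.</p>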
<p>The way that representation appears and disappears in just the right places is a hallmark of parametric polymorphism. By contrast, consider “hiding behind a class’s public interface”, a hallmark of the object-oriented polymorphic way of thinking:</p>
<ol>
<li>If the interpreter is embedded directly within the <code>F</code> class, then it can only safely work with exactly one <code>F</code>, the receiver. <code>bind</code> and such must be implemented with casting, which is by definition unsafe. </li>
<li>If the interpreter is separate, it must cast every <code>F</code>. </li>
</ol>
<p>Regardless, the presence of a class implies manual wrapping and unwrapping, to apply the Adapter pattern; in Scala, this is a runtime cost. Even <code>AnyVal</code> subclasses box quite readily. </p>
<p>In Scala, we can readily observe that there is no runtime wrapper by unsafely casting the results of algebra methods.</p>
<div class="codehilite"><pre><span></span><span class="k">def</span> <span class="n">unsafeRead</span><span class="o">(</span><span class="n">source</span><span class="k">:</span> <span class="kt">String</span><span class="o">,</span> <span class="n">alg</span><span class="k">:</span> <span class="kt">FSAlg</span><span class="o">)</span><span class="k">:</span> <span class="kt">String</span> <span class="o">=</span>
<span class="n">alg</span><span class="o">.</span><span class="n">readFile</span><span class="o">(</span><span class="n">source</span><span class="o">)</span>
<span class="o">.</span><span class="n">asInstanceOf</span><span class="o">[()</span> <span class="k">=></span> <span class="kt">String</span><span class="o">]</span>
<span class="o">.</span><span class="n">apply</span><span class="o">()</span>
</pre></div>
<p>If we pass our <code>IOFulFSAlg</code> to this function, it will work! The <code>F</code> is really (!) what the interpreter thinks, a <code>Function0</code>. </p>
<p>However, if we pass the test interpreter, it will just crash. Effectful programs can only do this by cheating the interpreter out of its job; honest programs do not do this. </p>
<p>I explained all this to demonstrate that tagless-final relies on a purely type-level form of abstraction. It cannot be meaningfully enforced without a type checker with parametric polymorphism. If you do not have parametric polymorphism, it is difficult to say that abstraction is happening at all; it will certainly be extremely difficult for a programmer unversed in effect algebras to stick to the abstract interface, without the aid of enforcement. </p>
<p>In Scala, there’s a “hole” in this abstraction, demonstrated by the partially-checked cast to <code>() => String</code> above. Scala permits uncontrolled type tests in its pattern matching, not just <a href="http://typelevel.org/blog/2014/11/10/why_is_adt_pattern_matching_allowed.html#adts-use-type-tests">those useful for ADT pattern matching</a>, so it is also possible to use this to violate the abstraction even further. </p>
<div class="codehilite"><pre><span></span><span class="k">def</span> <span class="n">andInATest</span><span class="o">(</span><span class="n">alg</span><span class="k">:</span> <span class="kt">FSAlg</span><span class="o">)</span><span class="k">:</span> <span class="kt">Boolean</span> <span class="o">=</span>
<span class="n">alg</span><span class="o">.</span><span class="n">readFile</span><span class="o">(</span><span class="s">"irrelevant"</span><span class="o">)</span> <span class="k">match</span> <span class="o">{</span>
<span class="k">case</span> <span class="k">_:</span> <span class="kt">Function0</span><span class="o">[</span><span class="k">_</span><span class="o">]</span> <span class="k">=></span> <span class="kc">false</span>
<span class="k">case</span> <span class="k">_:</span> <span class="kt">Function1</span><span class="o">[</span><span class="k">_</span>, <span class="k">_</span><span class="o">]</span> <span class="k">=></span> <span class="kc">true</span>
<span class="o">}</span>
</pre></div>
<p>Parametricity does not let us determine more about <code>F</code> than that explicitly provided in <code>FSAlg</code>; the <code>alg</code> certainly did not supply information about how to break the abstraction. </p>
<p>This is why “runtime type information” and “reified generics” are not benign features. With them, I’ve lost the absolute guarantee that the effectful program isn’t breaking the purely type-level abstraction. </p>
<p>Luckily, the compiler doesn’t encourage this sort of thing. In the <a href="https://imgur.com/a04WoHn">Scalazzi Safe Scala Subset</a> we go further and make it a rule, forbidding the use of type tests. Thus, full abstraction is restored.</p>
<h2 id="markdown-header-is-the-type-parameter-really-necessary">Is the type parameter really necessary?</h2>
<p>The presence of a type parameter on the abstract type <code>F</code>—making <code>F</code> a “higher-kinded type”—gets in the way of implementing this in Java. Perhaps the best way to see why this type parameter is so important is to see a case where it is not. </p>
<p>Java programmers confronted with a constellation of methods that produce substrings will often, “for performance”, pass around a <code>StringBuilder</code> or <code>Writer</code> as an argument, changing all the functions into <code>void</code>-returning mutations. </p>
<p>Tagless-final style offers a far more elegant way to get this runtime optimization.</p>
<div class="codehilite"><pre><span></span><span class="k">trait</span> <span class="nc">StrAlg</span> <span class="o">{</span>
<span class="k">type</span> <span class="kt">S</span>
<span class="k">def</span> <span class="n">fromString</span><span class="o">(</span><span class="n">str</span><span class="k">:</span> <span class="kt">String</span><span class="o">)</span><span class="k">:</span> <span class="kt">S</span>
<span class="k">def</span> <span class="n">append</span><span class="o">(</span><span class="n">l</span><span class="k">:</span> <span class="kt">S</span><span class="o">,</span> <span class="n">r</span><span class="k">:</span> <span class="kt">S</span><span class="o">)</span><span class="k">:</span> <span class="kt">S</span>
<span class="o">}</span>
<span class="k">object</span> <span class="nc">SBStrAlg</span> <span class="k">extends</span> <span class="nc">StrAlg</span> <span class="o">{</span>
<span class="k">type</span> <span class="kt">S</span> <span class="o">=</span> <span class="nc">StringBuilder</span> <span class="k">=></span> <span class="nc">Unit</span>
<span class="k">def</span> <span class="n">fromString</span><span class="o">(</span><span class="n">str</span><span class="k">:</span> <span class="kt">String</span><span class="o">)</span> <span class="k">=</span>
<span class="n">sb</span> <span class="k">=></span> <span class="n">sb</span><span class="o">.</span><span class="n">append</span><span class="o">(</span><span class="n">str</span><span class="o">)</span>
<span class="k">def</span> <span class="n">append</span><span class="o">(</span><span class="n">l</span><span class="k">:</span> <span class="kt">S</span><span class="o">,</span> <span class="n">r</span><span class="k">:</span> <span class="kt">S</span><span class="o">)</span> <span class="k">=</span>
<span class="n">sb</span> <span class="k">=></span> <span class="o">{</span>
<span class="n">l</span><span class="o">(</span><span class="n">sb</span><span class="o">)</span>
<span class="n">r</span><span class="o">(</span><span class="n">sb</span><span class="o">)</span>
<span class="o">}</span>
<span class="o">}</span>
<span class="k">def</span> <span class="n">numbers</span><span class="o">(</span><span class="n">alg</span><span class="k">:</span> <span class="kt">StrAlg</span><span class="o">)</span><span class="k">:</span> <span class="kt">alg.S</span> <span class="o">=</span>
<span class="o">(</span><span class="mi">1</span> <span class="n">to</span> <span class="mi">100</span><span class="o">).</span><span class="n">foldLeft</span><span class="o">(</span><span class="n">alg</span><span class="o">.</span><span class="n">fromString</span><span class="o">(</span><span class="s">""</span><span class="o">)){</span>
<span class="o">(</span><span class="n">acc</span><span class="o">,</span> <span class="n">m</span><span class="o">)</span> <span class="k">=></span> <span class="n">alg</span><span class="o">.</span><span class="n">append</span><span class="o">(</span><span class="n">acc</span><span class="o">,</span> <span class="n">alg</span><span class="o">.</span><span class="n">fromString</span><span class="o">(</span><span class="n">m</span><span class="o">.</span><span class="n">toString</span><span class="o">))</span>
<span class="o">}</span>
</pre></div>
<p>And so <code>numbers</code> is freed from the admonition not to iteratively concatenate <code>String</code>s, even if you are too lazy to implement the more efficient interpreter later! We also have this nice fusion property: <code>numbers</code> is fully decoupled from what we do with its results, even if we arrange for <code>fromString</code> to write to an exotic output stream of some sort. </p>
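<p>To see the decoupling at work, here is a sketch that runs <code>numbers</code> against two interpreters. <code>PlainStrAlg</code> and <code>render</code> are hypothetical additions for this example; <code>SBStrAlg</code> follows the definition above, with result types spelled out.</p>

```scala
trait StrAlg {
  type S
  def fromString(str: String): S
  def append(l: S, r: S): S
}

// Hypothetical naive interpreter: plain String concatenation.
object PlainStrAlg extends StrAlg {
  type S = String
  def fromString(str: String): S = str
  def append(l: S, r: S): S = l + r
}

object SBStrAlg extends StrAlg {
  type S = StringBuilder => Unit
  def fromString(str: String): S = sb => { sb.append(str); () }
  def append(l: S, r: S): S = sb => { l(sb); r(sb) }
}

def numbers(alg: StrAlg): alg.S =
  (1 to 100).foldLeft(alg.fromString("")) {
    (acc, m) => alg.append(acc, alg.fromString(m.toString))
  }

// Hypothetical runner: supplies the one buffer that every piece writes into.
def render(s: SBStrAlg.S): String = {
  val sb = new StringBuilder
  s(sb)
  sb.toString
}

val naive: String = numbers(PlainStrAlg)
val buffered: String = render(numbers(SBStrAlg))
```

<p>The same <code>numbers</code> definition serves both; only the caller’s choice of interpreter decides whether intermediate strings are ever built.</p>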
<p>However, any <code>S</code> for a given interpreter is like any other <code>S</code>. There’s no behavioral way to distinguish between different sorts of <code>S</code> in our algebra. This is fine when we want to represent exactly one (stringish) thing, but a typical algebra needs more, and so does a typical effectful program. </p>
<p>Consider <code>FSAlg</code>. It returns two sorts of results, <code>F[String]</code> and <code>F[Unit]</code>, which is already one too many for the so-called “star-kinded” representation employed by <code>StrAlg</code>. Say we faked it with an empty string for the <code>F[Unit]</code> case.</p>
<p>How would you represent an effectful program that parses a <code>List[Int]</code> out of a file? With <code>FSAlg</code>, it is easy:</p>
<div class="codehilite"><pre><span></span><span class="k">def</span> <span class="n">listONums</span><span class="o">(</span><span class="n">source</span><span class="k">:</span> <span class="kt">String</span><span class="o">,</span> <span class="n">alg</span><span class="k">:</span> <span class="kt">FSAlg</span><span class="o">)</span><span class="k">:</span> <span class="kt">alg.F</span><span class="o">[</span><span class="kt">List</span><span class="o">[</span><span class="kt">Int</span><span class="o">]]</span>
</pre></div>
<p>How would you get this list, without a type parameter? Well, you’d have to interpret <code>F</code> to a <code>String</code>. But now, this function that returns <code>List[Int]</code> <em>runs the interpreter</em>, so it cannot be used as a component of abstract effectful programs. It does not compose. </p>
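<p>By way of contrast, here is a sketch of how the higher-kinded version composes. It assumes <code>FSAlg</code> also offers a <code>point</code> method for lifting a plain value into <code>F</code>, which is an assumption for this example; <code>sumONums</code> and <code>InMemoryFSAlg</code> are likewise made up here.</p>

```scala
import scala.language.higherKinds

trait FSAlg {
  type F[A]
  def readFile(name: String): F[String]
  def point[A](a: A): F[A] // assumed for this sketch
  def bind[A, B](first: F[A])(next: A => F[B]): F[B]
}

def listONums(source: String, alg: FSAlg): alg.F[List[Int]] =
  alg.bind(alg.readFile(source)) { contents =>
    alg.point(contents.split("\\s+").toList.filter(_.nonEmpty).map(_.toInt))
  }

// Built on listONums without ever running an interpreter.
def sumONums(source: String, alg: FSAlg): alg.F[Int] =
  alg.bind(listONums(source, alg))(nums => alg.point(nums.sum))

// Hypothetical interpreter over a fixed in-memory "file system".
object InMemoryFSAlg extends FSAlg {
  type F[A] = () => A
  val files = Map("nums.txt" -> "1 2 3")
  def readFile(name: String): F[String] = () => files(name)
  def point[A](a: A): F[A] = () => a
  def bind[A, B](first: F[A])(next: A => F[B]): F[B] =
    () => next(first())()
}

val runList = listONums("nums.txt", InMemoryFSAlg)
val runSum = sumONums("nums.txt", InMemoryFSAlg)
```

<p>Because <code>listONums</code> returns <code>alg.F[List[Int]]</code> rather than a bare list, it remains an ordinary building block for larger effectful programs.</p>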
<p>Higher-kinded types like <code>FSAlg</code>’s <code>F</code> are the foundation of the appeal and useful applicability of the tagless-final pattern. If we don’t have them, or we stubbornly refuse to use them, we’re doomed from the start.</p>
<p><code>CanBuildFrom</code> has appeal to higher-kinded skeptics, but if you attempt to integrate something like it into <code>FSAlg</code>, yet still write signatures like <code>listONums</code>, you will never finish writing all the abstract types and <code>map</code> instances required to have a general-purpose algebra.</p>
<h2 id="markdown-header-is-copy-a-functional-program">Is <code>copy</code> a functional program?</h2>
<p>Suppose that we wrote a version of <code>copy</code>, or any effectful program, that directly referred to <code>IOFulFSAlg</code> to produce effects, rather than taking an algebra argument and leaving <code>F</code> abstract. It would be hard to argue that it is still a purely functional program. However, the case for its being functional is relatively simple in the abstract case. Since the only difference is taking an argument, why is that?</p>
<p>The usual way in which we make programs more functional is to divide a side-effecting program into two parts: one to make decisions and purely produce a value representing those decisions, and one to “interpret” that value. This forms an obvious, structural abstraction.</p>
<p>To accept <code>copy</code> as a pure function requires you to broaden your acceptance of abstraction to include the type level. <code>copy</code> does not only receive functions in an algebra as an argument; it also receives <em>a type</em>, <code>F</code>, as an argument. By means of <em>this</em> abstraction, we form an “effectful shell” of a shape that would not work without the ability to abstract at the type level.</p>
<h2 id="markdown-header-on-finally-on-tagless">On finally, on tagless</h2>
<p>The pure type-level approach is why this is <em>tagless</em>. Other approaches to custom algebras, such as using free monads, require a runtime “tag” to be created and picked up by the interpreter.</p>
<p>In tagless final style, we skip the tag step and just have the interpreter emit the <em>final</em> form of the effect right away.</p>
<h2 id="markdown-header-drawback-decomposition-required">Drawback: decomposition required</h2>
<p>One drawback of the tagless-final style is that it imposes a specific structure on the interpreters you write.</p>
<p>When you interpret a free monad structure, you have a few methods of interpretation available. One is “natural transformation”; this is similar to what you do with tagless-final, but with a chunk of boilerplate. However, you can also write the interpreter as a tail-recursive loop. This loop can conveniently do things like update state variables, notice when certain actions happen after certain other actions, and so on.</p>
<p>By contrast, tagless-final style requires you to take that interpretive logic and encode it in data structures, each returned by a method specific to that action. Each algebra method acts as an isolated component, with no relation to others that may be called in the same effectful program.</p>
<p>Luckily, while tagless-final requires you to have a uniform, functional representation of effects per interpreter, it doesn’t say anything else about what that structure is. So to the extent that you want the extra features of a free monad structure for interpretation power, you can incorporate one. Moreover, this remains invisible to the effectful programs themselves. The underlying style remains tagless; any tags present in the system are as you choose, for your interpreters’ convenience.</p>
<p><em>This article was tested with Scala 2.12.0.</em></p>Stephen Compallhttp://www.blogger.com/profile/13346366374080232972noreply@blogger.com0tag:blogger.com,1999:blog-1184549185438247550.post-68095013374322703112016-12-03T10:29:00.001-05:002016-12-03T10:32:09.292-05:00Part 2: The role of Monad<p><em>This is the second of a four-part series on tagless-final effects.</em></p>
<h2 id="markdown-header-previously">Previously</h2>
<ol>
<li><a href="/2016/12/tagless-final-effects-la-ermine-writers.html">Introduction, motivation, and the core techniques</a>.</li>
</ol>
<h2 id="markdown-header-the-role-of-monad">The role of <code>Monad</code></h2>
<p>A useful effectful program needs some way of not only producing <code>F</code> effects, but combining and manipulating them. There are no methods on <code>F</code> from the perspective of effectful programs to satisfy this need; the interpreter must provide.</p>
<p>Suppose we have a basic <code>copy</code> method.</p>
<div class="codehilite"><pre><span></span><span class="k">def</span> <span class="n">copy</span><span class="o">(</span><span class="n">source</span><span class="k">:</span> <span class="kt">String</span><span class="o">,</span> <span class="n">dest</span><span class="k">:</span> <span class="kt">String</span><span class="o">,</span> <span class="n">alg</span><span class="k">:</span> <span class="kt">FSAlg</span><span class="o">)</span>
<span class="k">:</span> <span class="kt">alg.F</span><span class="o">[</span><span class="kt">Unit</span><span class="o">]</span> <span class="k">=</span> <span class="o">{</span>
<span class="n">alg</span><span class="o">.</span><span class="n">readFile</span><span class="o">(</span><span class="n">source</span><span class="o">)</span>
<span class="c1">// somehow get the String from F[String]...</span>
<span class="n">alg</span><span class="o">.</span><span class="n">writeFile</span><span class="o">(</span><span class="n">dest</span><span class="o">,</span> <span class="n">contents</span><span class="o">)</span>
<span class="o">}</span>
</pre></div>
<p>It is tempting to include an evaluator in the algebra, but resist! <strong>Don’t include the runner in the algebra!</strong></p>
<p>The appeal of this temptation lies with its similarity to an imperative style. You write</p>
<div class="codehilite"><pre><span></span><span class="k">val</span> <span class="n">contents</span> <span class="k">=</span> <span class="n">alg</span><span class="o">.</span><span class="n">run</span><span class="o">(</span><span class="n">alg</span><span class="o">.</span><span class="n">readFile</span><span class="o">(</span><span class="n">source</span><span class="o">))</span>
<span class="n">alg</span><span class="o">.</span><span class="n">writeFile</span><span class="o">(</span><span class="n">dest</span><span class="o">,</span> <span class="n">contents</span><span class="o">)</span>
</pre></div>
<p>However, not only is this no longer functional programming, it is <em>very, very</em> hard to think about <em>when</em> this <code>readFile</code> side effect happens in relation to the other side effects in a side-effecting interpreter.</p>
<ol>
<li>It could happen before <em>all</em> of the <code>F</code>-controlled side effects. </li>
<li>It could happen before one or more of the side effects you expect to happen first.</li>
<li>It could happen after one or more of the side effects you expect to happen later.</li>
<li>Any mix of the above can happen in the same interpreter.</li>
</ol>
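<p>To make the first possibility concrete, here is a small sketch with a delayed, <code>Function0</code>-style interpreter. Everything in <code>RunInAlgebra</code> is hypothetical, including the log used to record when each side effect fires.</p>

```scala
import scala.collection.mutable.ListBuffer

object RunInAlgebra {
  val log = ListBuffer.empty[String]

  type F[A] = () => A
  def readFile(name: String): F[String] =
    () => { log += s"read $name"; "contents" }
  def writeFile(name: String, contents: String): F[Unit] =
    () => { log += s"write $name"; () }

  // The tempting "runner" included in the algebra.
  def run[A](fa: F[A]): A = fa()

  def badCopy(source: String, dest: String): F[Unit] = {
    val contents = run(readFile(source)) // fires right now, while building
    writeFile(dest, contents)
  }
}

import RunInAlgebra._

val program = badCopy("a", "b")
val logAfterBuilding = log.toList // the read has already happened
program()
val logAfterRunning = log.toList
```

<p>Merely constructing <code>program</code> performs the read, before any of the effects that the returned <code>F</code> is supposed to control.</p>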
<p>Instead, we can supply <em>combinators</em> in the algebra to allow effects to be sequenced. Here’s a combinator for <code>FSAlg</code> that will allow <code>copy</code> to be written.</p>
<div class="codehilite"><pre><span></span><span class="k">def</span> <span class="n">bind</span><span class="o">[</span><span class="kt">A</span>, <span class="kt">B</span><span class="o">](</span><span class="n">first</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">A</span><span class="o">])(</span><span class="n">next</span><span class="k">:</span> <span class="kt">A</span> <span class="o">=></span> <span class="n">F</span><span class="o">[</span><span class="kt">B</span><span class="o">])</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">B</span><span class="o">]</span>
</pre></div>
<p>This is called <em>monadic bind</em>, and can be written for both <code>FSAlg</code> interpreters.</p>
<div class="codehilite"><pre><span></span><span class="c1">// IOFul</span>
<span class="k">def</span> <span class="n">bind</span><span class="o">[</span><span class="kt">A</span>, <span class="kt">B</span><span class="o">](</span><span class="n">first</span><span class="k">:</span> <span class="o">()</span> <span class="o">=></span> <span class="n">A</span><span class="o">)(</span><span class="n">next</span><span class="k">:</span> <span class="kt">A</span> <span class="o">=></span> <span class="o">()</span> <span class="k">=></span> <span class="n">B</span><span class="o">)</span><span class="k">:</span> <span class="o">()</span> <span class="o">=></span> <span class="n">B</span> <span class="k">=</span>
<span class="o">()</span> <span class="k">=></span> <span class="n">next</span><span class="o">(</span><span class="n">first</span><span class="o">())()</span> <span class="c1">// we can't call `first` now; that would</span>
<span class="c1">// break the delay of its side-effects</span>
<span class="c1">// TestMap</span>
<span class="k">def</span> <span class="n">bind</span><span class="o">[</span><span class="kt">A</span>, <span class="kt">B</span><span class="o">](</span><span class="n">first</span><span class="k">:</span> <span class="kt">Directory</span> <span class="o">=></span> <span class="o">(</span><span class="nc">Either</span><span class="o">[</span><span class="kt">Err</span>, <span class="kt">A</span><span class="o">],</span> <span class="nc">Directory</span><span class="o">))</span>
<span class="o">(</span><span class="n">next</span><span class="k">:</span> <span class="kt">A</span> <span class="o">=></span> <span class="nc">Directory</span> <span class="k">=></span> <span class="o">(</span><span class="nc">Either</span><span class="o">[</span><span class="kt">Err</span>, <span class="kt">B</span><span class="o">],</span> <span class="nc">Directory</span><span class="o">))</span>
<span class="k">:</span> <span class="kt">Directory</span> <span class="o">=></span> <span class="o">(</span><span class="nc">Either</span><span class="o">[</span><span class="kt">Err</span>, <span class="kt">B</span><span class="o">],</span> <span class="nc">Directory</span><span class="o">)</span> <span class="k">=</span>
<span class="n">dir</span> <span class="k">=></span> <span class="o">{</span>
<span class="k">val</span> <span class="o">(</span><span class="n">ea</span><span class="o">,</span> <span class="n">dir2</span><span class="o">)</span> <span class="k">=</span> <span class="n">first</span><span class="o">(</span><span class="n">dir</span><span class="o">)</span>
<span class="n">ea</span> <span class="k">match</span> <span class="o">{</span>
<span class="k">case</span> <span class="nc">Left</span><span class="o">(</span><span class="n">err</span><span class="o">)</span> <span class="k">=></span> <span class="o">(</span><span class="nc">Left</span><span class="o">(</span><span class="n">err</span><span class="o">),</span> <span class="n">dir2</span><span class="o">)</span>
<span class="k">case</span> <span class="nc">Right</span><span class="o">(</span><span class="n">a</span><span class="o">)</span> <span class="k">=></span> <span class="n">next</span><span class="o">(</span><span class="n">a</span><span class="o">)(</span><span class="n">dir2</span><span class="o">)</span>
<span class="o">}</span>
<span class="o">}</span>
</pre></div>
<p>With this function in the algebra, we can implement an effectful <code>copy</code> in a functional way. </p>
<div class="codehilite"><pre><span></span><span class="k">def</span> <span class="n">copy</span><span class="o">(</span><span class="n">source</span><span class="k">:</span> <span class="kt">String</span><span class="o">,</span> <span class="n">dest</span><span class="k">:</span> <span class="kt">String</span><span class="o">,</span> <span class="n">alg</span><span class="k">:</span> <span class="kt">FSAlg</span><span class="o">)</span>
<span class="k">:</span> <span class="kt">alg.F</span><span class="o">[</span><span class="kt">Unit</span><span class="o">]</span> <span class="k">=</span>
<span class="n">alg</span><span class="o">.</span><span class="n">bind</span><span class="o">(</span><span class="n">alg</span><span class="o">.</span><span class="n">readFile</span><span class="o">(</span><span class="n">source</span><span class="o">)){</span>
<span class="n">contents</span> <span class="k">=></span>
<span class="n">alg</span><span class="o">.</span><span class="n">writeFile</span><span class="o">(</span><span class="n">dest</span><span class="o">,</span> <span class="n">contents</span><span class="o">)</span>
<span class="o">}</span>
</pre></div>
<p>This pattern applies broadly to sequential effect problems, whether the effects are pure or side-effecting, as our two interpreters show. That is why <code>Monad</code> is so commonly used for problems like this.</p>
<p>Chances are that your effectful programs will need to perform <code>F</code> effects in this sequential way, where later effects (e.g. <code>writeFile</code>) need to be calculated based on the resulting values (e.g. <code>contents</code>) of earlier effects (e.g. <code>readFile</code>). So, as a design shortcut, you ought to incorporate <code>Monad</code> into your algebra.</p>
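<p>To see <code>bind</code> and <code>copy</code> working together, here is a minimal self-contained sketch that runs <code>copy</code> against the pure test interpreter. The <code>Err</code> case class and the bodies of <code>readFile</code> and <code>writeFile</code> are assumptions made for illustration; only <code>bind</code> and <code>copy</code> are taken from the text above.</p>

```scala
object BindSketch {
  // Assumed error type; the article only names it `Err`.
  final case class Err(msg: String)
  type Directory = Map[String, String]

  // The algebra in type-member style, reduced to what `copy` needs.
  trait FSAlg {
    type F[A]
    def readFile(name: String): F[String]
    def writeFile(name: String, contents: String): F[Unit]
    def bind[A, B](first: F[A])(next: A => F[B]): F[B]
  }

  // Pure test interpreter: a state transition over an in-memory directory.
  object TestMapAlg extends FSAlg {
    type F[A] = Directory => (Either[Err, A], Directory)

    // Assumed implementation: a missing file is reported as a Left.
    def readFile(name: String): F[String] =
      dir => (dir.get(name).toRight(Err(s"no such file: $name")), dir)

    // Assumed implementation: writing always succeeds.
    def writeFile(name: String, contents: String): F[Unit] =
      dir => (Right(()), dir + (name -> contents))

    // Same logic as the TestMap `bind` in the article.
    def bind[A, B](first: F[A])(next: A => F[B]): F[B] =
      dir => {
        val (ea, dir2) = first(dir)
        ea match {
          case Left(err) => (Left(err), dir2)
          case Right(a)  => next(a)(dir2)
        }
      }
  }

  def copy(source: String, dest: String, alg: FSAlg): alg.F[Unit] =
    alg.bind(alg.readFile(source)) { contents =>
      alg.writeFile(dest, contents)
    }

  val start: Directory = Map("a.txt" -> "hello")

  // Building the program does nothing; applying it to `start` runs it.
  val (copied, afterCopy) = copy("a.txt", "b.txt", TestMapAlg)(start)
  val (missing, afterMissing) = copy("nope.txt", "b.txt", TestMapAlg)(start)
}
```

<p>In this sketch the first run ends with both files in the directory, while the second stops at the failed read and leaves the directory untouched, because <code>bind</code> never calls <code>next</code> on a <code>Left</code>.</p>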
<h2 id="markdown-header-dont-reinvent-the-monad-wheel">Don’t reinvent the <code>Monad</code> wheel</h2>
<p>It may be tempting to avoid incorporating a library of functional abstractions such as <a href="https://github.com/scalaz/scalaz#scalaz">Scalaz</a> or <a href="https://github.com/non/cats#cats">Cats</a> into your program. This is a mistake; these libraries incorporate a large number of functions that are useful for working with abstract effects, as well as a large number of pre-built and tested implementations of <code>bind</code> and many similar combinators, ready for reuse in your interpreter. </p>
<p>These libraries are foundational because they cover so many common tasks for algebraic abstractions. For example, take a pure effectful program that reads a list of files.</p>
<div class="codehilite"><pre><span></span><span class="k">def</span> <span class="n">readFiles</span><span class="o">(</span><span class="n">names</span><span class="k">:</span> <span class="kt">List</span><span class="o">[</span><span class="kt">String</span><span class="o">],</span> <span class="n">alg</span><span class="k">:</span> <span class="kt">FSAlg</span><span class="o">)</span>
<span class="k">:</span> <span class="kt">alg.F</span><span class="o">[</span><span class="kt">List</span><span class="o">[</span><span class="kt">String</span><span class="o">]]</span> <span class="k">=</span>
<span class="n">names</span><span class="o">.</span><span class="n">map</span><span class="o">(</span><span class="n">alg</span><span class="o">.</span><span class="n">readFile</span><span class="o">(</span><span class="k">_</span><span class="o">))</span>
<span class="c1">// ↑</span>
<span class="c1">// [error] type mismatch;</span>
<span class="c1">// found : List[alg.F[String]]</span>
<span class="c1">// required: alg.F[List[String]]</span>
</pre></div>
<p>This doesn’t work because the <code>F</code>s must be sequenced and returned from the method, not dropped on the floor. (<code>foreach</code> is completely useless in these programs for a similar reason.) An author of a pure effectful program can solve this problem herself, presuming the algebra includes the other essential monad function <code>point</code>.</p>
<div class="codehilite"><pre><span></span><span class="c1">// in algebra</span>
<span class="k">def</span> <span class="n">point</span><span class="o">[</span><span class="kt">A</span><span class="o">](</span><span class="n">a</span><span class="k">:</span> <span class="kt">A</span><span class="o">)</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">A</span><span class="o">]</span>
<span class="c1">// readFiles</span>
<span class="n">names</span><span class="o">.</span><span class="n">foldRight</span><span class="o">(</span><span class="n">alg</span><span class="o">.</span><span class="n">point</span><span class="o">(</span><span class="nc">List</span><span class="o">.</span><span class="n">empty</span><span class="o">[</span><span class="kt">String</span><span class="o">])){</span>
<span class="o">(</span><span class="n">name</span><span class="o">,</span> <span class="n">rightF</span><span class="o">)</span> <span class="k">=></span>
<span class="n">alg</span><span class="o">.</span><span class="n">bind</span><span class="o">(</span><span class="n">alg</span><span class="o">.</span><span class="n">readFile</span><span class="o">(</span><span class="n">name</span><span class="o">)){</span><span class="n">hContents</span> <span class="k">=></span>
<span class="n">alg</span><span class="o">.</span><span class="n">bind</span><span class="o">(</span><span class="n">rightF</span><span class="o">){</span><span class="n">rest</span> <span class="k">=></span>
<span class="n">alg</span><span class="o">.</span><span class="n">point</span><span class="o">(</span><span class="n">hContents</span> <span class="o">::</span> <span class="n">rest</span><span class="o">)</span>
<span class="o">}}}</span>
</pre></div>
<p>With Scalaz, not only does abstract monad syntax give you a nicer way to write this fold and list reconstitution: </p>
<div class="codehilite"><pre><span></span><span class="n">names</span><span class="o">.</span><span class="n">foldRight</span><span class="o">(</span><span class="nc">List</span><span class="o">.</span><span class="n">empty</span><span class="o">[</span><span class="kt">String</span><span class="o">].</span><span class="n">point</span><span class="o">[</span><span class="kt">alg.F</span><span class="o">]){</span>
<span class="o">(</span><span class="n">name</span><span class="o">,</span> <span class="n">rightF</span><span class="o">)</span> <span class="k">=></span>
<span class="k">for</span> <span class="o">{</span>
<span class="n">hContents</span> <span class="k"><-</span> <span class="n">alg</span><span class="o">.</span><span class="n">readFile</span><span class="o">(</span><span class="n">name</span><span class="o">)</span>
<span class="n">rest</span> <span class="k"><-</span> <span class="n">rightF</span>
<span class="o">}</span> <span class="k">yield</span> <span class="n">hContents</span> <span class="o">::</span> <span class="n">rest</span>
<span class="o">}</span>
</pre></div>
<p>But you wouldn’t bother, because the library already includes this function, for <code>List</code> and several other types.</p>
<div class="codehilite"><pre><span></span><span class="n">names</span><span class="o">.</span><span class="n">traverse</span><span class="o">(</span><span class="n">alg</span><span class="o">.</span><span class="n">readFile</span><span class="o">(</span><span class="k">_</span><span class="o">))</span>
</pre></div>
<p>Now all we did was change <code>map</code> to <code>traverse</code>.</p>
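<p>For readers who want to see what <code>traverse</code> is doing for <code>List</code>, the following is a rough library-free sketch. <code>MiniMonad</code>, <code>traverseList</code>, <code>Logged</code> and <code>readFake</code> are hypothetical names invented here, not Scalaz API; the fold is the same one written by hand earlier.</p>

```scala
object TraverseSketch {
  // Hypothetical stand-in for the monad functions of an algebra (not scalaz.Monad).
  trait MiniMonad[F[_]] {
    def point[A](a: A): F[A]
    def bind[A, B](first: F[A])(next: A => F[B]): F[B]
  }

  // Run `f` on each element, sequencing the effects left to right,
  // and collect the results inside a single F.
  def traverseList[F[_], A, B](as: List[A])(f: A => F[B])(M: MiniMonad[F]): F[List[B]] =
    as.foldRight(M.point(List.empty[B])) { (a, restF) =>
      M.bind(f(a)) { b =>
        M.bind(restF) { rest => M.point(b :: rest) }
      }
    }

  // A toy pure effect: thread a log of visited names alongside each result.
  type Logged[A] = List[String] => (A, List[String])

  val loggedMonad: MiniMonad[Logged] = new MiniMonad[Logged] {
    def point[A](a: A): Logged[A] = log => (a, log)
    def bind[A, B](first: Logged[A])(next: A => Logged[B]): Logged[B] =
      log => {
        val (a, log2) = first(log)
        next(a)(log2)
      }
  }

  // Pretend to read a file: the "contents" are the upper-cased name.
  def readFake(name: String): Logged[String] =
    log => (name.toUpperCase, log :+ name)

  val (contents, visited) =
    traverseList[Logged, String, String](List("a", "b", "c"))(readFake)(loggedMonad)(Nil)
}
```

<p>The log in <code>visited</code> records the order in which the effects ran, which is the part that plain <code>map</code> cannot give you.</p>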
<p>Using a good foundational functional library is especially important for newcomers to monadic abstraction, because its implementations of so many common patterns demonstrate the proper way to work with <code>F</code>.</p>
<h2 id="markdown-header-monad-reuse-in-fsalg">Monad reuse in <code>FSAlg</code></h2>
<p>Let’s rewrite what we have so far to incorporate a standard <code>Monad</code> into <code>FSAlg</code>.</p>
<p>First, we eliminate <code>bind</code> and <code>point</code>, substituting a <code>Monad</code> typeclass instance into <code>FSAlg</code>. </p>
<div class="codehilite"><pre><span></span><span class="k">import</span> <span class="nn">scalaz.Monad</span>
<span class="k">implicit</span> <span class="k">val</span> <span class="n">M</span><span class="k">:</span> <span class="kt">Monad</span><span class="o">[</span><span class="kt">F</span><span class="o">]</span>
</pre></div>
<p>For <code>IOFulFSAlg</code>, Scalaz already includes an implementation, which we can find by importing.</p>
<div class="codehilite"><pre><span></span><span class="k">val</span> <span class="n">M</span> <span class="k">=</span> <span class="o">{</span>
<span class="k">import</span> <span class="nn">scalaz.std.function._</span>
<span class="nc">Monad</span><span class="o">[</span><span class="kt">F</span><span class="o">]</span>
<span class="o">}</span>
</pre></div>
<p>Scalaz does have a <code>Monad</code> for the test algebra’s <code>F</code>, but using it will require some rewriting of our existing interpreter functions, so let’s just port over the previous <code>bind</code> implementation.</p>
<div class="codehilite"><pre><span></span><span class="k">val</span> <span class="n">M</span> <span class="k">=</span> <span class="k">new</span> <span class="nc">Monad</span><span class="o">[</span><span class="kt">F</span><span class="o">]</span> <span class="o">{</span>
<span class="c1">// bind as above under TestMap</span>
<span class="o">}</span>
<span class="c1">// [error] object creation impossible, since method point in</span>
<span class="c1">// trait Applicative of type [A](a: => A)tfe.TestMapAlg.F[A]</span>
<span class="c1">// is not defined</span>
<span class="c1">// val M = new Monad[F] {</span>
<span class="c1">//         ^</span>
</pre></div>
<p>We didn’t get around to implementing <code>point</code>, the “effect-free” combinator, and the compiler asks for that now. </p>
<div class="codehilite"><pre><span></span><span class="k">def</span> <span class="n">point</span><span class="o">[</span><span class="kt">A</span><span class="o">](</span><span class="n">a</span><span class="k">:</span> <span class="o">=></span> <span class="n">A</span><span class="o">)</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">A</span><span class="o">]</span> <span class="k">=</span>
<span class="n">d</span> <span class="k">=></span> <span class="o">(</span><span class="nc">Right</span><span class="o">(</span><span class="n">a</span><span class="o">),</span> <span class="n">d</span><span class="o">)</span>
</pre></div>
<p><code>copy</code> can be written in a method-calling style.</p>
<div class="codehilite"><pre><span></span><span class="k">def</span> <span class="n">copy</span><span class="o">(</span><span class="n">source</span><span class="k">:</span> <span class="kt">String</span><span class="o">,</span> <span class="n">dest</span><span class="k">:</span> <span class="kt">String</span><span class="o">,</span> <span class="n">alg</span><span class="k">:</span> <span class="kt">FSAlg</span><span class="o">)</span>
<span class="k">:</span> <span class="kt">alg.F</span><span class="o">[</span><span class="kt">Unit</span><span class="o">]</span> <span class="k">=</span> <span class="o">{</span>
<span class="k">import</span> <span class="nn">alg.M.monadSyntax._</span>
<span class="n">alg</span><span class="o">.</span><span class="n">readFile</span><span class="o">(</span><span class="n">source</span><span class="o">).</span><span class="n">flatMap</span><span class="o">{</span>
<span class="n">contents</span> <span class="k">=></span> <span class="n">alg</span><span class="o">.</span><span class="n">writeFile</span><span class="o">(</span><span class="n">dest</span><span class="o">,</span> <span class="n">contents</span><span class="o">)</span>
<span class="o">}</span>
<span class="o">}</span>
</pre></div>
<p>But it could also be written by calling <code>alg.M.bind</code> directly. Implementers’ choice. </p>
<p>For the <code>traverse</code> method, we need two <code>import</code>s, using the “à la carte” import style, and to pass along the <code>Monad</code>.</p>
<div class="codehilite"><pre><span></span><span class="k">import</span> <span class="nn">scalaz.syntax.traverse._</span>
<span class="k">import</span> <span class="nn">scalaz.std.list._</span>
<span class="c1">// and in the method</span>
<span class="n">names</span><span class="o">.</span><span class="n">traverse</span><span class="o">(</span><span class="n">alg</span><span class="o">.</span><span class="n">readFile</span><span class="o">(</span><span class="k">_</span><span class="o">))(</span><span class="n">alg</span><span class="o">.</span><span class="n">M</span><span class="o">)</span>
</pre></div>
<p>(Use the type-parameter style for algebra definition to avoid this unfortunate failure of implicit resolution.)</p>
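<p>As a rough illustration of that parenthetical, here is a sketch of the type-parameter style in which the instance is found implicitly. It is library-free: <code>MiniMonad</code>, <code>traverseList</code>, <code>optionMonad</code> and <code>optionAlg</code> are hypothetical stand-ins invented for this example, not Scalaz names.</p>

```scala
object TypeParamStyle {
  // Hypothetical minimal monad typeclass, standing in for scalaz.Monad.
  trait MiniMonad[F[_]] {
    def point[A](a: A): F[A]
    def bind[A, B](first: F[A])(next: A => F[B]): F[B]
  }

  // The algebra takes F as a type parameter instead of declaring a type member.
  trait FSAlg[F[_]] {
    def readFile(name: String): F[String]
  }

  def traverseList[F[_], A, B](as: List[A])(f: A => F[B])(implicit M: MiniMonad[F]): F[List[B]] =
    as.foldRight(M.point(List.empty[B])) { (a, restF) =>
      M.bind(f(a))(b => M.bind(restF)(rest => M.point(b :: rest)))
    }

  // The context bound supplies the instance; nothing is passed by hand.
  def readFiles[F[_]: MiniMonad](names: List[String], alg: FSAlg[F]): F[List[String]] =
    traverseList[F, String, String](names)(alg.readFile(_))

  implicit val optionMonad: MiniMonad[Option] = new MiniMonad[Option] {
    def point[A](a: A): Option[A] = Some(a)
    def bind[A, B](first: Option[A])(next: A => Option[B]): Option[B] = first.flatMap(next)
  }

  // Toy interpreter: only "a.txt" and "b.txt" exist.
  val optionAlg: FSAlg[Option] = new FSAlg[Option] {
    private val files = Map("a.txt" -> "A", "b.txt" -> "B")
    def readFile(name: String): Option[String] = files.get(name)
  }

  val found = readFiles(List("a.txt", "b.txt"), optionAlg)
  val notFound = readFiles(List("a.txt", "c.txt"), optionAlg)
}
```

<p>Because <code>F</code> is an ordinary type parameter here, the compiler can look up <code>MiniMonad[F]</code> in implicit scope, so <code>readFiles</code> needs no explicit instance argument.</p>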
<h2 id="markdown-header-finding-more-functional-combinators-like-catching-errors">Finding more functional combinators, like catching errors</h2>
<p>In the type signatures for <code>FSAlg</code>, we haven’t really accounted for the fact that these functions can fail in real-world interpreters, and even in the test interpreter. Well, we have, in the design of the <code>F</code> choices, but that just delays any error until the caller runs the <code>F</code>. </p>
<ol>
<li>For <code>IOFulFSAlg</code>, calling the <code>F</code> function will throw, effectively halting the sequence. </li>
<li>For <code>TestDir</code>, errors are represented with <code>Left</code>; reading the <code>bind</code> implementation, you can see that the <code>Left</code> case means that <code>next</code> is not called, effectively short-circuiting the program just like our <code>IOFul</code> does with exceptions. </li>
</ol>
<p>This isn’t part of the tagless-final pattern; it’s a design choice. When we didn’t include error reporting in the return type of functions like <code>writeFile</code> that certainly can fail in practice, we implied that every interpreter’s <code>F</code> would account for errors. That’s not a convention, it’s an unavoidable outcome of this design decision.</p>
<p>If we want to write effectful programs that can handle errors, which is also a choice itself, we have a few options.</p>
<h3 id="markdown-header-1-the-explicit-strategy">1. The “explicit” strategy</h3>
<p>For functions that can fail, include a representation of error cases inside the <code>F</code>. So <code>FSAlg</code> might have a different signature for <code>readFile</code>:</p>
<div class="codehilite"><pre><span></span><span class="k">def</span> <span class="n">readFile</span><span class="o">(</span><span class="n">name</span><span class="k">:</span> <span class="kt">String</span><span class="o">)</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">Either</span><span class="o">[</span><span class="kt">Err</span>, <span class="kt">String</span><span class="o">]]</span>
</pre></div>
<p>This is the “explicit” strategy, and has a few major advantages:</p>
<ol>
<li><code>F</code> can be simpler, because it need not model errors.</li>
<li>The algebra can have a mix of failing and non-failing functions. The user of the algebra can tell which is which by looking at the return types.</li>
<li>Effectful program authors can delineate which parts of their program may have unhandled errors. </li>
</ol>
<h3 id="markdown-header-2-the-implicit-strategy">2. The “implicit” strategy</h3>
<p>Incorporate an error-recovery function to convert an error into something that can be handled. Choose an error type, such as <code>E</code>, and add such a function as this:</p>
<div class="codehilite"><pre><span></span><span class="k">def</span> <span class="n">catchError</span><span class="o">[</span><span class="kt">A</span><span class="o">](</span><span class="n">mayFail</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">A</span><span class="o">],</span> <span class="n">recover</span><span class="k">:</span> <span class="kt">E</span> <span class="o">=></span> <span class="n">F</span><span class="o">[</span><span class="kt">A</span><span class="o">])</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">A</span><span class="o">]</span>
</pre></div>
<p>Assuming that you have incorporated <code>Monad</code> or at least its weaker relative <code>Functor</code> into your algebra, as we have, this is precisely equivalent in power to </p>
<div class="codehilite"><pre><span></span><span class="k">def</span> <span class="n">catchError</span><span class="o">[</span><span class="kt">A</span><span class="o">](</span><span class="n">mayFail</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">A</span><span class="o">])</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">Either</span><span class="o">[</span><span class="kt">E</span>, <span class="kt">A</span><span class="o">]]</span>
</pre></div>
<p>except that its similarity to the <code>try</code>/<code>catch</code> form is more obvious. Alternatively, you might provide for some kind of filtering, so you drop some errors but not others; perhaps <code>recover</code> might be a <code>PartialFunction</code>, or you might take an additional argument that somehow explains to the interpreter which errors you want to handle or let go. </p>
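<p>The equivalence claimed above can be sketched by deriving each form from the other. In this sketch <code>ErrAlg</code>, <code>attempt</code>, <code>catchErrorViaAttempt</code> and <code>EitherAlg</code> are hypothetical names, <code>Err</code> is used as the error type, and the derivations use <code>point</code> and <code>bind</code> from the monad part of the algebra.</p>

```scala
object CatchErrorForms {
  final case class Err(msg: String)

  // Stand-in algebra: the monad functions plus the handler form of catchError.
  trait ErrAlg {
    type F[A]
    def point[A](a: A): F[A]
    def bind[A, B](first: F[A])(next: A => F[B]): F[B]
    def catchError[A](mayFail: F[A], recover: Err => F[A]): F[A]

    // The Either-returning form, derived from the handler form.
    def attempt[A](mayFail: F[A]): F[Either[Err, A]] =
      catchError[Either[Err, A]](
        bind(mayFail)(a => point[Either[Err, A]](Right(a))),
        err => point[Either[Err, A]](Left(err)))

    // The handler form, derived back from the Either-returning form.
    def catchErrorViaAttempt[A](mayFail: F[A], recover: Err => F[A]): F[A] =
      bind[Either[Err, A], A](attempt(mayFail)) {
        case Left(err) => recover(err)
        case Right(a)  => point(a)
      }
  }

  // Tiny interpreter used only to exercise the derivations.
  object EitherAlg extends ErrAlg {
    type F[A] = Either[Err, A]
    def point[A](a: A): F[A] = Right(a)
    def bind[A, B](first: F[A])(next: A => F[B]): F[B] =
      first match {
        case Left(err) => Left(err)
        case Right(a)  => next(a)
      }
    def catchError[A](mayFail: F[A], recover: Err => F[A]): F[A] =
      mayFail match {
        case Left(err) => recover(err)
        case ok        => ok
      }
  }

  val failing: EitherAlg.F[Int] = Left(Err("boom"))
  val fine: EitherAlg.F[Int] = Right(3)

  val attemptedFailure = EitherAlg.attempt(failing)
  val attemptedSuccess = EitherAlg.attempt(fine)
  val recovered = EitherAlg.catchErrorViaAttempt[Int](failing, _ => Right(0))
}
```

<p>Note that <code>attempt</code> moves the error out of <code>F</code> and into the value, so the outer result of <code>attempt</code> is a success either way.</p>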
<p>You may also wish to include the equivalent of <code>try</code>/<code>finally</code>, <code>bracket</code>:</p>
<div class="codehilite"><pre><span></span><span class="k">def</span> <span class="n">bracket</span><span class="o">[</span><span class="kt">A</span><span class="o">](</span><span class="n">first</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">A</span><span class="o">],</span> <span class="n">cleanup</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">Unit</span><span class="o">])</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">A</span><span class="o">]</span>
</pre></div>
<p>This “implicit” strategy has its own set of advantages:</p>
<ol>
<li>The potential for failure need not be noted on each algebra method; it is assumed.</li>
<li>Effectful programs can allow errors to percolate up “automatically”, so to speak. (Doing this with the “explicit” variant is possible, but a little tricky.) </li>
</ol>
<p>Unfortunately, there is no way to type-check that an effectful program in the “implicit” style has handled all errors, because <code>F</code>, no matter what, represents a program that might fail in this design. </p>
<h3 id="markdown-header-3-the-someone-elses-problem-strategy">3. The “someone else’s problem” strategy</h3>
<p>As with #2, but <em>don’t</em> provide any means of recovery. </p>
<p>This is a question of delineated responsibility. For many effectful programs, it simply isn’t meaningful to recover from errors originating in the interpreter, and it’s always more appropriate for them to be handled by the invoker of the program, as an “early termination”. </p>
<p>In such a situation, you can communicate this by leaving error-catching out of the interpreter. Though you might want to at least document that you intended to leave the functionality out, and didn’t simply forget it!</p>
<p><em>None</em> of these strategies is more or less pure; they all preserve the purity of effectful programs. </p>
<p>There is a broader problem here, though: how can you find type signatures like <code>catchError</code>, that convert concepts that <em>seem</em> to require side effects or special support, into plain algebra calls that work for pure FP programs? One great resource is the Haskell <code>base</code> library. Haskell requires all programs, even effectful ones, to be written using only pure constructs and ordinary functions, so many such problems have been solved there. <code>catchError</code> comes from <a href="">the <code>MonadError</code> typeclass</a>, which supplies a mini-algebra much like <code>FSAlg</code>, but specifically for throwing and catching errors. </p>
<ol>
<li>Break an “effectful” idea down into a primitive concept you’d like to support. </li>
<li>Research how this is handled in the <code>IO</code> algebra for Haskell. </li>
<li>Replace <code>IO</code> with <code>F</code> and incorporate into your algebra. </li>
</ol>
<p>Here are the implementations of <code>catchError</code> and <code>bracket</code> for our two interpreters. One test for your choice of effectful API is whether interpreters can implement it. I’ve chosen the <code>Err</code> type to represent errors to effectful programs, but the choice is yours.</p>
<div class="codehilite"><pre><span></span><span class="k">import</span> <span class="nn">scala.util.control.NonFatal</span>
<span class="c1">// IOFulFSAlg</span>
<span class="k">def</span> <span class="n">catchError</span><span class="o">[</span><span class="kt">A</span><span class="o">](</span><span class="n">mayFail</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">A</span><span class="o">],</span> <span class="c1">// () => A</span>
<span class="n">recover</span><span class="k">:</span> <span class="kt">Err</span> <span class="o">=></span> <span class="n">F</span><span class="o">[</span><span class="kt">A</span><span class="o">])</span> <span class="c1">// Err => () => A</span>
<span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">A</span><span class="o">]</span> <span class="k">=</span>
<span class="o">()</span> <span class="k">=></span> <span class="k">try</span> <span class="n">mayFail</span><span class="o">()</span> <span class="k">catch</span> <span class="o">{</span>
<span class="k">case</span> <span class="nc">NonFatal</span><span class="o">(</span><span class="n">e</span><span class="o">)</span> <span class="k">=></span> <span class="n">recover</span><span class="o">(</span><span class="nc">Err</span><span class="o">(</span><span class="n">e</span><span class="o">.</span><span class="n">getMessage</span><span class="o">))()</span>
<span class="o">}</span>
<span class="c1">// TestDirFSAlg</span>
<span class="k">def</span> <span class="n">catchError</span><span class="o">[</span><span class="kt">A</span><span class="o">](</span><span class="n">mayFail</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">A</span><span class="o">],</span>
<span class="n">recover</span><span class="k">:</span> <span class="kt">Err</span> <span class="o">=></span> <span class="n">F</span><span class="o">[</span><span class="kt">A</span><span class="o">])</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">A</span><span class="o">]</span> <span class="k">=</span>
<span class="n">dir</span> <span class="k">=></span> <span class="o">{</span>
<span class="k">val</span> <span class="o">(</span><span class="n">result</span><span class="o">,</span> <span class="n">dir2</span><span class="o">)</span> <span class="k">=</span> <span class="n">mayFail</span><span class="o">(</span><span class="n">dir</span><span class="o">)</span>
<span class="n">result</span> <span class="k">match</span> <span class="o">{</span>
<span class="k">case</span> <span class="nc">Left</span><span class="o">(</span><span class="n">err</span><span class="o">)</span> <span class="k">=></span> <span class="n">recover</span><span class="o">(</span><span class="n">err</span><span class="o">)(</span><span class="n">dir2</span><span class="o">)</span>
<span class="k">case</span> <span class="nc">Right</span><span class="o">(</span><span class="n">a</span><span class="o">)</span> <span class="k">=></span> <span class="o">(</span><span class="nc">Right</span><span class="o">(</span><span class="n">a</span><span class="o">),</span> <span class="n">dir2</span><span class="o">)</span>
<span class="o">}</span>
<span class="o">}</span>
<span class="c1">// IOFul</span>
<span class="k">def</span> <span class="n">bracket</span><span class="o">[</span><span class="kt">A</span><span class="o">](</span><span class="n">first</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">A</span><span class="o">],</span> <span class="n">cleanup</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">Unit</span><span class="o">])</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">A</span><span class="o">]</span> <span class="k">=</span>
<span class="o">()</span> <span class="k">=></span> <span class="k">try</span> <span class="n">first</span><span class="o">()</span> <span class="k">finally</span> <span class="n">cleanup</span><span class="o">()</span>
<span class="c1">// TestDir</span>
<span class="k">def</span> <span class="n">bracket</span><span class="o">[</span><span class="kt">A</span><span class="o">](</span><span class="n">first</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">A</span><span class="o">],</span> <span class="n">cleanup</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">Unit</span><span class="o">])</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">A</span><span class="o">]</span> <span class="k">=</span>
<span class="n">dir</span> <span class="k">=></span> <span class="o">{</span>
<span class="k">val</span> <span class="o">(</span><span class="n">result</span><span class="o">,</span> <span class="n">dir2</span><span class="o">)</span> <span class="k">=</span> <span class="n">first</span><span class="o">(</span><span class="n">dir</span><span class="o">)</span>
<span class="k">val</span> <span class="o">(</span><span class="k">_</span><span class="o">,</span> <span class="n">dir3</span><span class="o">)</span> <span class="k">=</span> <span class="n">cleanup</span><span class="o">(</span><span class="n">dir2</span><span class="o">)</span>
<span class="o">(</span><span class="n">result</span><span class="o">,</span> <span class="n">dir3</span><span class="o">)</span>
<span class="o">}</span>
</pre></div>
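<p>A small usage sketch for the test interpreter's versions may help. The <code>readFile</code> and <code>writeFile</code> bodies and the <code>Err</code> case class are assumptions for illustration; <code>catchError</code> and <code>bracket</code> follow the definitions above.</p>

```scala
object RecoverSketch {
  final case class Err(msg: String)
  type Directory = Map[String, String]
  type F[A] = Directory => (Either[Err, A], Directory)

  // Assumed primitive implementations for the in-memory interpreter.
  def readFile(name: String): F[String] =
    dir => (dir.get(name).toRight(Err(s"no such file: $name")), dir)
  def writeFile(name: String, contents: String): F[Unit] =
    dir => (Right(()), dir + (name -> contents))

  // As in the article: on Left, hand the error to `recover`.
  def catchError[A](mayFail: F[A], recover: Err => F[A]): F[A] =
    dir => {
      val (result, dir2) = mayFail(dir)
      result match {
        case Left(err) => recover(err)(dir2)
        case Right(a)  => (Right(a), dir2)
      }
    }

  // As in the article: always run `cleanup`, keep the result of `first`.
  def bracket[A](first: F[A], cleanup: F[Unit]): F[A] =
    dir => {
      val (result, dir2) = first(dir)
      val (_, dir3) = cleanup(dir2)
      (result, dir3)
    }

  // Recover from a missing file by substituting empty contents.
  val readOrEmpty: F[String] =
    catchError[String](readFile("missing.txt"), _ => dir => (Right(""), dir))

  // The cleanup step runs even though the read fails.
  val readThenLog: F[String] =
    bracket(readFile("missing.txt"), writeFile("log.txt", "done"))

  val empty: Directory = Map.empty[String, String]
  val (recovered, afterRecover) = readOrEmpty(empty)
  val (failed, afterBracket) = readThenLog(empty)
}
```

<p>One consequence of this <code>bracket</code> is visible here: the result of <code>cleanup</code> is discarded, so an error raised by the cleanup step itself would go unreported.</p>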
<h2 id="markdown-header-when-should-i-take-an-f-argument">When should I take an <code>F</code> argument?</h2>
<p>The functions so far follow a pattern that is common for algebras that include <code>Monad</code>. Specifically, our algebra API comes in two flavors:</p>
<ol>
<li>Specific functions like <code>readFile</code> that carry out some task specific to this domain; these <em>return</em> an <code>F</code> but do not <em>take</em> an <code>F</code> as an argument. </li>
<li>Abstract effect combinators like <code>map</code>, <code>bind</code>, <code>catchError</code> that likewise <em>return</em> an <code>F</code> but also <em>take</em> one or more as arguments. </li>
</ol>
<p>This pattern arises because for functions like <code>readFile</code>, this is the most useful signature in the presence of <code>Monad</code>.</p>
<p>With <code>Monad</code> in place, we can easily implement <code>copy</code>’s writing step in terms of <code>flatMap</code> or <code>bind</code>. Without it, we might be tempted to solve the problem of calling <code>writeFile</code> by adding an <code>F</code> argument.</p>
<div class="codehilite"><pre><span></span><span class="c1">// don't do this</span>
<span class="k">def</span> <span class="n">writeFileBad</span><span class="o">(</span><span class="n">filename</span><span class="k">:</span> <span class="kt">String</span><span class="o">,</span> <span class="n">contents</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">String</span><span class="o">])</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">Unit</span><span class="o">]</span>
</pre></div>
<p>Now we can call <code>writeFileBad</code> directly with the result of <code>readFile</code>. But what about these?</p>
<ol>
<li>Suppose we want to process the contents of the source file before writing them. Maybe we want to split it into lines and filter some out (like <code>grep</code>), or sort them?</li>
<li>Suppose we want to read two or more files, writing all of their contents to the target file? </li>
<li>Suppose we wanted to read a file that contains, itself, a list of filenames, and we want to concatenate all of the contents of <em>those</em> files and put them into the target file?</li>
</ol>
<p>The redesigned <code>writeFileBad</code> is good for only one sort of thing: things like <code>copy</code>. <code>Monad</code> is so ubiquitous partly because it is flexible enough to solve all these combination problems, and many more besides.</p>
<p>Effectful programs can all split their demands on their algebras into wanting to call these two sorts of primitive algebra functions; a program calling <code>writeFileBad</code> can call <code>writeFile</code> and <code>flatMap</code> instead, and will be better for it. Learning to recognize when you’ve accidentally given a specific function the job of a generic combinator is a highly useful skill for the design of abstract APIs.</p>
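<p>As a sketch of that claim, <code>writeFileBad</code> can be recovered from <code>writeFile</code> plus <code>bind</code>, and the same two primitives also cover the first two scenarios listed above. The helper names <code>grepCopy</code> and <code>concat</code>, and the error-free <code>StateAlg</code> interpreter (which treats a missing file as empty), are simplifying assumptions for this example.</p>

```scala
object CombinatorSketch {
  type Directory = Map[String, String]

  trait FSAlg {
    type F[A]
    def readFile(name: String): F[String]
    def writeFile(name: String, contents: String): F[Unit]
    def bind[A, B](first: F[A])(next: A => F[B]): F[B]
  }

  // writeFileBad is nothing more than writeFile composed with bind.
  def writeFileBad(alg: FSAlg)(filename: String, contents: alg.F[String]): alg.F[Unit] =
    alg.bind(contents)(c => alg.writeFile(filename, c))

  // Scenario 1: process the contents (keep only some lines) before writing.
  def grepCopy(source: String, dest: String, keep: String => Boolean, alg: FSAlg): alg.F[Unit] =
    alg.bind(alg.readFile(source)) { contents =>
      alg.writeFile(dest, contents.split("\n").filter(keep).mkString("\n"))
    }

  // Scenario 2: read two files and write their combined contents.
  def concat(a: String, b: String, dest: String, alg: FSAlg): alg.F[Unit] =
    alg.bind(alg.readFile(a)) { ca =>
      alg.bind(alg.readFile(b)) { cb =>
        alg.writeFile(dest, ca + cb)
      }
    }

  // Simplified pure interpreter with no error channel.
  object StateAlg extends FSAlg {
    type F[A] = Directory => (A, Directory)
    def readFile(name: String): F[String] = dir => (dir.getOrElse(name, ""), dir)
    def writeFile(name: String, contents: String): F[Unit] =
      dir => ((), dir + (name -> contents))
    def bind[A, B](first: F[A])(next: A => F[B]): F[B] =
      dir => {
        val (a, dir2) = first(dir)
        next(a)(dir2)
      }
  }

  val start: Directory = Map("in.txt" -> "apple\nbanana\navocado", "b.txt" -> "!")

  val afterBad = writeFileBad(StateAlg)("copy.txt", StateAlg.readFile("b.txt"))(start)._2
  val afterGrep = grepCopy("in.txt", "out.txt", _.startsWith("a"), StateAlg)(start)._2
  val afterConcat = concat("b.txt", "b.txt", "twice.txt", StateAlg)(start)._2
}
```

<p>None of these programs needed a new primitive in the algebra; each is built from the domain-specific functions and the one generic combinator.</p>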
<h2 id="markdown-header-bending-the-mind-the-right-way">Bending the mind the right way</h2>
<p>The most difficult part of learning to use this pattern is learning how to design usable function types for your effect algebras. I suggested earlier looking at <code>IO</code> in Haskell, because Haskell programmers have encountered and solved such problems many times; they had to.</p>
<p>So this is an attitude worth adopting. Demand a purely-functional approach in your effectful programs. That so many purely functional effect types are widely known is a testament to the unwillingness to compromise the integrity of the Haskell model or abandon the reasoning power that comes with referential transparency. </p>
<p>It’s a good idea to have two interpreters, one like <code>IOFul</code> and one like <code>TestMap</code>, or at least to imagine both. When adding a new function, ask:</p>
<ol>
<li>Will using this function in effectful programs cause effects before “running” the returned <code>F</code>?</li>
<li>If I use this with a side-effect-free interpreter, will running <code>F</code> have side effects?</li>
</ol>
<p>If the answer to either of these is “yes”, change the type signature.</p>
<h2 id="markdown-header-still-to-come">Still to come</h2>
<ol>
<li><a href="/2016/12/part-3-working-with-abstract-f.html">Working with the abstract <code>F</code></a>;</li>
<li>How much is this “dependency injection”?</li>
</ol>
<p><em>This article was tested with Scala 2.12.0 and Scalaz 7.2.8.</em></p>Stephen Compallhttp://www.blogger.com/profile/13346366374080232972noreply@blogger.com0tag:blogger.com,1999:blog-1184549185438247550.post-37283836804909302002016-12-03T10:28:00.000-05:002016-12-03T10:33:50.754-05:00Tagless final effects à la Ermine Writers<p><em>This is the first of a four-part series on tagless-final effects.</em></p>
<p><a href="http://okmij.org/ftp/tagless-final/">“Finally tagless”</a> notation is a nice way to get many of the benefits of free monads without paying so dearly in allocation of steps.</p>
<p>Watching John DeGoes’s 2015 <a href="https://github.com/jdegoes/scalaworld-2015">presentation of free applicatives</a>, it struck me that I didn’t quite like the notation of “finally tagless” demonstrated therein. To me, threading an algebra compares unfavorably to having an algebra in the program’s scope. The latter is the approach taken by the implementation of the <a href="https://bitbucket.org/ermine-language/ermine-writers/">Ermine Writers</a>.</p>
<p>I think the style of effectful programs written like this will be appealing to programmers transitioning from out-of-control side effects to functional programming. By avoiding the sequence of intermediate command data structures that characterizes free monad effects, it saves significant runtime cost; the implementation of the interpreter should also be more obvious to the newcomer. On the other hand, it preserves the idea of multiple interpreter-dependent output types, and with it the testability benefit.</p>
<p>While this style does away with the abstract command structures, it preserves the limitations on available effects provided by good effect libraries. It does this by means of type-level abstraction rather than by means of a specific effect structure. This means that the abstraction is enforced, but erased; the concrete structures are the same as the interpreter output, though the code choosing effects can’t tell what that is.</p>
<p>There’s no library for me to mandate, or that you have to adopt. I recommend that you have a library like <a href="https://github.com/scalaz/scalaz#scalaz">Scalaz</a> or <a href="https://github.com/non/cats#cats">Cats</a> with <a href="http://eed3si9n.com/learning-scalaz/Monad.html"><code>Monad</code></a>, <a href="http://eed3si9n.com/learning-scalaz/IO+Monad.html"><code>IO</code></a> (for interpreters), and related functionality available, but it’s not a requirement. The code involved in adopting this style is specific to your use case.</p>
<p>While this is a good alternative to <a href="https://github.com/scalaz/scalaz/blob/v7.2.3/example/src/main/scala/scalaz/example/FreeUsage.scala">free monads</a>, <a href="https://github.com/atnos-org/eff-cats#eff"><code>eff</code></a>, and the like, it integrates well with them, too. You can combine these effects with other systems as convenient, either to implement interpreters or to produce effects within effect-abstract code, especially if you incorporate <code>Monad</code>.</p>
<p>How can this all be accomplished? With <a href="http://typelevel.org/blog/2016/08/21/hkts-moving-forward.html"><strong>higher-kinded types</strong></a>, that is, abstraction over type constructors.</p>
<h2 id="markdown-header-declaring-an-algebra">Declaring an algebra</h2>
<p>As with other designs for constrained effects, you need to declare the various effects you’re going to support. For the sake of a little exoticism, I’m going to declare a simple filesystem interface.</p>
<div class="codehilite"><pre><span></span><span class="k">trait</span> <span class="nc">FSAlg</span> <span class="o">{</span>
<span class="c1">// I'm using an abstract type constructor (i.e. FSAlg</span>
<span class="c1">// is higher-kinded) for the effect type, but this</span>
<span class="c1">// converts readily to a type parameter F[_] on FSAlg,</span>
<span class="c1">// like Scanner in Ermine</span>
<span class="k">type</span> <span class="kt">F</span><span class="o">[</span><span class="kt">A</span><span class="o">]</span>
<span class="k">def</span> <span class="n">listDirectory</span><span class="o">(</span><span class="n">pathname</span><span class="k">:</span> <span class="kt">String</span><span class="o">)</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">List</span><span class="o">[</span><span class="kt">String</span><span class="o">]]</span>
<span class="k">def</span> <span class="n">readFile</span><span class="o">(</span><span class="n">pathname</span><span class="k">:</span> <span class="kt">String</span><span class="o">)</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">String</span><span class="o">]</span>
<span class="k">def</span> <span class="n">writeFile</span><span class="o">(</span><span class="n">pathname</span><span class="k">:</span> <span class="kt">String</span><span class="o">,</span> <span class="n">contents</span><span class="k">:</span> <span class="kt">String</span><span class="o">)</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">Unit</span><span class="o">]</span>
<span class="o">}</span>
</pre></div>
<h2 id="markdown-header-writing-an-effectful-program">Writing an effectful program</h2>
<p>Instances of <code>FSAlg</code> supply concrete operations for the <code>F</code> type constructor, and so must choose a concrete <code>F</code> as well. Concrete programs that choose <em>which effects to perform</em> should be abstract in <code>F</code> under this design approach.</p>
<p>Instances of <code>FSAlg</code> can be defined in <a href="http://www.cakesolutions.net/teamblogs/demystifying-implicits-and-typeclasses-in-scala">typeclass</a> style (in which case you should use a type parameter for <code>F</code> instead of a type member), or passed as normal arguments.</p>
<p>The other thing ‘effectful programs’ do in this style is return an <code>F</code>; specifically, the <code>F</code> associated with the algebra instance being passed in. In this way, this style radically departs from “dependency injection” or <a href="http://c2.com/cgi/wiki?StrategyPattern">“the strategy pattern”</a>—the “dependency” has a concrete influence on the public return type.</p>
<p>Organizing the program as a set of standalone methods makes it easy to use path-dependent style to return the correct type of effect, when using a type member.</p>
<div class="codehilite"><pre><span></span><span class="k">def</span> <span class="n">write42</span><span class="o">(</span><span class="n">alg</span><span class="k">:</span> <span class="kt">FSAlg</span><span class="o">)(</span><span class="n">pathname</span><span class="k">:</span> <span class="kt">String</span><span class="o">)</span><span class="k">:</span> <span class="kt">alg.F</span><span class="o">[</span><span class="kt">Unit</span><span class="o">]</span> <span class="k">=</span>
<span class="n">alg</span><span class="o">.</span><span class="n">writeFile</span><span class="o">(</span><span class="n">pathname</span><span class="o">,</span> <span class="s">"42"</span><span class="o">)</span>
</pre></div>
<p>A typeclass version would look more like</p>
<div class="codehilite"><pre><span></span><span class="k">def</span> <span class="n">write42</span><span class="o">[</span><span class="kt">F</span><span class="o">[</span><span class="k">_</span><span class="o">]](</span><span class="n">pathname</span><span class="k">:</span> <span class="kt">String</span><span class="o">)(</span><span class="k">implicit</span> <span class="n">alg</span><span class="k">:</span> <span class="kt">FSAlg</span><span class="o">[</span><span class="kt">F</span><span class="o">])</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">Unit</span><span class="o">]</span> <span class="k">=</span>
<span class="n">alg</span><span class="o">.</span><span class="n">writeFile</span><span class="o">(</span><span class="n">pathname</span><span class="o">,</span> <span class="s">"42"</span><span class="o">)</span>
</pre></div>
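<p>The typeclass version presumes a variant of <code>FSAlg</code> that takes <code>F</code> as a type parameter. A minimal sketch of that variant follows, together with one implicit instance. The <code>InMem</code> interpreter, its flat <code>Files</code> map, and the prefix-based <code>listDirectory</code> are assumptions invented for this example.</p>
<div class="codehilite"><pre><span></span>object TypeclassSketch {
  // The same algebra, with F as a type parameter instead of a type member.
  trait FSAlg[F[_]] {
    def listDirectory(pathname: String): F[List[String]]
    def readFile(pathname: String): F[String]
    def writeFile(pathname: String, contents: String): F[Unit]
  }

  def write42[F[_]](pathname: String)(implicit alg: FSAlg[F]): F[Unit] =
    alg.writeFile(pathname, "42")

  // Hypothetical toy interpreter: a flat, immutable map of files,
  // threaded through each operation.
  type Files = Map[String, String]
  type InMem[A] = Files => (A, Files)

  implicit val inMemAlg: FSAlg[InMem] = new FSAlg[InMem] {
    def listDirectory(pathname: String): InMem[List[String]] =
      fs => (fs.keys.filter(_.startsWith(pathname + "/")).toList, fs)
    def readFile(pathname: String): InMem[String] =
      fs => (fs.getOrElse(pathname, ""), fs)
    def writeFile(pathname: String, contents: String): InMem[Unit] =
      fs => ((), fs + (pathname -> contents))
  }

  // The String argument says nothing about F, so name it explicitly;
  // the compiler then looks up inMemAlg.
  val program: InMem[Unit] = write42[InMem]("hello.txt")
}
</pre></div>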
<p>To avoid passing around the <code>alg</code> everywhere, you might put a whole group of methods under a class, and have the class take the algebra as a constructor parameter. This is a straightforward translation for the typeclass or simple type parameter approach.</p>
<div class="codehilite"><pre><span></span><span class="k">class</span> <span class="nc">MyProg</span><span class="o">[</span><span class="kt">F</span><span class="o">[</span><span class="k">_</span><span class="o">]](</span><span class="k">implicit</span> <span class="n">alg</span><span class="k">:</span> <span class="kt">FSAlg</span><span class="o">[</span><span class="kt">F</span><span class="o">])</span> <span class="o">{</span>
<span class="c1">// several methods using F and alg</span>
<span class="o">}</span>
</pre></div>
<p>This is the organization style of Ermine Writers; each individual <a href=""><code>Writer</code></a> class (e.g. <a href="https://bitbucket.org/ermine-language/ermine-writers/src/9b15ed69c77fe917579dfe92896d7957f579a5af/writers/html/src/main/scala/com/clarifi/reporting/writers/HTMLWriter.scala?at=default&fileviewer=file-view-default#HTMLWriter.scala-675"><code>HTMLWriter</code></a>) is similar to <code>MyProg</code>.</p>
<p>Doing this with a type member is a little trickier; if you simply put an <code>alg: FSAlg</code> argument in the constructor, you’ll “forget” the <code>F</code>, existential-style. You can either put a bounded type parameter on the class for the alg:</p>
<div class="codehilite"><pre><span></span><span class="k">class</span> <span class="nc">MyProg</span><span class="o">[</span><span class="kt">Alg</span> <span class="k"><:</span> <span class="kt">FSAlg</span><span class="o">](</span><span class="n">alg</span><span class="k">:</span> <span class="kt">Alg</span><span class="o">)</span>
</pre></div>
<p>or a higher-kinded parameter, via <a href="http://typelevel.org/blog/2015/07/19/forget-refinement-aux.html#why-t0--whats-aux">the <code>Aux</code> pattern</a>.</p>
<div class="codehilite"><pre><span></span><span class="c1">// object FSAlg</span>
<span class="k">type</span> <span class="kt">Aux</span><span class="o">[</span><span class="kt">F0</span><span class="o">[</span><span class="k">_</span><span class="o">]]</span> <span class="k">=</span> <span class="nc">FSAlg</span> <span class="o">{</span><span class="k">type</span> <span class="kt">F</span><span class="o">[</span><span class="kt">X</span><span class="o">]</span> <span class="k">=</span> <span class="n">F0</span><span class="o">[</span><span class="kt">X</span><span class="o">]}</span>
<span class="c1">// replacing MyProg</span>
<span class="k">class</span> <span class="nc">MyProg</span><span class="o">[</span><span class="kt">F</span><span class="o">[</span><span class="k">_</span><span class="o">]](</span><span class="n">alg</span><span class="k">:</span> <span class="kt">FSAlg.Aux</span><span class="o">[</span><span class="kt">F</span><span class="o">])</span>
</pre></div>
<p>I think the latter yields easier-to-understand method types and type errors, but all of the above alternatives have equal power. So choose whatever seems nice, and change it later if you like.</p>
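<p>To make the <code>Aux</code> variant concrete, here is a small sketch of how it might be wired together. The trimmed-down <code>FSAlg</code>, the <code>write42</code> method on <code>MyProg</code>, and the <code>RecordingAlg</code> interpreter (which notes each write in a mutable list when its thunk is run) are assumptions for illustration only.</p>
<div class="codehilite"><pre><span></span>object AuxSketch {
  trait FSAlg {
    type F[A]
    def readFile(pathname: String): F[String]
    def writeFile(pathname: String, contents: String): F[Unit]
  }
  object FSAlg {
    type Aux[F0[_]] = FSAlg { type F[X] = F0[X] }
  }

  // The Aux refinement ties alg.F to the class's own F parameter,
  // so method types can mention F directly.
  class MyProg[F[_]](alg: FSAlg.Aux[F]) {
    def write42(pathname: String): F[Unit] = alg.writeFile(pathname, "42")
  }

  // Hypothetical interpreter: running a write thunk records the request.
  object RecordingAlg extends FSAlg {
    var written: List[(String, String)] = Nil
    type F[A] = () => A
    def readFile(pathname: String): () => String = () => ""
    def writeFile(pathname: String, contents: String): () => Unit =
      () => { written = (pathname, contents) :: written }
  }

  val prog = new MyProg[Function0](RecordingAlg)
  val step: () => Unit = prog.write42("hello.txt") // built, not yet run
}
</pre></div>
<p>Because <code>MyProg</code> is instantiated at <code>Function0</code>, <code>step</code> has the plain type <code>() =&gt; Unit</code>, with no path-dependent type in sight.</p>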
<h2 id="markdown-header-writing-an-interpreter">Writing an interpreter</h2>
<p>When writing the effectful program, you’re condemned to be free: you have to choose the effects to perform. The “interpreter”, which “executes” the effect, is more of a guided exercise. You must extend the algebra trait, <code>FSAlg</code>, implementing all of the abstract members.</p>
<div class="codehilite"><pre><span></span><span class="k">object</span> <span class="nc">IOFulFSAlg</span> <span class="k">extends</span> <span class="nc">FSAlg</span> <span class="o">{</span>
<span class="k">import</span> <span class="nn">java.io.File</span><span class="o">,</span> <span class="n">java</span><span class="o">.</span><span class="n">nio</span><span class="o">.</span><span class="n">file</span><span class="o">.{</span><span class="nc">Files</span><span class="o">,</span> <span class="nc">Paths</span><span class="o">}</span>
<span class="k">type</span> <span class="kt">F</span><span class="o">[</span><span class="kt">A</span><span class="o">]</span> <span class="k">=</span> <span class="o">()</span> <span class="k">=></span> <span class="n">A</span> <span class="c1">// your choice!</span>
<span class="k">def</span> <span class="n">listDirectory</span><span class="o">(</span><span class="n">pathname</span><span class="k">:</span> <span class="kt">String</span><span class="o">)</span><span class="k">:</span> <span class="o">()</span> <span class="o">=></span> <span class="nc">List</span><span class="o">[</span><span class="kt">String</span><span class="o">]</span> <span class="k">=</span>
<span class="o">()</span> <span class="k">=></span> <span class="k">new</span> <span class="nc">File</span><span class="o">(</span><span class="n">pathname</span><span class="o">).</span><span class="n">list</span><span class="o">().</span><span class="n">toList</span>
<span class="k">def</span> <span class="n">readFile</span><span class="o">(</span><span class="n">pathname</span><span class="k">:</span> <span class="kt">String</span><span class="o">)</span><span class="k">:</span> <span class="o">()</span> <span class="o">=></span> <span class="nc">String</span> <span class="k">=</span>
<span class="o">()</span> <span class="k">=></span> <span class="k">new</span> <span class="nc">String</span><span class="o">(</span><span class="nc">Files</span> <span class="n">readAllBytes</span> <span class="o">(</span><span class="nc">Paths</span> <span class="n">get</span> <span class="n">pathname</span><span class="o">))</span>
<span class="k">def</span> <span class="n">writeFile</span><span class="o">(</span><span class="n">pathname</span><span class="k">:</span> <span class="kt">String</span><span class="o">,</span> <span class="n">contents</span><span class="k">:</span> <span class="kt">String</span><span class="o">)</span><span class="k">:</span> <span class="o">()</span> <span class="o">=></span> <span class="nc">Unit</span> <span class="k">=</span>
<span class="o">()</span> <span class="k">=></span> <span class="nc">Files</span> <span class="n">write</span> <span class="o">(</span><span class="nc">Paths</span> <span class="n">get</span> <span class="n">pathname</span><span class="o">,</span> <span class="n">contents</span><span class="o">.</span><span class="n">getBytes</span><span class="o">)</span>
<span class="o">}</span>
</pre></div>
<p>The key is to choose an <code>F</code> type—you can almost consider it an implementation detail of the class—that will allow you to implement the methods <em>without side-effecting when they are called</em>. I’ve made a good starter choice above, but the real magic happens when I choose more interesting <code>F</code>s.</p>
<h2 id="markdown-header-a-test-interpreter">A test interpreter</h2>
<p>With <code>F</code> abstract in “real” programs, we can choose different ones for different interpreters. Here we use one that allows simulation of the algebra methods without performing any side effects or mutation.</p>
<div class="codehilite"><pre><span></span><span class="k">final</span> <span class="k">case</span> <span class="k">class</span> <span class="nc">Directory</span><span class="o">(</span>
<span class="n">listing</span><span class="k">:</span> <span class="kt">Map</span><span class="o">[</span><span class="kt">String</span>, <span class="kt">Either</span><span class="o">[</span><span class="kt">String</span>, <span class="kt">Directory</span><span class="o">]])</span>
<span class="k">final</span> <span class="k">case</span> <span class="k">class</span> <span class="nc">Err</span><span class="o">(</span><span class="n">msg</span><span class="k">:</span> <span class="kt">String</span><span class="o">)</span>
<span class="k">object</span> <span class="nc">TestMapAlg</span> <span class="k">extends</span> <span class="nc">FSAlg</span> <span class="o">{</span>
<span class="k">type</span> <span class="kt">F</span><span class="o">[</span><span class="kt">A</span><span class="o">]</span> <span class="k">=</span> <span class="nc">Directory</span> <span class="k">=></span> <span class="o">(</span><span class="nc">Either</span><span class="o">[</span><span class="kt">Err</span>, <span class="kt">A</span><span class="o">],</span> <span class="nc">Directory</span><span class="o">)</span>
<span class="k">private</span> <span class="k">def</span> <span class="n">splitPath</span><span class="o">(</span><span class="n">p</span><span class="k">:</span> <span class="kt">String</span><span class="o">)</span> <span class="k">=</span>
<span class="n">p</span><span class="o">.</span><span class="n">split</span><span class="o">(</span><span class="sc">'/'</span><span class="o">).</span><span class="n">toList</span>
<span class="k">private</span> <span class="k">def</span> <span class="n">readLocation</span><span class="o">(</span><span class="n">dir</span><span class="k">:</span> <span class="kt">Directory</span><span class="o">,</span> <span class="n">p</span><span class="k">:</span> <span class="kt">List</span><span class="o">[</span><span class="kt">String</span><span class="o">])</span>
<span class="k">:</span> <span class="kt">Option</span><span class="o">[</span><span class="kt">Either</span><span class="o">[</span><span class="kt">String</span>, <span class="kt">Directory</span><span class="o">]]</span> <span class="k">=</span>
<span class="n">p</span> <span class="k">match</span> <span class="o">{</span>
<span class="k">case</span> <span class="nc">List</span><span class="o">()</span> <span class="k">=></span> <span class="nc">Some</span><span class="o">(</span><span class="nc">Right</span><span class="o">(</span><span class="n">dir</span><span class="o">))</span>
<span class="k">case</span> <span class="n">k</span> <span class="o">+:</span> <span class="n">ks</span> <span class="k">=></span>
<span class="n">dir</span><span class="o">.</span><span class="n">listing</span> <span class="n">get</span> <span class="n">k</span> <span class="n">flatMap</span> <span class="o">{</span>
<span class="k">case</span> <span class="n">r</span><span class="nd">@Left</span><span class="o">(</span><span class="k">_</span><span class="o">)</span> <span class="k">=></span>
<span class="k">if</span> <span class="o">(</span><span class="n">ks</span><span class="o">.</span><span class="n">isEmpty</span><span class="o">)</span> <span class="nc">Some</span><span class="o">(</span><span class="n">r</span><span class="o">)</span> <span class="k">else</span> <span class="nc">None</span>
<span class="k">case</span> <span class="nc">Right</span><span class="o">(</span><span class="n">subd</span><span class="o">)</span> <span class="k">=></span> <span class="n">readLocation</span><span class="o">(</span><span class="n">subd</span><span class="o">,</span> <span class="n">ks</span><span class="o">)</span>
<span class="o">}</span>
<span class="o">}</span>
<span class="k">def</span> <span class="n">listDirectory</span><span class="o">(</span><span class="n">p</span><span class="k">:</span> <span class="kt">String</span><span class="o">)</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">List</span><span class="o">[</span><span class="kt">String</span><span class="o">]]</span> <span class="k">=</span>
<span class="n">dir</span> <span class="k">=></span> <span class="o">(</span><span class="n">readLocation</span><span class="o">(</span><span class="n">dir</span><span class="o">,</span> <span class="n">splitPath</span><span class="o">(</span><span class="n">p</span><span class="o">))</span> <span class="k">match</span> <span class="o">{</span>
<span class="k">case</span> <span class="nc">None</span> <span class="k">=></span> <span class="nc">Left</span><span class="o">(</span><span class="nc">Err</span><span class="o">(</span><span class="s">s"No such file or directory </span><span class="si">$p</span><span class="s">"</span><span class="o">))</span>
<span class="k">case</span> <span class="nc">Some</span><span class="o">(</span><span class="nc">Left</span><span class="o">(</span><span class="k">_</span><span class="o">))</span> <span class="k">=></span>
<span class="nc">Left</span><span class="o">(</span><span class="nc">Err</span><span class="o">(</span><span class="s">s"</span><span class="si">$p</span><span class="s"> is not a directory"</span><span class="o">))</span>
<span class="k">case</span> <span class="nc">Some</span><span class="o">(</span><span class="nc">Right</span><span class="o">(</span><span class="nc">Directory</span><span class="o">(</span><span class="n">m</span><span class="o">)))</span> <span class="k">=></span> <span class="nc">Right</span><span class="o">(</span><span class="n">m</span><span class="o">.</span><span class="n">keys</span><span class="o">.</span><span class="n">toList</span><span class="o">)</span>
<span class="o">},</span> <span class="n">dir</span><span class="o">)</span>
<span class="k">def</span> <span class="n">readFile</span><span class="o">(</span><span class="n">pathname</span><span class="k">:</span> <span class="kt">String</span><span class="o">)</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">String</span><span class="o">]</span> <span class="k">=</span>
<span class="n">dir</span> <span class="k">=></span> <span class="o">(</span><span class="n">readLocation</span><span class="o">(</span><span class="n">dir</span><span class="o">,</span> <span class="n">splitPath</span><span class="o">(</span><span class="n">pathname</span><span class="o">))</span> <span class="k">match</span> <span class="o">{</span>
<span class="k">case</span> <span class="nc">None</span> <span class="k">=></span> <span class="nc">Left</span><span class="o">(</span><span class="nc">Err</span><span class="o">(</span><span class="s">s"No such file or directory </span><span class="si">$pathname</span><span class="s">"</span><span class="o">))</span>
<span class="k">case</span> <span class="nc">Some</span><span class="o">(</span><span class="nc">Right</span><span class="o">(</span><span class="k">_</span><span class="o">))</span> <span class="k">=></span>
<span class="nc">Left</span><span class="o">(</span><span class="nc">Err</span><span class="o">(</span><span class="s">s"</span><span class="si">$pathname</span><span class="s"> is a directory"</span><span class="o">))</span>
<span class="k">case</span> <span class="nc">Some</span><span class="o">(</span><span class="nc">Left</span><span class="o">(</span><span class="n">c</span><span class="o">))</span> <span class="k">=></span> <span class="nc">Right</span><span class="o">(</span><span class="n">c</span><span class="o">)</span>
<span class="o">},</span> <span class="n">dir</span><span class="o">)</span>
<span class="k">def</span> <span class="n">writeFile</span><span class="o">(</span><span class="n">pathname</span><span class="k">:</span> <span class="kt">String</span><span class="o">,</span> <span class="n">contents</span><span class="k">:</span> <span class="kt">String</span><span class="o">)</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">Unit</span><span class="o">]</span> <span class="k">=</span>
<span class="n">dir</span> <span class="k">=></span> <span class="o">{</span>
<span class="k">def</span> <span class="n">rec</span><span class="o">(</span><span class="n">subdir</span><span class="k">:</span> <span class="kt">Directory</span><span class="o">,</span> <span class="n">path</span><span class="k">:</span> <span class="kt">List</span><span class="o">[</span><span class="kt">String</span><span class="o">])</span><span class="k">:</span> <span class="kt">Either</span><span class="o">[</span><span class="kt">Err</span>, <span class="kt">Directory</span><span class="o">]</span> <span class="k">=</span>
<span class="n">path</span> <span class="k">match</span> <span class="o">{</span>
<span class="k">case</span> <span class="nc">List</span><span class="o">(</span><span class="n">filename</span><span class="o">)</span> <span class="k">=></span>
<span class="nc">Right</span><span class="o">(</span><span class="nc">Directory</span><span class="o">(</span><span class="n">subdir</span><span class="o">.</span><span class="n">listing</span> <span class="o">+</span> <span class="o">((</span><span class="n">filename</span><span class="o">,</span> <span class="nc">Left</span><span class="o">(</span><span class="n">contents</span><span class="o">)))))</span>
<span class="k">case</span> <span class="n">dirname</span> <span class="o">+:</span> <span class="n">subpath</span> <span class="k">=></span>
<span class="k">val</span> <span class="n">subsubdir</span> <span class="k">=</span> <span class="n">subdir</span><span class="o">.</span><span class="n">listing</span> <span class="n">get</span> <span class="n">dirname</span>
<span class="n">subsubdir</span> <span class="k">match</span> <span class="o">{</span>
<span class="k">case</span> <span class="nc">Some</span><span class="o">(</span><span class="nc">Left</span><span class="o">(</span><span class="k">_</span><span class="o">))</span> <span class="k">=></span>
<span class="nc">Left</span><span class="o">(</span><span class="nc">Err</span><span class="o">(</span><span class="s">s"</span><span class="si">$dirname</span><span class="s"> is not a directory"</span><span class="o">))</span>
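<span class="k">case</span> <span class="nc">Some</span><span class="o">(</span><span class="nc">Right</span><span class="o">(</span><span class="n">d</span><span class="o">))</span> <span class="k">=></span>
<span class="c1">// re-attach the updated subdirectory to its parent</span>
<span class="n">rec</span><span class="o">(</span><span class="n">d</span><span class="o">,</span> <span class="n">subpath</span><span class="o">).</span><span class="n">map</span><span class="o">(</span><span class="n">nd</span> <span class="k">=></span> <span class="nc">Directory</span><span class="o">(</span><span class="n">subdir</span><span class="o">.</span><span class="n">listing</span> <span class="o">+</span> <span class="o">((</span><span class="n">dirname</span><span class="o">,</span> <span class="nc">Right</span><span class="o">(</span><span class="n">nd</span><span class="o">)))))</span>
<span class="k">case</span> <span class="nc">None</span> <span class="k">=></span>
<span class="n">rec</span><span class="o">(</span><span class="nc">Directory</span><span class="o">(</span><span class="nc">Map</span><span class="o">()),</span> <span class="n">subpath</span><span class="o">).</span><span class="n">map</span><span class="o">(</span><span class="n">nd</span> <span class="k">=></span> <span class="nc">Directory</span><span class="o">(</span><span class="n">subdir</span><span class="o">.</span><span class="n">listing</span> <span class="o">+</span> <span class="o">((</span><span class="n">dirname</span><span class="o">,</span> <span class="nc">Right</span><span class="o">(</span><span class="n">nd</span><span class="o">)))))</span>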
<span class="o">}</span>
<span class="o">}</span>
<span class="n">rec</span><span class="o">(</span><span class="n">dir</span><span class="o">,</span> <span class="n">splitPath</span><span class="o">(</span><span class="n">pathname</span><span class="o">))</span> <span class="k">match</span> <span class="o">{</span>
<span class="k">case</span> <span class="nc">Left</span><span class="o">(</span><span class="n">e</span><span class="o">)</span> <span class="k">=></span> <span class="o">(</span><span class="nc">Left</span><span class="o">(</span><span class="n">e</span><span class="o">),</span> <span class="n">dir</span><span class="o">)</span>
<span class="k">case</span> <span class="nc">Right</span><span class="o">(</span><span class="n">newdir</span><span class="o">)</span> <span class="k">=></span> <span class="o">(</span><span class="nc">Right</span><span class="o">(()),</span> <span class="n">newdir</span><span class="o">)</span>
<span class="o">}</span>
<span class="o">}</span>
<span class="o">}</span>
</pre></div>
<h2 id="markdown-header-executing-the-effects">Executing the effects</h2>
<p>The implementations of effectful programs in this scheme can’t tell what <code>F</code> is. But the code that chooses the interpreter and passes it to that abstract program <em>does</em> know. Accordingly, the type returned by invoking the program will change according to the <code>F</code> type of the interpreter you pass in.</p>
<div class="codehilite"><pre><span></span><span class="n">scala</span><span class="o">></span> <span class="n">write42</span><span class="o">(</span><span class="nc">IOFulFSAlg</span><span class="o">)(</span><span class="s">"hello.txt"</span><span class="o">)</span>
<span class="n">res1</span><span class="k">:</span> <span class="kt">IOFulFSAlg.F</span><span class="o">[</span><span class="kt">Unit</span><span class="o">]</span> <span class="k">=</span> <span class="o"><</span><span class="n">function0</span><span class="o">></span>
<span class="n">scala</span><span class="o">></span> <span class="n">res1</span><span class="o">()</span>
<span class="c1">// hello.txt appears on my disk. Guess what's in it?</span>
<span class="n">scala</span><span class="o">></span> <span class="n">write42</span><span class="o">(</span><span class="nc">TestMapAlg</span><span class="o">)(</span><span class="s">"hello.txt"</span><span class="o">)</span>
<span class="n">res2</span><span class="k">:</span> <span class="kt">TestMapAlg.F</span><span class="o">[</span><span class="kt">Unit</span><span class="o">]</span> <span class="k">=</span> <span class="o"><</span><span class="n">function1</span><span class="o">></span>
<span class="n">scala</span><span class="o">></span> <span class="n">res2</span><span class="o">(</span><span class="nc">Directory</span><span class="o">(</span><span class="nc">Map</span><span class="o">()))</span>
<span class="n">res4</span><span class="k">:</span> <span class="o">(</span><span class="kt">Either</span><span class="o">[</span><span class="kt">Err</span>,<span class="kt">Unit</span><span class="o">],</span> <span class="nc">Directory</span><span class="o">)</span> <span class="k">=</span>
<span class="o">(</span><span class="nc">Right</span><span class="o">(()),</span><span class="nc">Directory</span><span class="o">(</span><span class="nc">Map</span><span class="o">(</span><span class="n">hello</span><span class="o">.</span><span class="n">txt</span> <span class="o">-></span> <span class="nc">Left</span><span class="o">(</span><span class="mi">42</span><span class="o">))))</span>
</pre></div>
<p>As the invoker of the interpreter, this very concrete level—the first truly concrete segments of code I’ve shown in this post—is responsible for supplying the “execution environment”. It’s <em>here</em> that side effects—if any!—happen. For the first example, we can invoke the zero-argument function and watch the side effects happen. In the second case, we can make up a test <code>Directory</code> and inspect the resulting tuple for the final <code>Directory</code> state, and error if any.</p>
<p>Otherwise, the usual rules of the interpreter pattern apply; by inventing new instances of <code>FSAlg</code>, we can choose different things that should happen in the effects of the various algebra methods.</p>
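<p>As one illustration of inventing another instance, the sketch below chooses an <code>F</code> that ignores its type argument entirely, so that each “effect” is only a description of what was requested. <code>DescribeAlg</code> and its message formats are hypothetical; <code>FSAlg</code> and <code>write42</code> are repeated so the block stands alone.</p>
<div class="codehilite"><pre><span></span>object DescribeSketch {
  trait FSAlg {
    type F[A]
    def listDirectory(pathname: String): F[List[String]]
    def readFile(pathname: String): F[String]
    def writeFile(pathname: String, contents: String): F[Unit]
  }

  def write42(alg: FSAlg)(pathname: String): alg.F[Unit] =
    alg.writeFile(pathname, "42")

  // Hypothetical interpreter: F[A] is a log of requests, whatever A is.
  object DescribeAlg extends FSAlg {
    type F[A] = Vector[String]
    def listDirectory(pathname: String): Vector[String] =
      Vector(s"ls $pathname")
    def readFile(pathname: String): Vector[String] =
      Vector(s"read $pathname")
    def writeFile(pathname: String, contents: String): Vector[String] =
      Vector(s"write $pathname (${contents.length} chars)")
  }

  // The same abstract program, now yielding a description instead of a thunk.
  val described: Vector[String] = write42(DescribeAlg)("hello.txt")
}
</pre></div>
<p>An <code>F</code> like this cannot hand a file’s contents back to the program, so it suits inspection or dry runs rather than real execution.</p>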
<h2 id="markdown-header-effects-must-be-delayed">Effects must be delayed</h2>
<p>You may be tempted to use an <code>F</code> like this:</p>
<div class="codehilite"><pre><span></span><span class="k">type</span> <span class="kt">F</span><span class="o">[</span><span class="kt">A</span><span class="o">]</span> <span class="k">=</span> <span class="n">A</span>
</pre></div>
<p>and then do something like the <code>IOFul</code> implementation without the leading <code>() =></code>. This will seem to work, but it effectively prevents effectful programs from doing functional programming.</p>
<p>We can see why via a simple counterexample. Consider this simple program,</p>
<div class="codehilite"><pre><span></span><span class="n">readFile</span><span class="o">(</span><span class="s">"hello.txt"</span><span class="o">)</span>
<span class="c1">// ...</span>
<span class="n">writeFile</span><span class="o">(</span><span class="s">"hello.txt"</span><span class="o">,</span> <span class="s">"33"</span><span class="o">)</span>
</pre></div>
<p>According to the rules of FP, the below program must always do the same thing as the above program.</p>
<div class="codehilite"><pre><span></span><span class="k">val</span> <span class="n">wf</span> <span class="k">=</span> <span class="n">writeFile</span><span class="o">(</span><span class="s">"hello.txt"</span><span class="o">,</span> <span class="s">"33"</span><span class="o">)</span>
<span class="n">readFile</span><span class="o">(</span><span class="s">"hello.txt"</span><span class="o">)</span>
<span class="c1">// ...</span>
<span class="n">wf</span>
</pre></div>
<p>If calling <code>writeFile</code> performs a side effect right away, this will not hold true.</p>
<p>For a similar reason, naive memoization of the side effects’ results will also break FP. Consider this program:</p>
<div class="codehilite"><pre><span></span><span class="n">readFile</span><span class="o">(</span><span class="s">"hello.txt"</span><span class="o">)</span>
<span class="c1">// ...</span>
<span class="n">writeFile</span><span class="o">(</span><span class="s">"hello.txt"</span><span class="o">,</span> <span class="s">"33"</span><span class="o">)</span>
<span class="c1">// ...</span>
<span class="n">readFile</span><span class="o">(</span><span class="s">"hello.txt"</span><span class="o">)</span>
</pre></div>
<p>In FP, I can factor these two <code>readFile</code> calls to a <code>val</code>. If <code>readFile</code> memoizes with a local variable, though, the second use of that <code>val</code> gives the wrong file contents. (Of course, if you don’t have any effects in your algebra that can change the results of later effects, this is no problem!)</p>
<p>Otherwise, you don’t have to be wildly principled about the purity of your interpreters’ <code>F</code> choices, while still granting the benefits of pure FP to your effectful programs. Ermine writers’ interpreters use <a href="https://bitbucket.org/ermine-language/ermine-scala/src/e201cfb32a4b636905e54f33965fd3074e66d1b4/core/src/main/scala/com/clarifi/reporting/backends.scala?at=default&fileviewer=file-view-default#backends.scala-6">something like</a></p>
<div class="codehilite"><pre><span></span><span class="k">type</span> <span class="kt">F</span><span class="o">[</span><span class="kt">A</span><span class="o">]</span> <span class="k">=</span> <span class="n">java</span><span class="o">.</span><span class="n">sql</span><span class="o">.</span><span class="nc">Connection</span> <span class="k">=></span> <span class="n">A</span>
</pre></div>
<p>and it’s perfectly fine.</p>
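<p>To make the two failure modes in this section concrete, here is a minimal sketch. It assumes a hypothetical in-memory <code>Disk</code> in place of a real filesystem, and the <code>Delayed</code> and <code>Eager</code> classes are illustrative stand-ins, not the <code>FSAlg</code> interpreters from earlier in this post.</p>

```scala
import scala.collection.mutable

object DelayedEffectsSketch {
  // Hypothetical stand-in for a filesystem (an assumption, not the article's FSAlg).
  final class Disk {
    val files: mutable.Map[String, String] = mutable.Map("hello.txt" -> "42")
  }

  // Delayed: calling a method only describes the effect; applying the thunk runs it.
  final class Delayed(disk: Disk) {
    type F[A] = () => A
    def readFile(name: String): F[String] = () => disk.files(name)
    def writeFile(name: String, contents: String): F[Unit] =
      () => disk.files.update(name, contents)
  }

  // Eager: with F[A] = A, calling a method performs the effect immediately.
  final class Eager(disk: Disk) {
    type F[A] = A
    def readFile(name: String): F[String] = disk.files(name)
    def writeFile(name: String, contents: String): F[Unit] =
      disk.files.update(name, contents)
  }

  // The original program: read, then write. The read sees "42".
  def delayedInline(): String = {
    val alg = new Delayed(new Disk)
    val seen = alg.readFile("hello.txt")()
    alg.writeFile("hello.txt", "33")()
    seen
  }

  // writeFile factored out to a val: nothing runs early, so the read still sees "42".
  def delayedFactored(): String = {
    val alg = new Delayed(new Disk)
    val wf = alg.writeFile("hello.txt", "33") // nothing happens yet
    val seen = alg.readFile("hello.txt")()
    wf()                                      // the write happens here
    seen
  }

  // The same refactoring with eager effects: the write runs at the val.
  def eagerFactored(): String = {
    val alg = new Eager(new Disk)
    val wf: Unit = alg.writeFile("hello.txt", "33") // the write already happened
    val seen = alg.readFile("hello.txt")            // sees "33"
    val done: Unit = wf                             // "running" wf does nothing
    seen
  }

  // Naive memoization: the factored-out read caches its first result.
  def memoizedFactored(): (String, String) = {
    val disk = new Disk
    val alg = new Delayed(disk)
    def memoReadFile(name: String): () => String = {
      lazy val cached = disk.files(name) // read at most once per returned thunk
      () => cached
    }
    val rf = memoReadFile("hello.txt")
    val first = rf()
    alg.writeFile("hello.txt", "33")()
    val second = rf() // stale: the file now holds "33"
    (first, second)
  }
}
```

<p>Only the delayed version survives the refactoring: <code>delayedInline</code> and <code>delayedFactored</code> agree, while <code>eagerFactored</code> reads the new contents too early and <code>memoizedFactored</code> reads the old contents too late.</p>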
<h2 id="markdown-header-relaxed-rules-in-the-interpreter-from-abstraction-in-the-program">Relaxed rules in the interpreter from abstraction in the program</h2>
<p>“Local mutation” is a broadly accepted way to implement pure functions. Even Haskell supports it, still without breaking the rules of the language, via the <code>ST</code> abstraction (covered in chapter 14 of <em>Functional Programming in Scala</em>). This is usually taken to refer strictly to mutation local to a certain dynamic scope, though; <a href="https://ocharles.org.uk/blog/guest-posts/2014-12-18-rank-n-types.html#how-runst-works">the universal quantification trick in <code>ST</code></a> is precisely meant to enforce the dynamic scope of mutable variables.</p>
<p>The Ermine Writers show us a different story, though. When using a concrete type for hiding side effects, like <code>IO</code>, you must be very careful to hide the runner, lest the side-effect be exposed.</p>
<p>Type-level abstraction such as in tagless-final changes both of these parts of the typical ‘local mutation’ method of design.</p>
<ol>
<li>Instead of being dynamically scoped, control over side effects is statically, or lexically scoped, to the code in the interpreter. This lends a new dimension to the idea of a side-effecting “shell” for a pure program—the “shell” is syntactic, not based on the patterns of function calls at runtime.</li>
<li>The only care required to not break a type-level abstraction is to not pass something that would break it in the algebra, and to follow the <a href="https://imgur.com/a04WoHn">Scalazzi Safe Scala Subset</a> and avoid use of reified type information. (This will be discussed in more detail in part 3, <a href="/2016/12/part-3-working-with-abstract-f.html">“Working with the abstract <code>F</code>”</a>.)</li>
</ol>
<p>The degree to which you can “break the rules” in your interpreter is directly proportional to the degree to which you enforce abstraction in effectful programs. Following the approach of examples in this article, there remains a great deal of freedom to experiment with interpreters that exploit your favorite mutation techniques to speed up interpretation. For example, a less naive memoization of <code>readFile</code> and <code>listDirectory</code> is admissible under the current algebra.</p>
<p>By contrast, if you expose too much detail about the <code>F</code> functor to effectful programs, then your interpreter becomes severely constrained. Suppose you define in the abstract algebra</p>
<div class="codehilite"><pre><span></span><span class="k">def</span> <span class="n">asReader</span><span class="o">[</span><span class="kt">A</span><span class="o">](</span><span class="n">fa</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">A</span><span class="o">])</span><span class="k">:</span> <span class="o">()</span> <span class="o">=></span> <span class="n">A</span>
</pre></div>
<p>This may be expedient, but it effectively demands that every interpreter work like <code>IOFul</code>; others cannot be safely implemented.</p>
<h2 id="markdown-header-still-to-come">Still to come</h2>
<ol>
<li><a href="/2016/12/part-2-role-of-monad.html">The role of Monad</a>;</li>
<li><a href="/2016/12/part-3-working-with-abstract-f.html">Working with the abstract <code>F</code></a>;</li>
<li>How much is this “dependency injection”?</li>
</ol>
<p>Also, Adelbert Chang is covering <a href="http://typelevel.org/blog/2016/09/21/edsls-part-1.html">“Monadic EDSLs in Scala”</a> in a series over on the Typelevel blog; he’s taking a different route to many of the same ideas as this series. I suggest checking out both his series and this one to find the most comfortable route for <em>you</em>.</p>
<p><em>This article was tested with Scala 2.12.0.</em></p>
<p>Stephen Compall</p>
<h1>The missing diamond of Scala variance</h1>
<p>Published 2016-09-16; updated 2016-10-31.</p>
<p><em>This article is an expanded version of my LambdaConf 2016 talk of the
same name
(<a href="https://www.youtube.com/watch?v=h4LzUkYQGyE">video</a>, <a href="https://github.com/lambdaconf/lambdaconf-2016-usa/raw/master/Missing%20diamond%20of%20Scala%20variance/missing-diamond.pdf">slides</a>). The
below covers every topic from that talk, and much more on possible
extensions to variance polymorphism, but the talk is a gentler
introduction to the main concepts here.</em></p>
<p>As part of its subtyping system, Scala features <em>variance</em>. Using
variance lets you lift the subtyping relation into type constructors,
just like type equality (<code>~</code>) already does.</p>
<div class="codehilite"><pre><span></span><span class="k">type</span> <span class="kt">Endo</span><span class="o">[</span><span class="kt">A</span><span class="o">]</span> <span class="k">=</span> <span class="n">A</span> <span class="k">=></span> <span class="n">A</span> <span class="c1">// invariant</span>
<span class="k">type</span> <span class="kt">Get</span><span class="o">[</span><span class="kt">+A</span><span class="o">]</span> <span class="k">=</span> <span class="nc">Foo</span> <span class="k">=></span> <span class="n">A</span> <span class="c1">// covariant</span>
<span class="k">type</span> <span class="kt">Put</span><span class="o">[</span><span class="kt">-A</span><span class="o">]</span> <span class="k">=</span> <span class="n">A</span> <span class="k">=></span> <span class="nc">Foo</span> <span class="c1">// contravariant</span>
<span class="n">X</span> <span class="o">~</span> <span class="n">Y</span> <span class="o">→</span> <span class="nc">Endo</span><span class="o">[</span><span class="kt">X</span><span class="o">]</span> <span class="o">~</span> <span class="nc">Endo</span><span class="o">[</span><span class="kt">Y</span><span class="o">]</span> <span class="c1">// invariant</span>
<span class="n">X</span> <span class="k"><:</span> <span class="n">Y</span> <span class="o">→</span> <span class="nc">Get</span><span class="o">[</span><span class="kt">X</span><span class="o">]</span> <span class="k"><:</span> <span class="nc">Get</span><span class="o">[</span><span class="kt">Y</span><span class="o">]</span> <span class="c1">// covariant</span>
<span class="n">X</span> <span class="k"><:</span> <span class="n">Y</span> <span class="o">→</span> <span class="nc">Put</span><span class="o">[</span><span class="kt">Y</span><span class="o">]</span> <span class="k"><:</span> <span class="nc">Put</span><span class="o">[</span><span class="kt">X</span><span class="o">]</span> <span class="c1">// contravariant</span>
<span class="o">↑</span><span class="n">reversed</span><span class="o">!↑</span>
</pre></div>
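<p>The lifting rules can be checked by the compiler. Below is a minimal sketch, assuming hypothetical <code>Animal</code>, <code>Cat</code>, and <code>Foo</code> classes (the listing above leaves <code>Foo</code> undefined); each <code>implicitly</code> line compiles only if the stated subtyping relation holds.</p>

```scala
object VarianceLifting {
  // Hypothetical classes, introduced only for this sketch.
  class Animal
  class Cat extends Animal
  class Foo

  type Endo[A] = A => A   // invariant
  type Get[+A] = Foo => A // covariant
  type Put[-A] = A => Foo // contravariant

  // Cat <: Animal lifts through Get in the same direction...
  implicitly[Get[Cat] <:< Get[Animal]]
  // ...and through Put in the reversed direction.
  implicitly[Put[Animal] <:< Put[Cat]]
  // Endo lifts neither way; uncommenting this line is a compile error:
  // implicitly[Endo[Cat] <:< Endo[Animal]]

  // The same facts at the value level: plain assignment is enough.
  val catGetter: Get[Cat] = _ => new Cat
  val animalGetter: Get[Animal] = catGetter
  val animalSink: Put[Animal] = _ => new Foo
  val catSink: Put[Cat] = animalSink
}
```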
<h2 id="markdown-header-subtyping-is-incomplete-without-variance">Subtyping is incomplete without variance</h2>
<p>With simple type equality, you have four properties:</p>
<ol>
<li>Reflexivity: <em>A</em> ~ <em>A</em></li>
<li>Symmetry: <em>A</em> ~ <em>B</em> → <em>B</em> ~ <em>A</em></li>
<li>Transitivity: <em>A</em> ~ <em>B</em> ∧ <em>B</em> ~ <em>C</em> → <em>A</em> ~ <em>C</em></li>
<li>Congruence: <em>A</em> ~ <em>B</em> → <em>F</em>[<em>A</em>] ~ <em>F</em>[<em>B</em>]</li>
</ol>
<p>Just try to use GADTs without equality congruence! That’s what’s
expected in a subtyping system without variance.</p>
<ol>
<li>Reflexivity: <em>A</em> <: <em>A</em></li>
<li>Antisymmetry: <em>A</em> <: <em>B</em> ∧ <em>B</em> <: <em>A</em> → <em>A</em> = <em>B</em></li>
<li>Transitivity: <em>A</em> <: <em>B</em> ∧ <em>B</em> <: <em>C</em> → <em>A</em> <: <em>C</em></li>
<li>Congruence: <em>A</em> <: <em>B</em> → <code>Put</code>[<em>B</em>] <: <code>Put</code>[<em>A</em>]</li>
</ol>
<h3 id="markdown-header-completing-subtyping-variables">Completing subtyping: variables</h3>
<div class="codehilite"><pre><span></span><span class="k">val</span> <span class="n">aCat</span> <span class="k">=</span> <span class="nc">Cat</span><span class="o">(</span><span class="s">"Audrey"</span><span class="o">)</span>
<span class="k">val</span> <span class="n">anAnimal</span><span class="k">:</span> <span class="kt">Animal</span> <span class="o">=</span> <span class="n">aCat</span>
</pre></div>
<p>A bare type is in a covariant position. You can’t abstract over
something as simple as a value box without variance.</p>
<h3 id="markdown-header-completing-subtyping-the-harmony-of-a-function-call">Completing subtyping: the harmony of a function call</h3>
<div class="codehilite"><pre><span></span><span class="k">def</span> <span class="n">speak</span><span class="o">(</span><span class="n">a</span><span class="k">:</span> <span class="kt">Animal</span><span class="o">)</span><span class="k">:</span> <span class="kt">IO</span><span class="o">[</span><span class="kt">Unit</span><span class="o">]</span>
<span class="n">speak</span><span class="o">(</span><span class="n">aCat</span><span class="o">)</span>
</pre></div>
<p><img alt="" src="https://canvas-files-prod.s3.amazonaws.com/uploads/69d1a9b2-69c5-45ce-8add-c95a34678db6/harmony.svg"></p>
<p>This is how functions and their arguments form a more perfect
union. One way to think of the mechanics here is that <code>Cat</code> upcasts to
<code>Animal</code>. But there’s no way to tell that what is really happening
isn’t that the <em>function</em> is upcasting from <code>Animal => IO[Unit]</code> to
<code>Cat => IO[Unit]</code>! Or that they aren’t meeting in the middle
somewhere; maybe <code>Cat</code> upcasts to <code>Mammal</code>, and <code>Animal => IO[Unit]</code>
upcasts to <code>Mammal => IO[Unit]</code>.</p>
<p>So I don’t think there’s really subtyping without variance; there’s
just failing to explicitly model variance that is there anyway. Since
you cannot have subtyping without variance, if variance is too
complicated, so is subtyping.</p>
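<p>All three readings of the call typecheck in Scala, which is the point: nothing in the call itself says which one is “really” happening. This is a minimal sketch with a hypothetical class hierarchy, and with <code>speak</code> returning a <code>String</code> instead of <code>IO[Unit]</code> so that it is self-contained.</p>

```scala
object CallHarmony {
  // Hypothetical hierarchy for this sketch.
  class Animal { def sound: String = "..." }
  class Mammal extends Animal
  class Cat(val name: String) extends Mammal { override def sound: String = "meow" }

  // Assumption: String stands in for the article's IO[Unit].
  def speak(a: Animal): String = a.sound
  val speakFn: Animal => String = a => speak(a)

  val aCat = new Cat("Audrey")

  // Reading 1: the argument upcasts from Cat to Animal.
  val viaArgument: String = speakFn(aCat: Animal)

  // Reading 2: the function upcasts from Animal => String to Cat => String.
  val catSpeaker: Cat => String = speakFn
  val viaFunction: String = catSpeaker(aCat)

  // Reading 3: both meet in the middle, at Mammal.
  val mammalSpeaker: Mammal => String = speakFn
  val viaMiddle: String = mammalSpeaker(aCat: Mammal)
}
```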
<h2 id="markdown-header-what-else-is-there-what-else-is-needed">What else is there? What else is needed?</h2>
<p>There is one advanced feature whose absence fans of higher-kinded
types would have found a deal-breaker in Scala, even if they are
unaware of its existence.</p>
<div class="codehilite"><pre><span></span><span class="k">def</span> <span class="n">same</span><span class="o">[</span><span class="kt">A</span>, <span class="kt">F</span><span class="o">[</span><span class="k">_</span><span class="o">]](</span><span class="n">fa</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">A</span><span class="o">])</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">A</span><span class="o">]</span> <span class="k">=</span> <span class="n">fa</span>
<span class="k">def</span> <span class="n">widen</span><span class="o">[</span><span class="kt">A</span>, <span class="kt">B</span> <span class="k">>:</span> <span class="kt">A</span>, <span class="kt">F</span><span class="o">[</span><span class="kt">+</span><span class="k">_</span><span class="o">]](</span><span class="n">fa</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">A</span><span class="o">])</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">B</span><span class="o">]</span> <span class="k">=</span> <span class="n">fa</span>
</pre></div>
<p>All of <code>Endo</code>, <code>Get</code>, and <code>Put</code> can be passed as the <code>F</code> type
parameter to <code>same</code>. However, only <code>Get</code> can be passed as the <code>F</code> type
parameter to <code>widen</code>. This works because you can only “use” variance
if you know you have it, but don’t need to make any assumptions about
your type constructors’ variances if you don’t “use” it.</p>
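<p>Here is a sketch of those calls written out, assuming hypothetical <code>Animal</code>, <code>Cat</code>, and <code>Foo</code> classes and targeting Scala 2.12 or 2.13. The commented-out lines are the ones the compiler should reject.</p>

```scala
object SameAndWiden {
  // Hypothetical classes for this sketch.
  class Animal
  class Cat extends Animal
  class Foo

  type Endo[A] = A => A
  type Get[+A] = Foo => A
  type Put[-A] = A => Foo

  def same[A, F[_]](fa: F[A]): F[A] = fa
  def widen[A, B >: A, F[+_]](fa: F[A]): F[B] = fa

  val endo: Endo[Cat] = c => c
  val get: Get[Cat] = _ => new Cat
  val put: Put[Cat] = _ => new Foo

  // same makes no variance assumption, so all three are accepted.
  val sameEndo: Endo[Cat] = same[Cat, Endo](endo)
  val sameGet: Get[Cat] = same[Cat, Get](get)
  val samePut: Put[Cat] = same[Cat, Put](put)

  // widen "uses" covariance, so only Get qualifies.
  val widened: Get[Animal] = widen[Cat, Animal, Get](get)
  // widen[Cat, Animal, Endo](endo) // rejected: Endo is invariant
  // widen[Cat, Animal, Put](put)   // rejected: Put is contravariant
}
```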
<h2 id="markdown-header-variance-exhibits-a-subkinding-relation">Variance exhibits a subkinding relation</h2>
<p><code>same</code> takes an invariant type constructor <code>F</code>, but you are free to
pass covariant and contravariant type constructors to it. That’s
because there is a subtype relationship between these at the type
level, or a <em>subkind</em> relationship.</p>
<p><img alt="" src="https://canvas-files-prod.s3.amazonaws.com/uploads/c26e827b-e2d8-4970-978c-dc1644c83e94/half-diamond-nomargin.svg"></p>
<p>Invariance is the ‘top’ variance, the most abstract, and co- and
contravariance are its <em>subvariances</em>. When you pass a covariant or
contravariant type constructor as <code>F</code> to <code>same</code>, its variance
“widens”.</p>
<p>Because <strong>variance is part of the “type of type constructor”</strong>, not
the specific parameter where the variance annotation appears, it’s an
element of the kind of that type constructor, just as arity is. For
example, when we talk about the kind of <code>Get</code>, we don’t say that <code>A</code>
has a covariant kind, because this variance has nothing to do with
<code>A</code>. Instead, we say that <code>Get</code> has kind <code>+* -> *</code>, because this
particular variance annotation is all about the behavior of types
referring to <code>Get</code>. Moreover, subvariance is just a restricted flavor
of subkind.</p>
<h3 id="markdown-header-flipping-variances">Flipping variances</h3>
<p>I started to find it odd that this order of subclassing was enforced
as a result of the subvariance relation.</p>
<div class="codehilite"><pre><span></span><span class="n">mutable</span><span class="o">.</span><span class="nc">Seq</span><span class="o">[</span><span class="kt">A</span><span class="o">]</span> <span class="k">extends</span> <span class="nc">Seq</span><span class="o">[</span><span class="kt">+A</span><span class="o">]</span>
<span class="nc">CovCoy</span><span class="o">[</span><span class="kt">F</span><span class="o">[</span><span class="kt">+</span><span class="k">_</span><span class="o">]]</span> <span class="k">extends</span> <span class="nc">Coy</span><span class="o">[</span><span class="kt">F</span><span class="o">[</span><span class="k">_</span><span class="o">]]</span>
</pre></div>
<p>It makes perfect sense empirically, though; you can easily derive
<code>unsafeCoerce</code> if you assume this exact ordering isn’t enforced.</p>
<p>Then I found, while working on monad transformers, that <em>this</em> is the
only way that makes sense, too.</p>
<div class="codehilite"><pre><span></span><span class="nc">InvMT</span><span class="o">[</span><span class="kt">T</span><span class="o">[</span><span class="k">_</span><span class="o">[</span><span class="k">_</span><span class="o">]]]</span> <span class="k">extends</span> <span class="nc">CovMT</span><span class="o">[</span><span class="kt">T</span><span class="o">[</span><span class="k">_</span><span class="o">[</span><span class="kt">+</span><span class="k">_</span><span class="o">]]]</span>
<span class="nc">CovMTT</span><span class="o">[</span><span class="kt">W</span><span class="o">[</span><span class="k">_</span><span class="o">[</span><span class="k">_</span><span class="o">[</span><span class="kt">+</span><span class="k">_</span><span class="o">]]]]</span> <span class="k">extends</span> <span class="nc">InvMTT</span><span class="o">[</span><span class="kt">W</span><span class="o">[</span><span class="k">_</span><span class="o">[</span><span class="k">_</span><span class="o">[</span><span class="k">_</span><span class="o">]]]]</span>
</pre></div>
<p>I had seen this before.</p>
<h3 id="markdown-header-type-parameter-positions-are-variance-contravariant">Type parameter positions are variance-contravariant</h3>
<p>The kinds of type constructors (type-level functions) work like the
types of value-level functions. Both sorts of functions are
contravariant in the parameter position. So every extra layer of
nesting “flips” the variance, just like with functions. Below is the
version of this for value-level functions, akin to the examples above.</p>
<div class="codehilite"><pre><span></span><span class="k">type</span> <span class="kt">One</span><span class="o">[</span><span class="kt">-A</span><span class="o">]</span> <span class="k">=</span> <span class="n">A</span> <span class="k">=></span> <span class="n">Z</span>
<span class="k">type</span> <span class="kt">Two</span><span class="o">[</span><span class="kt">+A</span><span class="o">]</span> <span class="k">=</span> <span class="o">(</span><span class="n">A</span> <span class="k">=></span> <span class="n">Z</span><span class="o">)</span> <span class="k">=></span> <span class="n">Z</span>
<span class="k">type</span> <span class="kt">Three</span><span class="o">[</span><span class="kt">-A</span><span class="o">]</span> <span class="k">=</span> <span class="o">((</span><span class="n">A</span> <span class="k">=></span> <span class="n">Z</span><span class="o">)</span> <span class="k">=></span> <span class="n">Z</span><span class="o">)</span> <span class="k">=></span> <span class="n">Z</span>
<span class="k">type</span> <span class="kt">Four</span><span class="o">[</span><span class="kt">+A</span><span class="o">]</span> <span class="k">=</span> <span class="o">(((</span><span class="n">A</span> <span class="k">=></span> <span class="n">Z</span><span class="o">)</span> <span class="k">=></span> <span class="n">Z</span><span class="o">)</span> <span class="k">=></span> <span class="n">Z</span><span class="o">)</span> <span class="k">=></span> <span class="n">Z</span>
</pre></div>
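<p>The flipping can be confirmed with a concrete <code>Z</code>. In this sketch <code>Z</code> is fixed to <code>String</code>, and <code>Animal</code> and <code>Cat</code> are hypothetical classes; both choices are assumptions made only so the aliases compile on their own.</p>

```scala
object FlipFlop {
  // Hypothetical classes and an arbitrary result type for this sketch.
  class Animal { def label: String = "animal" }
  class Cat extends Animal { override def label: String = "cat" }
  type Z = String

  type One[-A]   = A => Z
  type Two[+A]   = (A => Z) => Z
  type Three[-A] = ((A => Z) => Z) => Z
  type Four[+A]  = (((A => Z) => Z) => Z) => Z

  // Each extra layer of nesting reverses the direction again.
  implicitly[One[Animal]   <:< One[Cat]]
  implicitly[Two[Cat]      <:< Two[Animal]]
  implicitly[Three[Animal] <:< Three[Cat]]
  implicitly[Four[Cat]     <:< Four[Animal]]

  // Two[Cat] hands a Cat to its callback, so it can serve as a Two[Animal].
  val holdsCat: Two[Cat] = k => k(new Cat)
  val holdsAnimal: Two[Animal] = holdsCat
  val described: Z = holdsAnimal(a => a.label)
}
```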
<h3 id="markdown-header-a-bottom-variance-the-diamond-completed">A bottom variance: the diamond, completed</h3>
<p>Scala wisely included a bottom type, <code>Nothing</code>, to go with its top
type <code>Any</code>. That helps complete its subtyping system, but a bottom
variance was unfortunately left out of its subkinding system. Here’s
something that might work.</p>
<div class="codehilite"><pre><span></span><span class="k">type</span> <span class="kt">ConstI</span><span class="o">[</span><span class="err">👻</span><span class="kt">A</span><span class="o">]</span> <span class="k">=</span> <span class="nc">Int</span>
<span class="nc">ConstI</span><span class="o">[</span><span class="kt">A</span><span class="o">]</span> <span class="o">~</span> <span class="nc">ConstI</span><span class="o">[</span><span class="kt">B</span><span class="o">]</span> <span class="c1">// phantom, or 👻</span>
</pre></div>
<p>This is exactly what you’d get if you applied both the covariant and
contravariant rules to a type parameter. Therefore, <em>phantom variance</em>
or <em>anyvariance</em> is more specific than either, and as the ‘bottom’
variance, completes the diamond. It is the perfect choice, because it
is truly a <em>greatest lower bound</em>; it is both more specific than
either by itself and <em>no more</em> specific than both together.</p>
<p><img alt="" src="https://canvas-files-prod.s3.amazonaws.com/uploads/5f1244c6-a001-40e0-a526-8a0b17a08904/diamond-nomargin.svg"></p>
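<p>Scala has no annotation for this variance, but a transparent type alias that ignores its parameter already behaves this way, because the compiler expands the alias before comparing types. The sketch below relies on exactly that; it would not work for a class with an unused type parameter, which Scala treats as invariant unless annotated otherwise.</p>

```scala
object PhantomSketch {
  // The parameter A never appears on the right-hand side.
  type ConstI[A] = Int

  // Any two applications of ConstI are the same type...
  implicitly[ConstI[String] =:= ConstI[Boolean]]

  // ...so both the covariant and the contravariant rule hold at once.
  implicitly[ConstI[String] <:< ConstI[Any]] // as if covariant
  implicitly[ConstI[Any] <:< ConstI[String]] // as if contravariant

  val n: ConstI[String] = 42
  val m: ConstI[Boolean] = n // no conversion needed
}
```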
<h2 id="markdown-header-whence-monad-transformer-variance">Whence monad transformer variance?</h2>
<p>There are two competing flavors of monad transformers, like
<code>OptionT</code>.</p>
<div class="codehilite"><pre><span></span><span class="k">final</span> <span class="k">case</span> <span class="k">class</span> <span class="nc">NewOptionT</span><span class="o">[</span><span class="kt">F</span><span class="o">[</span><span class="k">_</span><span class="o">]</span>, <span class="kt">A</span><span class="o">](</span><span class="n">run</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">Option</span><span class="o">[</span><span class="kt">A</span><span class="o">]])</span>
<span class="k">final</span> <span class="k">case</span> <span class="k">class</span> <span class="nc">OldOptionT</span><span class="o">[</span><span class="kt">F</span><span class="o">[</span><span class="kt">+</span><span class="k">_</span><span class="o">]</span>, <span class="kt">+A</span><span class="o">](</span><span class="n">run</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">Option</span><span class="o">[</span><span class="kt">A</span><span class="o">]])</span>
</pre></div>
<p>The first remains invariant over <code>A</code>, but conveniently doesn’t care
about the variance of <code>F</code>—its declared variance is the “top”
variance. The latter gives the more specific covariance over <code>A</code>, but
requires the <code>F</code> to be covariant (or phantom), which can be very
inconvenient. These two transformers can’t be practically unified.</p>
<p>If you look at the structure, the variance of <code>A</code>’s position is always
of a piece with the variance of <code>F</code>.</p>
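<p>The trade-off shows up as soon as you try to use both flavors. This is a sketch targeting Scala 2.12 or 2.13, with hypothetical <code>Animal</code> and <code>Cat</code> classes, using the invariant <code>Endo</code> alias from earlier as the awkward <code>F</code>.</p>

```scala
object OptionTFlavors {
  // Hypothetical classes for this sketch.
  class Animal
  class Cat extends Animal

  final case class NewOptionT[F[_], A](run: F[Option[A]])
  final case class OldOptionT[F[+_], +A](run: F[Option[A]])

  type Endo[A] = A => A // an invariant type constructor

  // New flavor: any F is accepted, including the invariant Endo...
  val newEndo = NewOptionT[Endo, Int](o => o)
  val newCats = NewOptionT[List, Cat](List(Some(new Cat)))
  // ...but there is no widening in A:
  // val bad: NewOptionT[List, Animal] = newCats // rejected
  // Rewrapping by hand works here only because List itself is covariant.
  val newAnimals = NewOptionT[List, Animal](newCats.run)

  // Old flavor: widening in A is free...
  val oldCats = OldOptionT[List, Cat](List(Some(new Cat)))
  val oldAnimals: OldOptionT[List, Animal] = oldCats
  // ...but an invariant F is not accepted at all:
  // OldOptionT[Endo, Int](o => o) // rejected: Endo is not covariant
}
```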
<h3 id="markdown-header-variance-variables">Variance variables</h3>
<div class="codehilite"><pre><span></span><span class="k">final</span> <span class="k">case</span> <span class="k">class</span> <span class="nc">OptionT</span><span class="o">[</span><span class="err">😕</span><span class="kt">V</span>, <span class="kt">F</span><span class="o">[</span><span class="kt">V</span><span class="err">😕</span><span class="k">_</span><span class="o">]</span>, <span class="kt">V</span><span class="err">😕</span><span class="kt">A</span><span class="o">]</span>
</pre></div>
<ol>
<li>The “variance variable” <code>V</code> is declared with the syntactic marker 😕.</li>
<li>😕 appears infix before <code>_</code> to say “<code>F</code> has the variance <code>V</code> for the
type parameter in this position”.</li>
<li>😕 appears infix before <code>A</code> to say “the parameter <code>A</code> has the
variance <code>V</code>”.</li>
</ol>
<p>Therefore, when you specify variance <code>V = +</code>, the remaining parameters
have kind <code>F[+_]</code> and <code>+A</code>, and similarly for other variances.</p>
<p>I’ve chosen the confused smiley 😕 to represent likely reactions to
this idea, and especially to my complete lack of consideration for
elegant syntax, but it’s really just a limited form of a kind variable
in
<a href="https://downloads.haskell.org/~ghc/8.0.1/docs/html/users_guide/glasgow_exts.html#overview-of-kind-polymorphism"><code>PolyKinds</code></a>—after
all, variance is part of the kind of the surrounding type
constructor. And, as seen above, we already have subkind-polymorphism
with respect to variance, so what’s wrong with parametric kind
polymorphism? “Variance variables” are just parametric kind
polymorphism, but not as powerful.</p>
<h3 id="markdown-header-variance-bounds">Variance bounds</h3>
<p>The <code>OptionT</code> example is a simple case; it supports all possible
variances, and features a simple relationship: whatever you select for
<code>V</code> is exactly the variance used at the various places <code>V</code>
appears. It’s easy to break this simple scheme, though.</p>
<div class="codehilite"><pre><span></span><span class="k">final</span> <span class="k">case</span> <span class="k">class</span> <span class="nc">WrenchT</span><span class="o">[</span><span class="kt">F</span><span class="o">[</span><span class="k">_</span><span class="o">]</span>, <span class="kt">A</span><span class="o">](</span><span class="n">run</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">Option</span><span class="o">[</span><span class="kt">A</span><span class="o">]],</span> <span class="n">wrench</span><span class="k">:</span> <span class="kt">A</span><span class="o">)</span>
</pre></div>
<p>Now the variance of the <code>A</code> position isn’t strictly the variance of
<code>F</code>, as it was with <code>OptionT</code>; it can only be covariant or invariant,
due to the wrench being in covariant position.</p>
<p>Well, we have variables over something with a conformance relation;
let’s add bounds!</p>
<div class="codehilite"><pre><span></span><span class="k">final</span> <span class="k">case</span> <span class="k">class</span> <span class="nc">WrenchT</span><span class="o">[</span><span class="err">😕</span><span class="kt">V</span> <span class="k">>:</span> <span class="kt">+</span>, <span class="kt">F</span><span class="o">[</span><span class="kt">V</span><span class="err">😕</span><span class="k">_</span><span class="o">]</span>, <span class="kt">V</span><span class="err">😕</span><span class="kt">A</span><span class="o">]</span>
</pre></div>
<p>And these bounds themselves could be determined by other variance
variables, &c, but I don’t want to dwell on that because the
complications aren’t over yet.</p>
<h2 id="markdown-header-no-easy-unification">No easy unification</h2>
<div class="codehilite"><pre><span></span><span class="k">final</span> <span class="k">case</span> <span class="k">class</span> <span class="nc">Compose</span><span class="o">[</span><span class="kt">F</span><span class="o">[</span><span class="k">_</span><span class="o">]</span>, <span class="kt">G</span><span class="o">[</span><span class="k">_</span><span class="o">]</span>, <span class="kt">A</span><span class="o">](</span><span class="n">run</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">G</span><span class="o">[</span><span class="kt">A</span><span class="o">]])</span>
</pre></div>
<p>This type complicates things even further! There are numerous
possibilities based on the variances of <code>F</code> and <code>G</code>.</p>
<div class="codehilite"><table border="1"><tr><th>F,G</th><th>G,F</th><th>A </th></tr>
<tr><td>Inv</td><td>Inv</td><td>Inv</td></tr>
<tr><td>Inv</td><td>+ </td><td>Inv</td></tr>
<tr><td>Inv</td><td>- </td><td>Inv</td></tr>
<tr><td>Inv</td><td>👻 </td><td>👻 </td></tr>
<tr><td>+ </td><td>+ </td><td>+ </td></tr>
<tr><td>+ </td><td>- </td><td>- </td></tr>
<tr><td>+ </td><td>👻 </td><td>👻 </td></tr>
<tr><td>- </td><td>- </td><td>+ </td></tr>
<tr><td>- </td><td>👻 </td><td>👻 </td></tr>
<tr><td>👻 </td><td>👻 </td><td>👻 </td></tr>
</table></div>
<p>(The order of <code>F</code> and <code>G</code> doesn’t matter, so I’ve left out the
reverses.) So the variance of <code>A</code>’s position is the result of
<em>multiplying the variance</em> of <code>F</code> and <code>G</code>. The multiplication table is
just above. Guess we need a notation for that.</p>
<div class="codehilite"><pre><span></span><span class="o">[</span><span class="err">😕</span><span class="kt">FV</span>, <span class="err">😕</span><span class="kt">GV</span>, <span class="err">😕</span><span class="kt">V</span> <span class="k">>:</span> <span class="kt">FV</span> <span class="kt">×</span> <span class="kt">GV</span>, <span class="kt">F</span><span class="o">[</span><span class="kt">FV</span><span class="err">😕</span><span class="k">_</span><span class="o">]</span>, <span class="kt">G</span><span class="o">[</span><span class="kt">GV</span><span class="err">😕</span><span class="k">_</span><span class="o">]</span>, <span class="kt">V</span><span class="err">😕</span><span class="kt">A</span><span class="o">]</span>
</pre></div>
<p>The bound here only means that <code>V</code> is in variance-covariant position,
but bounded by <code>FV × GV</code>.</p>
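<p>Since Scala has no variance-level functions, the closest runnable thing is a value-level model of the multiplication table. This is only a sketch: the names <code>Variance</code>, <code>Co</code>, <code>Contra</code>, <code>Phantom</code>, and <code>times</code> are mine, and the function is ordinary Scala, not proposed syntax.</p>

```scala
object VarianceTimes {
  sealed trait Variance
  case object Inv extends Variance     // invariant
  case object Co extends Variance      // +
  case object Contra extends Variance  // -
  case object Phantom extends Variance // 👻

  // Variance of A's position in F[G[A]], given the variances of F and G.
  def times(f: Variance, g: Variance): Variance = (f, g) match {
    case (Phantom, _) | (_, Phantom) => Phantom // an ignored parameter stays ignored
    case (Inv, _) | (_, Inv)         => Inv     // otherwise invariance is absorbing
    case (Co, v)                     => v       // + is the identity
    case (Contra, Co)                => Contra
    case (Contra, _)                 => Co      // only (Contra, Contra) reaches here
  }

  val all: List[Variance] = List(Inv, Co, Contra, Phantom)

  // The table leaves out the reverses because multiplication is commutative.
  val commutative: Boolean =
    all.forall(f => all.forall(g => times(f, g) == times(g, f)))
}
```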
<h3 id="markdown-header-another-wrench">Another wrench</h3>
<p>We can privilege <code>F</code> a little bit to make things more interesting.</p>
<div class="codehilite"><pre><span></span><span class="k">final</span> <span class="k">case</span> <span class="k">class</span> <span class="nc">ComposeWr</span><span class="o">[</span><span class="kt">F</span><span class="o">[</span><span class="k">_</span><span class="o">]</span>, <span class="kt">G</span><span class="o">[</span><span class="k">_</span><span class="o">]</span>, <span class="kt">A</span><span class="o">](</span><span class="n">run</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">G</span><span class="o">[</span><span class="kt">A</span><span class="o">]],</span> <span class="n">fa</span><span class="k">:</span> <span class="kt">F</span><span class="o">[</span><span class="kt">A</span><span class="o">])</span>
</pre></div>
<table border="1">
<tr><th>F×G </th><th>F </th><th>A </th></tr>
<tr><td>Inv </td><td>Inv,+,-</td><td>Inv</td></tr>
<tr><td>Inv,+,-,👻</td><td>Inv </td><td>Inv</td></tr>
<tr><td>+,👻 </td><td>+ </td><td>+ </td></tr>
<tr><td>- </td><td>+ </td><td>Inv</td></tr>
<tr><td>-,👻 </td><td>- </td><td>- </td></tr>
<tr><td>+ </td><td>- </td><td>Inv</td></tr>
<tr><td>👻 </td><td>👻 </td><td>👻 </td></tr>
</table>
<p>Here’s another level to the function that determines the <code>A</code>-position
variance: it’s now <code>lub(F×G, F)</code>, where <em>lub</em> is the <em>least upper
bound</em>, the most specific variance that still holds both arguments as
subvariances.</p>
<h3 id="markdown-header-variance-families-really">Variance families? Really?</h3>
<p>I hope you guessed where I was going when I said “function”: the rules
determining the lower bound on the <code>A</code>-position variance can be
specified by the programmer with a variance-level function—a mapping
where the arguments and results are variances—or a <em>variance
family</em>. Again, I don’t think this is terribly novel; it’s just a kind
family, but more restricted.</p>
<p>You’d want it closed and total, and I can see the Haskell now.</p>
<div class="codehilite"><pre><span></span><span class="nf">variance</span> <span class="n">family</span> <span class="kt">LUBTimes</span> <span class="n">a</span> <span class="n">b</span> <span class="kr">where</span>
<span class="kt">LUBTimes</span> <span class="o">-</span> <span class="o">+</span> <span class="ow">=</span> <span class="o">-</span>
<span class="kt">LUBTimes</span> <span class="err">👻</span> <span class="kt">Inv</span> <span class="ow">=</span> <span class="err">👻</span>
<span class="err">…</span>
</pre></div>
<p>Only because I’m not sure where to begin making up the Scala syntax.</p>
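<p>To make the “function on variances” reading concrete, here is a minimal value-level sketch in ordinary Haskell, not a real language feature. The names <code>Variance</code>, <code>times</code>, <code>lub</code>, and <code>lubTimes</code> are made up for illustration. The <code>times</code> table assumes the usual composition rule (phantom absorbs everything, then invariant absorbs everything, and contravariance flips); treat that as an assumption if × is meant differently.</p>
<div class="codehilite"><pre>module Main where

-- The four variances: phantom, covariant, contravariant, invariant.
data Variance = Phantom | Co | Contra | Inv
  deriving (Eq, Show, Enum, Bounded)

-- Composition of variances, written × in the prose (assumed rule).
times :: Variance -&gt; Variance -&gt; Variance
times Phantom _      = Phantom
times _       Phantom = Phantom
times Inv     _      = Inv
times _       Inv    = Inv
times Co      v      = v
times Contra  Co     = Contra
times Contra  Contra = Co

-- Least upper bound: phantom sits below everything, invariant above.
lub :: Variance -&gt; Variance -&gt; Variance
lub Phantom v = v
lub v Phantom = v
lub a b
  | a == b    = a
  | otherwise = Inv

-- The A-position variance of ComposeWr, given the variances of F and G.
lubTimes :: Variance -&gt; Variance -&gt; Variance
lubTimes f g = lub (f `times` g) f

main :: IO ()
main = mapM_ print
  [ (f, g, lubTimes f g)
  | f &lt;- [minBound .. maxBound]
  , g &lt;- [minBound .. maxBound]
  ]
</pre></div>
<p>Running <code>main</code> prints one row per pair of variances for <code>F</code> and <code>G</code>. The two equations written out for <code>LUBTimes</code> correspond to <code>lubTimes Contra Co</code> and <code>lubTimes Phantom Inv</code> in this sketch.</p>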
<h2 id="markdown-header-there-are-four-variances">There are four variances</h2>
<p>When I started looking for ways to describe the variances of the
datatypes I’ve shown you, I noticed the relationship between variance
and kinds, converted the problem to a kind-level problem, and started
thinking of solutions in terms of <code>PolyKinds</code>. That’s where variance
variables come from, and everything else follows from those.</p>
<p>However, I think I’ve made a mistake. Not with variance variables
themselves, mind you, nor the variance-conformance already built in to
Scala. But, to deal with the problems that arise with only these, I’ve
hypothesized tools—described above—that are way too powerful. They
work on open domains of unbounded complexity—the kinds of types—and
there are only four variances.</p>
<p>There are two reasons I think there must be a better approach.</p>
<p>First, there are finitely many variance families for a given sort.</p>
<p>Second, there are only so many ways to put variables in positions of
variance. You get argument and result position of methods, within
other type constructors, and that’s it. Things are simple enough that,
discounting desire for working GADTs (which will be discussed in a
later post, <a href="http://typelevel.org/blog/2016/09/19/variance-phantom.html">“Choosing variance for a phantom type”</a>), it is always
possible to <em>infer</em> which of the four variances a type parameter ought
to have, in a first-order variance situation.</p>
<h3 id="markdown-header-a-good-polyvariance-constraint-system">A “good” polyvariance constraint system</h3>
<p>Since there are only four variances, and only so many ways to combine
them, it might be possible to design something more refined and suited
for the task than the too-powerful variance families.</p>
<p>It might be that there are only a few useful variance relations, like
× and lub, and a good solution would be to supply these relations
along with an expression model to combine them. Or maybe not. Rather than
hypothesize further, I’ll say what I think a “good” system
would look like.</p>
<ol>
<li>It must be <strong>writable</strong>. Just as it is desirable to write a
stronger type role than the inferred one in GHC Haskell ≥ 7.8,
there are very common reasons to want a more specific variance than
the one that would be inferred. So the convenience of writing out
the rule explicitly matters a great deal.</li>
<li>It must be <strong>checkable</strong>. For variance variables, that means every
possible variance you can choose puts every type parameter only in
positions consistent with its variance. For example, our fully
generalized <code>OptionT</code> always places <code>A</code> only in positions matching
the variance of the <code>F</code> type constructor. <p> We <em>can</em> just
check every possible variance—up to four for each variable—but I
think this is the wrong way to go. We don’t just enumerate over
every possible type to check type parameters—that would take
forever—we have a systematic way to check exactly one time, with
skolemization. Variance is simpler—it should be an easier problem.</li>
<li>Not a requirement—but it ought to be <strong>inferrable</strong>. In the same
way that skolemization gives us a path to automatic generalization,
if there is a similar way to do quantified variance checking, it
should be possible to use the output of that decision procedure to
determine the relationships between and bounds on variance
variables. <p> How that decision is expressed is another
question.</li>
</ol>
<h2 id="markdown-header-variance-ghc-type-roles">Variance & GHC type roles</h2>
<p>It might seem that Haskell, with no subtyping, might not care about
this problem. But GHC 7.8
<a href="https://ghc.haskell.org/trac/ghc/wiki/Roles"><em>type roles</em></a> are
similar enough to variances; the main difference is that Scala
variance is about the congruence/liftability of the
<a href="http://www.scala-lang.org/files/archive/spec/2.11/03-types.html#conformance">strong conformance relation</a>,
while type roles are about the congruence/liftability of the
<a href="https://ghc.haskell.org/trac/ghc/wiki/Roles#Coercible">weak type equality/“coercibility” relation</a>.</p>
<div class="codehilite"><pre><span></span><span class="nf">nominal</span><span class="kt">:</span> <span class="n">a</span> <span class="o">~</span> <span class="n">b</span> <span class="err">→</span> <span class="n">f</span> <span class="n">a</span> <span class="o">~</span> <span class="n">f</span> <span class="n">b</span>
<span class="nf">representational</span><span class="kt">:</span> <span class="n">a</span> <span class="o">~</span><span class="err"><sub><small><em>w</em></small></sub></span> <span class="n">b</span> <span class="err">→</span> <span class="n">f</span> <span class="n">a</span> <span class="o">~</span><span class="err"><sub><small><em>w</em></small></sub></span> <span class="n">f</span> <span class="n">b</span>
<span class="nf">phantom</span><span class="kt">:</span> <span class="n">f</span> <span class="n">a</span> <span class="o">~</span><span class="err"><sub><small><em>w</em></small></sub></span> <span class="n">f</span> <span class="n">b</span>
</pre></div>
<p>This is pretty useful from a practical, performance-minded
perspective, but the problem is</p>
<div class="codehilite"><pre><span></span><span class="kr">newtype</span> <span class="kt">MaybeT</span> <span class="n">m</span> <span class="n">a</span> <span class="ow">=</span> <span class="kt">MaybeT</span> <span class="p">(</span><span class="n">m</span> <span class="p">(</span><span class="kt">Maybe</span> <span class="n">a</span><span class="p">))</span>
</pre></div>
<p>there is no way to describe the role of the <code>a</code> parameter in the most
general way. It’s stuck at <code>nominal</code>, even if <code>m</code>’s parameter is
<code>representational</code>.</p>
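<p>A small sketch of the situation, assuming GHC with the <code>RoleAnnotations</code> extension; the <code>Age</code> newtype and <code>liftedAge</code> are hypothetical names used only for illustration. The annotation merely restates the roles GHC infers. As far as I understand the role checker, changing the second role to <code>representational</code> is rejected, because <code>a</code> sits under the abstract <code>m</code>.</p>
<div class="codehilite"><pre>{-# LANGUAGE RoleAnnotations #-}
module Main where

import Data.Coerce (coerce)

-- Hypothetical newtype, representationally equal to Int.
newtype Age = Age Int deriving (Eq, Show)

newtype MaybeT m a = MaybeT (m (Maybe a))

-- Restates the inferred roles: m is representational, a is nominal.
type role MaybeT representational nominal

-- Maybe's parameter is representational, so the coercion lifts through it.
liftedAge :: Maybe Age
liftedAge = coerce (Just (30 :: Int))

main :: IO ()
main = print liftedAge
</pre></div>
<p>There is no annotation that says “<code>a</code> has whatever role <code>m</code>’s parameter has”, which is the role-level analogue of the missing variance variable.</p>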
<p>Just as the integration of variance and higher kinds in Scala is
incomplete without something like the polyvariance system I’ve been
describing, Haskell’s type roles are not fully integrated with higher
kinds.</p>
<p>I hope if one of these language communities finds a good solution, it
is adopted by the other posthaste. The Haskell community is
<a href="https://ghc.haskell.org/trac/ghc/wiki/Roles2">attempting to tackle these problems with roles</a>;
perhaps Scala can profit from its innovations. Much of the prose on
the GHC matter can be profitably read by replacing “role” with
“variance” where it appears. For example, this should sound familiar.</p>
<blockquote>
<p>This design incorporates roles into kinds. It solves the exact
problems here, but at great cost: because roles are attached to
kinds, we have to choose a types roles in the wrong place. For
example, consider the <code>Monad</code> class. Should the parameter <code>m</code> have
type <code>*/R -> *</code>, requiring all monads to take representational
arguments, or should it have type <code>*/N -> *</code>, disallowing GND if
<code>join</code> is in the <code>Monad</code> class? We’re stuck with a different set of
problems.</p>
</blockquote>
<h2 id="markdown-header-subtyping-or-higher-kinds">Subtyping or higher kinds?</h2>
<p>As things stand, you will have a little trouble combining heavy use of
subtyping and of higher kinds in the same system.</p>
<p>I’m not saying for certain that it comes down to one or the other. In
Scalaz,
<a href="https://github.com/scalaz/scalaz/pull/328">we weakened support for subtyping</a>
to have better support for higher kinds, because its users typically
do not want to use subtyping. However, this preference doesn’t
generalize to the Scala community at large. This was only a real
concern for monad transformers; most Scalaz constructs, for now, have
fine subtyping support.</p>
<p>My suggestion is that you should favor higher kinds; they’re a more
powerful abstraction mechanism, and ultimately easier to understand,
than subtyping; they also happen to be less buggy in Scala. If you
must use subtyping, be warned: it’s much more complex than it first
seems.</p>
<h2 id="markdown-header-further-reading">Further reading</h2>
<ul>
<li><a href="http://typelevel.org/blog/2016/09/19/variance-phantom.html">“Choosing variance for a phantom type”</a>
builds upon this post to connect variance and pattern matching</li>
<li><a href="http://typelevel.org/blog/2016/02/04/variance-and-functors.html">“Of variance and functors”</a>
by Adelbert Chang, provides some foundational intuition for variance</li>
<li><a href="https://issues.scala-lang.org/browse/SI-2066">SI-2066 “Unsoundness in overriding methods with higher‐order type parameters”</a>,
easily the most important surfacing of the interaction of higher
kinds and variance, revealing the existence of subvariance</li>
<li>The
<a href="https://github.com/scalaz/scalaz/blob/v7.2.6/core/src/main/scala/scalaz/Liskov.scala#L4-L12">Liskov type</a>,
which builds every other subtyping property on reflexivity and
contravariance</li>
</ul>
Stephen Compallhttp://www.blogger.com/profile/13346366374080232972noreply@blogger.com2tag:blogger.com,1999:blog-1184549185438247550.post-15428134299425343192015-10-13T22:28:00.000-04:002015-10-13T22:28:24.809-04:00The uninteresting monoids of certain monadsSuppose there is some structure from which arises a monad. Let’s call
one <code>Sem</code>.<br />
<br />
<div class="codehilite">
<pre><span class="kr">data</span> <span class="kt">Sem</span> <span class="n">a</span> <span class="ow">=</span> <span class="o">...</span> <span class="c1">-- doesn't matter</span>
</pre>
</div>
<br />
In the spirit of defining every typeclass instance you can think of—a
spirit that I share, believe me—you discover a monoid, and suggest
that it be included with <code>Sem</code>.<br />
<br />
<div class="codehilite">
<pre><span class="kr">instance</span> <span class="o">???</span> <span class="ow">=></span> <span class="kt">Monoid</span> <span class="p">(</span><span class="kt">Sem</span> <span class="n">a</span><span class="p">)</span> <span class="kr">where</span>
<span class="c1">-- definition here</span>
</pre>
</div>
<br />
But then, you are surprised to encounter pessimism and waffling, from
me!<br />
<br />
I’m so skeptical of your monoid because it is “common”; many monoids
simply fall out of numerous monads, to greater or lesser degree, but
that doesn’t make them “good” monoids. Having rediscovered a common,
uninteresting monoid, you need to provide more justification of why it
should be “the” monoid for this data type.<br />
<br />
<h2 id="markdown-header-the-lifted-monoid">
The lifted monoid</h2>
<i>Every</i> applicative functor gives rise to a monoid that lifts its
argument’s monoid.<br />
<br />
<div class="codehilite">
<pre><span class="kr">instance</span> <span class="kt">Monoid</span> <span class="n">a</span> <span class="ow">=></span> <span class="kt">Monoid</span> <span class="p">(</span><span class="kt">Sem</span> <span class="n">a</span><span class="p">)</span> <span class="kr">where</span>
<span class="n">mempty</span> <span class="ow">=</span> <span class="n">pure</span> <span class="n">mempty</span>
<span class="n">mappend</span> <span class="ow">=</span> <span class="n">liftA2</span> <span class="n">mappend</span>
</pre>
</div>
<br />
This is “the” monoid for <code>(->) r</code> and <code>Maybe</code>. It is decidedly <i>not</i>
the monoid for <code>[]</code>. For in that universe,<br />
<br />
<div class="codehilite">
<pre>> <span class="p">[</span><span class="kt">Sum</span> <span class="mi">2</span><span class="p">]</span> <span class="p">`</span><span class="n">mappend</span><span class="p">`</span> <span class="p">[</span><span class="kt">Sum</span> <span class="mi">3</span><span class="p">,</span> <span class="kt">Sum</span> <span class="mi">7</span><span class="p">]</span>
<span class="p">[</span><span class="kt">Sum</span> <span class="mi">5</span><span class="p">,</span> <span class="kt">Sum</span> <span class="mi">9</span><span class="p">]</span>
<span class="o">></span> <span class="p">[</span><span class="kt">Sum</span> <span class="mi">42</span><span class="p">]</span> <span class="p">`</span><span class="n">mappend</span><span class="p">`</span> <span class="kt">[]</span>
<span class="kt">[]</span>
</pre>
</div>
<br />
Maybe your reaction is “but that’s not a legal monoid!” Sure it is.
The <code>mappend</code> is based on combination, just as <code>Applicative []</code>’s <code><*></code> is. And, in the example above, the left and right identity is
<code>[Sum 0]</code>, not <code>[]</code>.<br />
<br />
It’s just not the monoid you’re used to.<br />
<br />
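The list behavior can be tried without replacing the real instance for <code>[]</code>. This is a minimal sketch using a made-up wrapper named <code>Lifted</code>; recent versions of base ship a similar newtype, <code>Data.Monoid.Ap</code>.<br />
<br />
<div class="codehilite">
<pre>module Main where

import Control.Applicative (liftA2)
import Data.Monoid (Sum (..))

-- Hypothetical wrapper selecting the lifted monoid for any Applicative.
newtype Lifted f a = Lifted { getLifted :: f a }

instance (Applicative f, Semigroup a) =&gt; Semigroup (Lifted f a) where
  Lifted x &lt;&gt; Lifted y = Lifted (liftA2 (&lt;&gt;) x y)

instance (Applicative f, Monoid a) =&gt; Monoid (Lifted f a) where
  mempty = Lifted (pure mempty)

main :: IO ()
main = do
  -- Every combination of elements is summed.
  print (getLifted (Lifted [Sum 2] &lt;&gt; Lifted [Sum 3, Sum 7]))
  -- The empty list annihilates instead of acting as the identity.
  print (getLifted (Lifted [Sum 42] &lt;&gt; Lifted []))
  -- The identity is the singleton [Sum 0], that is, pure mempty.
  print (getLifted (Lifted [Sum 2] &lt;&gt; mempty))
</pre>
</div>
<br />
The first two <code>print</code> calls reproduce the list examples above; the third shows <code>pure mempty</code> acting as the identity.<br />
<br />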
Moreover, it isn’t quite right for <code>Maybe</code>! The constraint
generalizes to <code>Semigroup a</code>. It is an unfortunate accident of
history that the constraint on Haskell <code>Maybe</code>’s monoid is also
<code>Monoid</code>.<br />
<br />
Even the choice for <code>(->) r</code> makes many people unhappy, though we’re
not quite ready to explore the reason for that.<br />
<br />
So, what makes you think this is a good choice for <code>Sem</code>? It’s not
enough justification that it can be written; that is always the case.
There must be something that makes <code>Sem</code> like <code>(->) r</code> or <code>Maybe</code>, and
not like <code>[]</code>.<br />
<br />
<h2 id="markdown-header-the-monadplus-monoid">
The <code>MonadPlus</code> monoid</h2>
To be entirely modern, this would be the <code>Alternative</code> monoid.
Despite the possibilities for equivocation, this monoid is just as
good as any other.<br />
<br />
Simply: every <code>Alternative</code> (a subclass of <code>Applicative</code> and a
superclass of the more well-known <code>MonadPlus</code>) gives rise to a monoid
that is <i>universal</i> over the argument, no <code>Monoid</code> constraint
required.<br />
<br />
<div class="codehilite">
<pre><span class="c1">-- supposing Alternative Sem,</span>
<span class="kr">instance</span> <span class="kt">Monoid</span> <span class="p">(</span><span class="kt">Sem</span> <span class="n">a</span><span class="p">)</span> <span class="kr">where</span>
<span class="n">mempty</span> <span class="ow">=</span> <span class="n">empty</span>
<span class="n">mappend</span> <span class="ow">=</span> <span class="p">(</span><span class="o"><|></span><span class="p">)</span>
</pre>
</div>
<br />
You would not be surprised at this having prepared by reading
<a href="https://hackage.haskell.org/package/base-4.8.1.0/docs/Control-Applicative.html#t%3AAlternative">the haddock for <code>Alternative</code></a>:
“a monoid on applicative functors”, it says.<br />
<br />
<code>[]</code> is <code>Alternative</code>, and indeed this is the monoid of choice for
<code>[]</code>. But <code>Maybe</code> is also <code>Alternative</code>. Why is this one good for
<code>[]</code>, but not <code>Maybe</code>? Let’s take a peek through the looking glass.<br />
<br />
<div class="codehilite">
<pre>> <span class="kt">Just</span> <span class="mi">1</span> <span class="p">`</span><span class="n">mappend</span><span class="p">`</span> <span class="kt">Just</span> <span class="mi">4</span>
<span class="kt">Just</span> <span class="mi">1</span>
> <span class="kt">Nothing</span> <span class="p">`</span><span class="n">mappend</span><span class="p">`</span> <span class="kt">Just</span> <span class="mi">3</span>
<span class="kt">Just</span> <span class="mi">3</span>
</pre>
</div>
<br />
I happen to agree with the monoid of choice for <code>Maybe</code>. But I’m sure
many have been surprised it’s not “just take the leftmost <code>Just</code>, or
give <code>Nothing</code>”.<br />
<br />
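For a side-by-side look, base provides the <code>Alt</code> newtype in <code>Data.Monoid</code>, whose monoid is exactly <code>empty</code> and <code>&lt;|&gt;</code>. The sketch below only compares it with the standard <code>Maybe</code> instance; the specific values are arbitrary.<br />
<br />
<div class="codehilite">
<pre>module Main where

import Data.Monoid (Alt (..), Sum (..))

main :: IO ()
main = do
  -- Alternative monoid: the leftmost Just wins, contents untouched.
  print (getAlt (Alt (Just (Sum 1)) &lt;&gt; Alt (Just (Sum 4))))
  -- Standard Maybe monoid: the contents are combined.
  print (Just (Sum 1) &lt;&gt; Just (Sum 4))
  -- For lists, the Alternative monoid is ordinary concatenation.
  print (getAlt (Alt [1, 2] &lt;&gt; Alt [3 :: Int]))
</pre>
</div>
<br />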
Except where
<a href="https://hackage.haskell.org/package/base-4.8.1.0/docs/src/Control.Applicative.html#line-81">phantom <code>Const</code>-style functors</a>
are involved, the two preceding monoids always have incompatible
behavior. One sums the underlying values, the other never touches
them, only rearranging them. So, if both are available to <code>Sem</code>, to
define a monoid, we must give up at least one of these.<br />
<br />
Alternatively, we could put off the decision until someone comes up
with a convincing argument for “the” monoid.<br />
<br />
<h2 id="markdown-header-the-category-endomorphism-monoid">
The category endomorphism monoid</h2>
This monoid hasn’t let the lack of a pithy name handicap it; despite
the stunning blow of
<a href="https://mail.haskell.org/pipermail/libraries/2005-October/004500.html">losing the prized <code>(->)</code> to the lifted monoid</a>
(<a href="https://git.haskell.org/ghc.git/commitdiff/2cf6d82a53131e8fc1f18900435cd0ee25bd434e#patch1">the commit</a>),
this one probably has even more fans eager for a rematch today than it
did back then.<br />
<br />
I’m referring to this one, still thought of as “the” monoid for <code>(->)</code>
by some.<br />
<br />
<div class="codehilite">
<pre><span class="kr">instance</span> <span class="kt">Monoid</span> <span class="p">(</span><span class="n">a</span> <span class="ow">-></span> <span class="n">a</span><span class="p">)</span> <span class="kr">where</span>
<span class="n">mempty</span> <span class="ow">=</span> <span class="n">id</span>
<span class="n">mappend</span> <span class="ow">=</span> <span class="p">(</span><span class="o">.</span><span class="p">)</span>
</pre>
</div>
<br />
The elegance of this kind of “summing” of functions is undeniable.
Moreover, it applies to <i>every</i> <code>Category</code>, not just <code>(->)</code>. Even
more, it works for anything sufficiently <code>Category</code>-ish, such as
<code>ReaderT</code>.<br />
<br />
<div class="codehilite">
<pre><span class="kr">instance</span> <span class="kt">Monad</span> <span class="n">m</span> <span class="ow">=></span> <span class="kt">Monoid</span> <span class="p">(</span><span class="kt">ReaderT</span> <span class="n">a</span> <span class="n">m</span> <span class="n">a</span><span class="p">)</span> <span class="kr">where</span>
<span class="n">mempty</span> <span class="ow">=</span> <span class="n">ask</span>
<span class="kt">ReaderT</span> <span class="n">f</span> <span class="p">`</span><span class="n">mappend</span><span class="p">`</span> <span class="kt">ReaderT</span> <span class="n">g</span> <span class="ow">=</span>
<span class="kt">ReaderT</span> <span class="o">$</span> <span class="n">f</span> <span class="o"><=<</span> <span class="n">g</span>
</pre>
</div>
<br />
Its fatal flaw is that twin appearance of <code>a</code>; it requires
<code>FlexibleInstances</code>, so can’t be written in portable Haskell 2010.
As such, it will probably remain in the minor leagues of newtypes like
<a href="https://hackage.haskell.org/package/base-4.8.1.0/docs/Data-Monoid.html#t%3AEndo"><code>Endo</code></a>.<br />
<br />
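The newtype route generalizes past functions. Here is a minimal sketch, assuming a made-up wrapper named <code>EndoC</code> (it is not part of base), that gives the endomorphism monoid to any <code>Category</code> without extending the instance rules; <code>Kleisli</code> stands in for the <code>ReaderT</code>-like case.<br />
<br />
<div class="codehilite">
<pre>module Main where

import Prelude hiding (id, (.))
import Control.Arrow (Kleisli (..))
import Control.Category (Category (..))

-- Hypothetical wrapper: an endomorphism in an arbitrary category.
newtype EndoC cat a = EndoC { runEndoC :: cat a a }

instance Category cat =&gt; Semigroup (EndoC cat a) where
  EndoC f &lt;&gt; EndoC g = EndoC (f . g)

instance Category cat =&gt; Monoid (EndoC cat a) where
  mempty = EndoC id

main :: IO ()
main = do
  -- Plain functions: (+ 1) after (* 2), applied to 5.
  print (runEndoC (EndoC (+ 1) &lt;&gt; EndoC (* 2)) (5 :: Int))
  -- Kleisli arrows over Maybe: halve twice, failing on odd input.
  let halve n = if even n then Just (n `div` 2) else Nothing
      step = EndoC (Kleisli halve)
  print (runKleisli (runEndoC (step &lt;&gt; step &lt;&gt; mempty)) (12 :: Int))
</pre>
</div>
<br />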
Moreover, should you discover it for <code>Sem</code>, its applicability to <i>any</i>
category-ish thing should <i>still</i> give you pause.<br />
<br />
<h2 id="markdown-header-the-burden-of-proof">
The burden of proof</h2>
In Haskell, hacking until it compiles is a great way to work. It is
tempting to rely on its conclusions in ever more cases, once you have
discovered its effectiveness. However, in the cases above, it is very
easy to be led astray by the facile promises of the typechecker.<br />
<br />
Introducing one of these monoids is risky. It precludes the later
introduction of the “right” monoid for a datatype, for want of
compatibility. If you really must offer one of these monoids as “the”
monoid for a datatype, the responsibility falls to you: demonstrate
that this is a <i>good</i> monoid, not just an easy one.Stephen Compallhttp://www.blogger.com/profile/13346366374080232972noreply@blogger.com1tag:blogger.com,1999:blog-1184549185438247550.post-68132393450466561832014-03-02T00:44:00.000-05:002014-03-02T00:46:48.804-05:00Encountering the people of Free SoftwareOver my time as a programmer, I have grown in the practice mostly by way of contact with the <a href="https://www.gnu.org/philosophy/free-software-even-more-important.html">free software</a> community. However, for the first 8 years of this time, that contact was entirely mediated by Internet communication, for my hometown did not feature a free software community to speak of.<br />
<br />
So, instead, I learned what these people, mostly distributed among the other continents, were like by way of their mailing list messages, IRC chats, wiki edits, and committed patches. This is a fine way to become acquainted with the hats people wear for the benefit of the projects they're involved with, but isn't really a way to observe what they are really like.<br />
<br />
<br />
<h2>
About face </h2>
<br />
Then, a few years ago, I moved to Boston. Well-known for being steeped in history, Boston is the geographic heart of free software, being also the home of the <a href="https://fsf.org/">Free Software Foundation</a>. Here also is the FSF's annual <a href="https://libreplanet.org/2014/">LibrePlanet</a> conference, a policy conference accompanied by a strong lineup of technical content.<br />
<br />
I first attended LibrePlanet <a href="https://libreplanet.org/wiki/LibrePlanet:Conference/2012">in 2012</a>. There, after a decade of forming an idea in my head of what these people were like, I could finally test that idea against real-life examples.<br />
<br />
<br />
<h2>
Oddity</h2>
<br />
Richard Stallman (rms), the founder and leader, both in spirit and in practice, of free software, has long since excised non-free software from his life. If he cannot use a website without using non-free software, he will not use that website. If he can't open a document you send him without using a non-free program to open it, he will ask you to send it in a different format, or otherwise simply not read it. If he cannot use a newer computer with only free software on it, he will use an older computer instead. Because people keep asking him to do these things, this is an ongoing effort. This is well-known about him.<br />
<br />
So here was the surprise: my fellow attendees had all followed rms's example, <i>with varying success</i>. They traded tips on the freedom-respecting aspects of this or that hardware, yet admitted those areas where they hadn't yet been able to cut out non-free software.<br />
<br />
<br />
<h2>
Little things </h2>
<br />
There was no grand philosophical reason for this, no essential disagreement with rms's philosophy in play. It was just life. Perhaps they had a spouse who simply would not do without this non-free video streaming service. Perhaps they had friends with whom contact over that non-free messaging service was the foundation of the community. Perhaps they would like to work from home, albeit over some other non-free corporate network connector, in case they get snowed in.<br />
<br />
Or maybe they simply haven't found the time. Maybe they tried once, failed, and haven't had the opportunity to try again. There are many demands on people; they deal with them as best as they can.<br />
<br />
I should have realized this, and I should have known it from what rms himself had <a href="https://static.fsf.org/nosvn/faif-2.0.pdf">said</a>.<br />
<blockquote class="tr_bq">
"I hesitate to exaggerate the importance of this little puddle of freedom," he says. "Because the more well-known and conventional areas of working for freedom and a better society are tremendously important. I wouldn't say that free software is as important as they are. It's the responsibility I undertook, because it dropped in my lap and I saw a way I could do something about it…"</blockquote>
<br />
<br />
<h2>
Try!</h2>
<br />
For all of these compromises, though, there was still the sense that these compromises are not the end of the story. Maybe <a href="http://media.libreplanet.org/u/libby/m/mako/">free software isn't (practically) better</a> sometimes. Maybe there are compromises that could ease up as the situation changes. Or maybe some inconvenience will be worth the trouble in the long run; after all, that practically inferior software probably won't get better without users.<br />
<br />
We are perfectly capable, on our own, of following Benjamin Franklin's method of entering gain or loss of freedom on the appropriate side of the pros-and-cons list when making such choices: when you have little, the loss of a little means a lot. Why, then, look to the example of rms, or those small ranks of others who have also cut out all non-free software from their lives?<br />
<br />
The people of free software don't necessarily believe that rms's goal is reachable within our lifetimes. I think that what people respond to is his clear, clearly stated, and continually adapting ideas of how the world could be better, never mind the occasional bout of eerie prescience. Maybe we will never get there. Does that mean people shouldn't set a lofty goal for making a better world, and spend a bit of time pushing the real one towards it?Stephen Compallhttp://www.blogger.com/profile/13346366374080232972noreply@blogger.com0Boston, MA, USA42.3584308 -71.059773242.170560800000004 -71.38249669999999 42.5463008 -70.7370497tag:blogger.com,1999:blog-1184549185438247550.post-45338978953370476862013-08-01T22:48:00.000-04:002013-08-01T22:58:41.317-04:00Rebasing makes collaboration harderThanks to certain version control systems' making these operations too attractive, history rewriting, e.g. rebase and squashed merge, of published revisions is currently quite popular in free software projects. What does the <a href="https://www.kernel.org/pub/software/scm/git/docs/git-rebase.html#_recovering_from_upstream_rebase">git-rebase manpage</a>, otherwise an advocate of the practice, have to say about that?<br />
<br />
<blockquote class="tr_bq">
Rebasing (or any other form of rewriting) a branch that others have based work on is a bad idea: anyone downstream of it is forced to manually fix their history.</blockquote>
<br />
The manpage goes on to describe, essentially, <a href="http://failex.blogspot.com/2008/09/what-is-cascading-rebase.html">cascading rebase</a>. I will not discuss further here <i>why</i> it is a bad idea.<br />
<br />
So, let us suppose you wish to follow git-rebase's advice, <i>and</i> you wish to alter history you have made available to others, perhaps in a branch in a public repository. The qualifying question becomes: "has anyone based work on this history I am rewriting?"<br />
<br />
There are four ways in which you might answer this question.<br />
<br />
<ol>
<li>Someone <i>has</i> based work on your commits; rewriting history is a bad idea.</li>
<li>Someone <i>may have or might yet</i> base work on your commits; rewriting history is a bad idea.</li>
<li>It's unlikely that someone has based work on your commits so you can <i>dismiss the possibility</i>; the manpage's advice does not apply.</li>
<li>It is <i>not possible</i> that someone has based or will yet base work on your commits; the manpage's advice does not apply.</li>
</ol>
<br />
If you have truly met the requirement above and made the revisions available to others, you can only choose #4 if you have some kind of logging of revision fetches, and check this logging beforehand; this almost never applies, so it is not interesting here. Note: it is not enough to check other public repositories; someone might be writing commits locally to be pushed later as you consider this question. Perhaps someone is shy about sharing experiments until they're a little further along.<br />
<br />
Now that we must accept it is <i>possible</i> someone has based changes on yours, even if you have dismissed it as unlikely, let's look at this from the perspective of another developer who wishes to build further revisions on yours. The relevant question here is "should I base changes on my fellow developer's work?" These are the reasonable answers.<br />
<br />
<ol>
<li>You <i>know</i> someone has built changes on your history and will therefore not rewrite history, wanting to follow the manpage's advice. It is safe for me to build on it.</li>
<li>You <i>assume someone might</i> build changes on your history, and will not rewrite it for the same reason as with #1. It is safe for me to build on it.</li>
<li>You've <i>dismissed the possibility</i> of someone like me building on your history, and might rebase or squash, so it is not safe for me to build on it.</li>
</ol>
<br />
I have defined these answers to align with the earlier set, and wish to specifically address #3. By answering #3 to the prior question, you have reinforced the very circumstances you might think you are only predicting. In other words, <b>by assuming no one will wish to collaborate on your change, you have created the circumstances by which no one can safely collaborate on your change.</b> It is a self-fulfilling prophecy that reinforces the tendency to keep collaboration unsafe on your next feature branch.<br />
<br />
In this situation, it becomes very hard to break this cycle where each feature branch is "owned" by one person. I believe this is strongly contrary to the spirits of distributed version control, free software, and public development methodology.<br />
<br />
In circumstances with no history rewriting, the very interesting possibility of <i>ad hoc</i> cross-synchronizing via merges between two or more developers <i>on a single feature branch</i> arises. You work on your parts, others work on other parts, you merge from each other when ready. Given the above, it is not surprising to me that so many developers have not experienced this very satisfying way of working together, even as our modern tools with sophisticated merge systems enable it.Stephen Compallhttp://www.blogger.com/profile/13346366374080232972noreply@blogger.com0tag:blogger.com,1999:blog-1184549185438247550.post-60854832107483275732013-06-23T23:58:00.000-04:002013-06-24T10:20:56.666-04:00Fake Theorems for FreeThis article documents an element of
<a href="http://typelevel.org/projects/scalaz/">Scalaz</a> design that I
practice, because I believe it to be an element of Scalaz design
principles, and quite a good one at that. It explains why
<a href="https://github.com/scalaz/scalaz/pull/276"><code>Functor[Set]</code> was removed yet <code>Foldable[Set]</code> remains</a>. More broadly, it
explains why a functor may be considered invalid even though “it
doesn't break any laws”. It is a useful discipline to apply to your
own Scala code.<br />
<ol start="1" type="1">
<li>Do not use runtime type information in an unconstrained way.
</li>
<li>Corollary: do not use <code>Object#equals</code> or <code>Object#hashCode</code>
in unconstrained contexts, because that would count as #1.
</li>
</ol>
The simplest way to state and remember it is <b>“for all means
for all”</b>. Another, if you prefer, might be <b>“if I don't
know anything about it, I can't look at it”</b>.<br />
<br />
We accept this constraint for the same reason that we accept the
constraint of referential transparency: it gives us powerful reasoning
tools about our code. Specifically, it gives us our free theorems
back.<br />
<br />
<h4 class="subheading">
Madness</h4>
Let's consider a basic signature.<br />
<br />
<pre class="example"> def const[A](a: A, a2: A): A
</pre>
<br />
With the principle intact, there are only two total, referentially
transparent functions that we can write with this signature.<br />
<br />
<pre class="example">     def const[A](a: A, a2: A): A = a
     def const2[A](a: A, a2: A): A = a2
</pre>
<br />
That is, we can return one or the other argument. We can't “look
at” either <code>A</code>, so we can't do tests on them or combine them in
some way.<br />
<br />
Much of Scalaz is minimally documented because it is easy enough to
apply this approach to more complex functions once you have a bit of
practice. Many Scalaz functions are the <i>only</i> function you
could write with such a signature.<br />
<br />
Now, let us imagine that we permit the unconstrained use of runtime
type information. Here are functions that are referentially
transparent, which you will find insane anyway.<br />
<br />
<pre class="example">     def const3[A](a: A, a2: A): A = (a, a2) match {
       case (s: Int, s2: Int) => if (s < s2) a else a2
       case _ => a
     }
     def const4[A](a: A, a2: A): A =
       if (a.## < a2.##) a else a2
</pre>
<br />
Now, look at what we have lost! If the lowly <code>const</code> can be
driven mad this way, imagine what could happen with <code>fmap</code>. One
of our most powerful tools for reasoning about generic code has been
lost. No, this kind of thing is not meant for the realm of Scalaz.
<br />
<br />
<h4 class="subheading">
Missing theorems</h4>
For completeness's sake, let us see the list of theorems from
<a href="http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.38.9875">Theorems for free!</a>, figure 1 on page 3, for which I can think of a
counterexample, still meeting the stated function's signature, if we
violate the above explained principle.
<br />
<br />
In each case, I have assumed all other functions but the one in
question have their standard definitions as explained on the previous
page of the paper. I recommend having the paper open to page 3 to
follow along. They are restated in fake-Scala because you might like
that. Let <code>lift(f)</code> be <code>(_ map f)</code>, or <code>f*</code> as written
in the paper.
<br />
<br />
<dl>
<dt><code>head[X]: List[X] => X</code></dt>
<dd><code>a compose head</code> = <code>head compose lift(a)</code>
<br />
<br /></dd>
<dt><code>tail[X]: List[X] => List[X]</code></dt>
<dd><code>lift(a) compose tail</code> = <code>tail compose lift(a)</code>
<br />
<br /></dd>
<dt><code>++[X]: (List[X], List[X]) => List[X]</code></dt>
<dd><code>lift(a)(xs ++ ys)</code> = <code>lift(a)(xs) ++ lift(a)(ys)</code>
<br />
<br /></dd>
<dt><code>zip[X, Y]: ((List[X], List[Y])) => List[(X, Y)]</code></dt>
<dd><code>lift(a product b) compose zip</code> = <code>zip compose (lift(a)
product lift(b))</code>
<br />
<br /></dd>
<dt><code>filter[X]: (X => Boolean) => List[X] => List[X]</code></dt>
<dd><code>lift(a) compose filter(p compose a)</code> = <code>filter(p) compose lift(a)</code>
<br />
<br /></dd>
<dt><code>sort[X]: ((X, X) => Boolean) => List[X] => List[X]</code></dt>
<dd>wherever for all <code>x</code>, <code>y</code> in <code>A</code>, <code>(x < y) =
(a(x) <' a(y))</code>, also <code>lift(a) compose sort(<)</code> = <code>sort(<')
compose lift(a)</code>
<br />
<br /></dd>
<dt><code>fold[X, Y]: ((X, Y) => Y, Y) => List[X] => Y</code></dt>
<dd>wherever for all <code>x</code> in <code>A</code>, <code>y</code> in <code>B</code>, <code>b(x
+ y) = a(x) * b(y)</code> and <code>b(u) = u'</code>, also <code>b compose fold(+,
u)</code> = <code>fold(*, u') compose lift(a)</code>
</dd><br />
</dl>
<code>Object#equals</code> and <code>Object#hashCode</code> are sufficient to
break all these free theorems, though many creative obliterations via
type tests of the <code>const3</code> kind also exist.
<br />
<br />
By contrast, here are the ones which I think are preserved. I
hesitate to positively state that they are, just because there are
<i>so many</i> possibilities opened up by runtime type information.
<br />
<br />
<dl>
<dt><code>fst[X, Y]: ((X, Y)) => X</code></dt>
<dd><code>a compose fst</code> = <code>fst compose (a product b)</code>
<br />
<br /></dd>
<dt><code>snd[X, Y]: ((X, Y)) => Y</code></dt>
<dd><code>b compose snd</code> = <code>snd compose (a product b)</code>
<br />
<br /></dd>
<dt><code>I[X]: X => X</code></dt>
<dd><code>a compose I</code> = <code>I compose a</code>
<br />
<br /></dd>
<dt><code>K[X, Y]: (X, Y) => X</code></dt>
<dd><code>a(K(x, y))</code> = <code>K(a(x), b(y))</code>
</dd><br />
</dl>
Here is a useful excerpt from the paper itself, section 3.4
“Polymorphic equality”, of which you may consider this entire
article a mere expansion.
<br />
<br />
<blockquote>
<small class="dots">...</small> polymorphic equality cannot be defined in the pure
polymorphic lambda calculus. Polymorphic equality can be added as a
constant, but then parametricity will not hold (for terms containing
the constant).
<br />
<br />
This suggests that we need some way to tame the power of the
polymorphic equality operator. Exactly such taming is provided by the
eqtype variables of Standard ML [Mil87], or more generally by the type
classes of Haskell [HW88, WB89].
</blockquote>
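Haskell's type classes make this taming visible in the signature. The
sketch below is only an illustration: <code>pickFirst</code> and
<code>pickSmallest</code> are hypothetical names, not taken from Scalaz
or the paper. The unconstrained function satisfies the <code>head</code>
theorem for any <code>a</code>. The <code>Ord</code>-constrained one,
which compares elements roughly the way <code>const4</code> compares
hash codes, does not, and its type is what warns you.
<br />
<br />
<pre class="example">
-- Unconstrained: pickFirst cannot inspect the elements, so
--   a . pickFirst = pickFirst . map a
-- holds for every function a.
pickFirst :: [x] -> x
pickFirst = head

-- Constrained: Ord x announces that elements are compared, so the
-- equation can only be expected when a respects the ordering.
pickSmallest :: Ord x => [x] -> x
pickSmallest = minimum

main :: IO ()
main = do
  let xs = [1, 2, 3] :: [Int]
  -- negate reverses the ordering, so it is a fair stress test.
  print (negate (pickFirst xs) == pickFirst (map negate xs))       -- True
  print (negate (pickSmallest xs) == pickSmallest (map negate xs)) -- False
</pre>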
<h4 class="subheading">
Compromise</h4>
Scalaz has some tools to help deal with things here. The <code>Equal</code>
typeclass contains the <code>equalIsNatural</code> method as runtime
evidence that <code>Object#equals</code> is expected to work; this evidence
is used by other parts of Scalaz, and available to you.
<br />
<br />
Scalaz also provides
<br />
<br />
<pre class="example"> implicit def setMonoid[A]: Monoid[Set[A]]
</pre>
<br />
Relative to <code>Functor</code>, this is more or less harmless, because
<code>Monoid</code> isn't so powerful; once you have <code>Monoid</code> evidence
in hand, it doesn't “carry” any parametric polymorphism the way most
Scalaz typeclasses do. It provides no means to actually fill sets,
and the semigroup is also symmetric, so it seems unlikely that there
is a way to write Monoid-generic code that can use this definition to
break things.
<br />
<br />
More typical are definitions like
<br />
<br />
<pre class="example"> implicit def setOrder[A: Order]: Order[Set[A]]
</pre>
<br />
Such definitions may use <code>Object#equals</code>, but are <i>constrained</i> in a way
that lets them be sure it's safe to do so, just as implied in the quote
above.
<br />
<br />
Insofar as “compromise” characterizes the above choices, I think
Scalaz's position in the space of possibilities is quite good.
However, I would be loath to see any further relaxing of the principles
I have described here, and I hope you would be too.
<h3>Mistakes are part of history</h3>
Stephen Compall, 2013-06-23<br />
<br />
And sometimes, later, they turn out not to be mistakes at all.<br />
<br />
Has this never happened to you? For my part, sometimes I am mistaken, and sometimes I am even mistaken about what I am mistaken about. So it is worthwhile to keep records of <b>failed experiments</b>.<br />
<br />
You can always delete information later, as a log-<i>viewing</i> tool might, but you can never get it back if you just deleted it in the first place. <br />
<br />
Please consider this, <a href="http://git-scm.com/">git</a> lovers, before performing your next rebase or squashed merge.<br />
<br />
(My favorite VC quote courtesy <a href="http://web.archive.org/web/20070623135616/http://www.gnuarch.org/gnuarchwiki/Arch_quotes#head-b81a3e1c203ebe7579f49675cd1c6e7eb273776b">ddaa of GNU Arch land</a>, of all places)<br />
<br />
<h3>Some type systems are much better than others.</h3>
Stephen Compall, 2012-04-03<br />
<br />
As a shortcut, I previously complained about <a href="http://failex.blogspot.com/2012/03/c-and-java-do-not-have-good-type.html">C and Java's type systems leaving bad impressions</a>. Let's
broaden that to the realm of type system research, avoiding useless
discussion of the relative merits of this or that programming language
as a whole. <b>Some type systems let you express more of the set
of valid programs, or different subsets.</b> Less theoretically, and
because we are human, <b>different inference engines
are better at figuring out types in different situations</b>.
<br /><br />
There is a nice graphic in <cite>Gödel, Escher, Bach</cite> that
illustrates this issue: it displays the universe split into the true
and the false, and elaborate structures covered with appendages
illustrating “proof”. Of all valid programs<a href="#fn-1" name="fnd-1" rel="footnote">¹</a>, many are “typeable”,
meaning that we can prove that they are sound, type-wise, with a
particular type system; conversely, most invalid programs are not
typeable.
<br /><br />
However, you might imagine that the fingers stretch into the invalid
space on occasion, and they don't cover the valid space entirely
either; for any given type system, there are valid programs that are
rejected, and there are invalid programs that are accepted.
<br /><br />
<h4 class="subheading">
Not so simple math</h4>
The goal of type system research is to both accept more valid programs
and reject more invalid programs. Let's consider these three
programs in three different languages, which implement one step of a
divide-and-conquer list reduction strategy.
<br />
<pre class="example">     def pm(x, y, z):                # <span style="font-family: serif;">Python</span>
         return x + y - z

     pm x y z = x + y - z            -- <span style="font-family: serif;">Haskell</span>
       -- <span style="font-family: serif;">inferred as </span>Num a ⇒ a → a → a → a

     int pm(int x, int y, int z) {   /* <span style="font-family: serif;">C</span> */
         return x + y - z;
     }
</pre>
First, the expressive problem in C is easy to spot: you can only add
and subtract <code>int</code>s. You have to reimplement this for each
number type, even if the code stays the same.
<br /><br />
The Haskell expressive problem is a little more obscure, but more
obvious given the inferred type: all the numbers must be of the same
type. You can see this with some partial application:
<br />
<pre class="example">     pm (1::Int)      -- Int → Int → Int
     pm (1::Integer)  -- Integer → Integer → Integer
     pm (1::Float)    -- Float → Float → Float
</pre>
The Python version works on a numeric tower: once you introduce a
float, it sticks. This may be good for you, or it may be bad. If you
are reducing some data with <code>pm</code> and a float sneaks in there, you
won't see it until you get the final output. So with dynamic
promotion, <code>pm</code> works with everything, even some things you
probably don't want.
<br /><br />
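For contrast with Python's silent promotion, here is a small sketch of
what the inferred Haskell type demands when an <code>Int</code> and a
<code>Double</code> meet. The explicit signature and the sample values
are assumptions for illustration; <code>fromIntegral</code> is the
standard Prelude conversion.
<br />
<pre class="example">
pm :: Num a => a -> a -> a -> a
pm x y z = x + y - z

main :: IO ()
main = do
  let n = 3 :: Int
      d = 1.5 :: Double
  -- pm n d 1 is rejected at compile time: all three arguments must
  -- share one type, and Int is not Double.
  -- The promotion has to be written out, so it cannot sneak in.
  print (pm (fromIntegral n) d 1)   -- prints 3.5
</pre>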
There are adjustments you can make to get the Haskell version to be
more general, but this all depends on the kind of generalization you
mean. This is a matter of continued innovation, even in the standard
library; many commonly used libraries provide alternative, more
generic versions of built-in <code>Prelude</code> functions, such as
<code>fmap</code> for functors, one of many, <i>many</i> generalizations
that work with lists, in this case replacing <code>map</code>.
<br /><br />
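As a minimal sketch of that last generalization, the calls below apply
the same function through <code>map</code> and through
<code>fmap</code>. The sample values are made up for illustration; only
standard Prelude functions and instances are used.
<br />
<pre class="example">
main :: IO ()
main = do
  print (map  (+ 1) [1, 2, 3 :: Int])                -- lists only
  print (fmap (+ 1) [1, 2, 3 :: Int])                -- same result via Functor
  print (fmap (+ 1) (Just 41 :: Maybe Int))          -- Maybe is a Functor too
  print (fmap (+ 1) (Right 1 :: Either String Int))  -- so is Either String
</pre>
Replacing <code>fmap</code> with <code>map</code> in the last two lines
is a type error, which is the sense in which <code>map</code> is the
less general of the two.
<br /><br />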
<hr />
<h4>
Footnotes</h4>
<div class="footnote">
<small>[<a href="#fnd-1" name="fn-1">1</a>]</small> Use whatever
meaning you like for “valid”, but if you want something more formal,
perhaps “it terminates” will suffice.</div>
<h3>“Inference” is “proof assistance”.</h3>
Stephen Compall, 2012-04-01<br />
<p>As a Lisper, I'm used to just writing out some code that more or less expresses what I mean, then trying it out. I don't want to mess around with proving that what I'm writing makes sense. Moreover, when I first started learning Haskell, I didn't know nearly enough to begin proving what I meant. </p>
<p>Fortunately, Haskell can figure out all the argument types and return types of very complex functions.<a href="#fn-1" rel="footnote" name="fnd-1">¹</a> You know how to write a function that applies a function to each element of a list, and then combines all the resulting lists into one, so just write it: </p>
<p><a name="index-concatMap-16"></a> </p>
<pre class="example">     concatMap _ [] = []
     concatMap f (x:xs) = f x ++ concatMap f xs
     -- <span style="font-family:serif">inferred type of </span>concatMap<span style="font-family:serif"> is </span>(t → [a]) → [t] → [a]
</pre>
<p><a name="index-concatMapM-17"></a>That's pretty nice; I didn't have to specify a single type, and Haskell figured out not only the types of the arguments and results, one of which was itself a function type, but figured out the precise level of polymorphism appropriate. A frequent mistake when trying to guess this type is writing <code>(a → [a]) → [a] → [a]</code>, which is not as general as the inferred version above.
It will compile, but it unnecessarily (and often fatally) restricts users of the <code>concatMap</code> function.<a href="#fn-2" rel="footnote" name="fnd-2">²</a> </p>
<p><a name="index-proof-assistance-18"></a><a name="index-type-inference-19"></a>So <strong>inference helps you prove things, often avoiding or explaining generalizations you didn't think of.</strong> It is a “proof assistant”. It greatly aids in refactoring, if you continue to rely on it, as you have to fix your proof in fewer places when you change the rules. It's an absolutely vital tool for entry into the typeful world, when you frequently know how to write a function, but not how to express its true, maximally polymorphic type. </p>
<p>Unfortunately, the “proof assistant” can't figure out absolutely everything. Moreover, the semantics of the language and type system affect how much the assistant <em>can</em> prove.<a href="#fn-3" rel="footnote" name="fnd-3">³</a> </p><br /><a rel="footnote" name="fnd-3"></a> <div class="footnote"> <hr /> <h4>Footnotes</h4><p class="footnote"><small>[<a name="fn-1">1</a>]</small> Haskell is really figuring out “the types of very complex expressions”, but that doesn't sound quite so good, despite being quite a bit better.</p>
<p class="footnote"><small>[<a name="fn-2">2</a>]</small> As it happens, we've restricted ourselves to lists, whereas <code>concatMap</code> actually makes sense for all monads, but that's a result of our using the list-only operations for our implementation, not a failure of the type inference. In Haskell terms, <code>concatMapM f xs = join (liftM f xs)</code>, which is inferred as <code>Monad m ⇒ (t → m a) → m t → m a</code>.
Other generalizations are possible, and you can accidentally lock them down in exactly the same way as <code>concatMap</code>, including to our original inferred type.</p>
<p class="footnote"><small>[<a name="fn-3">3</a>]</small> To be overly general, features like mutability and looking up functions by name on a receiver type make inference harder. These are the rule in Simula-inspired object-oriented environments, making inference harder in <acronym>OOP</acronym>, and conversely, easier in functional environments. For example, in the Java expression <code>x.getBlah(y)</code>, you can't infer anything about <code>getBlah</code> until you know the type of <code>x</code>. But in Haskell, <code>getBlah</code> has one known type, albeit perhaps polymorphic or constrained by typeclasses, which can be used to infer things about <code>x</code> and <code>y</code> without necessarily knowing anything else about them.</p></div>
<h3>With a type system, whether you can write a program depends on whether you can prove its correctness.</h3>
Stephen Compall, 2012-03-29 (updated 2012-04-03)<br />
<br />
The trouble with bad type systems is that you have to use “escape hatches” pretty frequently. In C and many of its derivatives, these take the form of <dfn>casts</dfn>, which, in type terms, are like saying “I can't prove this even a little bit, but trust me, it's true. Try it a few times and you'll see.” The traditional cast system is very broad, as it must be to account for the shortcomings of lesser type systems. No one wants to reimplement the collection classes for every element type they might like to use. <br /><br />
<a href="http://www.blogger.com/blogger.g?blogID=1184549185438247550" name="index-Dynamic-9"></a><a href="http://www.blogger.com/blogger.g?blogID=1184549185438247550" name="index-magic-10"></a>After being so impressed by the power of Haskell's inference, many people next discover that they can't put values of different types into lists.<a href="#fn-1" name="fnd-1" rel="footnote">¹</a> Well, that's not quite true; you can always chain 2-tuples together, but that's not what you really <i>mean</i>. Well, what did you mean? <br /><br />
<i>Oh, well, I meant that sometimes the elements of my list are integers, and sometimes they're strings.</i> <br /><br />
<a href="http://www.blogger.com/blogger.g?blogID=1184549185438247550" name="index-IntOrString-11"></a><a href="http://www.blogger.com/blogger.g?blogID=1184549185438247550" name="index-AnInt-12"></a><a href="http://www.blogger.com/blogger.g?blogID=1184549185438247550" name="index-AString-13"></a>Okay, no problem. Put the integer type and string type together as alternatives in a single type:
<pre class="example"> data IntOrString = AnInt Int | AString String
[AnInt 42, AString "hi"] -- <span style="font-family:serif">has type </span>[IntOrString] </pre>
<i>No, I meant that it's alternating integers and strings, one of each for the other.</i> <br /><br />
Well why didn't you say so!
<a href="http://www.blogger.com/blogger.g?blogID=1184549185438247550" name="index-IntAndString-14"></a><a href="http://www.blogger.com/blogger.g?blogID=1184549185438247550" name="index-IntAndString-15"></a>
<pre class="example"> data IntAndString = IntAndString Int String
[IntAndString 42 "hi"] -- <span style="font-family: serif;">has type </span>[IntAndString] </pre>
You can't just stick integers and strings together in a list without proving something about what you mean. <b>To write any program typefully, you have to prove that it sort of makes sense.</b> In the former example, you really meant that each element could be either one, and you have to prove that it's one of those, and not, say, a map, before you can put it in the list. In the latter example, you have to prove that you have exactly one string for each integer that you put into the list. <br /><br />
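A short sketch of what those two proofs buy the code that consumes each
list. The data types are the ones defined above; the functions
<code>describe</code> and <code>label</code> are hypothetical names
introduced only for this illustration.
<br />
<pre class="example">
data IntOrString  = AnInt Int | AString String
data IntAndString = IntAndString Int String

-- Each element is one or the other, so a consumer must handle both cases.
describe :: IntOrString -> String
describe (AnInt n)   = "int " ++ show n
describe (AString s) = "string " ++ s

-- Each element carries both, so a consumer may rely on having both.
label :: IntAndString -> String
label (IntAndString n s) = show n ++ ": " ++ s

main :: IO ()
main = do
  mapM_ (putStrLn . describe) [AnInt 42, AString "hi"]
  mapM_ (putStrLn . label) [IntAndString 42 "hi"]
</pre>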
This permits a more analytical approach to programming than can occur in latent-typed systems. Let's say you had the <code>[IntOrString]</code>, and you realized it was wrong and changed it to <code>[IntAndString]</code> in one module. You have two other modules that are trying to use the lists, and now they don't work, because you didn't prove that you had one string for each integer in those modules. Now nothing loads, and you have to jump around for a bit fixing your proofs until you can test again. This separates the task into two phases: one where you're only thinking about and testing the proofs, and the other where you're thinking about and testing the behavior. <br /><br />
I don't think this is an unqualified improvement over the latent-typed situation. On one hand, breaking tasks down into little bits is the foundation of human software development. Moreover, this example clearly helped us to clarify what we meant about the structure we were building. On the other hand, sometimes I prefer to focus on getting one module right on both type and runtime levels before moving on to the next. This is harder to do with most typeful programming languages, as type errors naturally cascade, and types both described and inferred usually influence runtime behavior. <br /><br />
<hr />
<small>[<a href="#fnd-1" name="fn-1">1</a>]</small> Haskell also has escape hatches, but using them is viewed rather like gratuitous use of <code>eval</code> is by Lispers. Whereas most C and Java programs use casts, very few Haskell programs use <code>Data.Dynamic</code>, just as very few OCaml programs use <code>Obj.magic</code>.
<h3>C and Java do not have good type systems.</h3>
Stephen Compall, 2012-03-25 (updated 2012-03-29)<br />
<p>You know how, once you learn Scheme, or Common Lisp, the idea of a language not providing lambda expressions and still somehow being good is just absurd? There are similar things I discovered about type systems when learning Haskell, as in, “it's just absurd that anyone thinks a type system without this feature is good.” </p><p>The exact features I'll describe later. But if your opinion of type systems is based on the really popular ones, know that those are missing the features in question. To be more direct, <strong>C and Java don't have “good” type systems.</strong> </p><p>Just be wary of forming an opinion of, say, the Haskell type system, based on the very severe limitations of something else.</p>