The claim of a multi-paradigm language is to harmoniously serve various approaches to programming. The `AnyVal` subclass feature forms a strong counterargument to Scala’s multiparadigm claim.
`AnyVal` subclasses penalize parametric-polymorphic, type-safe programming, in order to better support type-unsafe programming styles, such as those making use of `isInstanceOf`. They sneakily shift the blame for their performance problems onto type safety and polymorphism. I will provide an existence proof that the blame ought to land squarely on `AnyVal` subclasses, but I cannot stop this blame-shifting from lending further credence to the witticism “If scala was the only language I had to think in, I’d think functional programming was a bad idea that didn’t scale, too.”
Moreover, by creating the false impression that the “newtype problem” has been solved in Scala, `AnyVal` subclasses obscure solutions that better serve polymorphic, type-safe programming. While I describe such a solution in this article, I have no illusions that I alone can reverse the upward trend of the `AnyVal` meme.
Scala, today, has the potential to better support type-safe programming, and it has since before the advent of `AnyVal` subclasses. In this article, we will focus on how the language could reveal this potential, becoming a better foundation for polymorphic, type-safe programming than it advertises today.
A `String` reference must be boxed
Suppose that you want a “wrapper” around `String`s with a unique type so that they can’t be accidentally confused with arbitrary `String`s. This is a common use case for a newtype, a wrapper with intentionally incompatible type that exists only at compile time. (The name “newtype” comes from the Haskell keyword for its version of this feature.)
You decide to use `extends AnyVal`, since you have heard that this is a compile-time-only class that doesn’t get allocated on the heap.

    class Label(val str: String) extends AnyVal

    object Label {
      def apply(s: String): Label = new Label(s)
    }

This seems to do the trick with your first several tests.

    class MyFirstTests {
      def combineLabels(l: Label, r: Label): Label =
        Label(l.str + r.str)

      def printLabels(): Unit = {
        val fst = Label("hello")
        val snd = Label("world")
        println(fst.str)
        println(snd.str)
      }
    }

As reported by `javap`, the `new Label` goes away for `Label.apply`.

    // javap -c -cp target/scala-2.12/classes hcavsc.av.Label$
    public java.lang.String apply(java.lang.String);
      Code:
         0: aload_1
         1: areturn

It vanishes for the signature of `combineLabels` too, meaning that we can write some functions over `Label`s without allocating them.

    // javap -cp target/scala-2.12/classes hcavsc.av.MyFirstTests
    public java.lang.String combineLabels(java.lang.String, java.lang.String);

You can even use `Label` in a `case class`, and it will be `String` at runtime.

    case class Labelled[A](lbl: Label, a: A)

    // javap -p -cp target/scala-2.12/classes hcavsc.av.Labelled
    private final java.lang.String lbl;
    private final A a;

But then, you decide that you want a `List` of `Label`s.

    // add to printLabels
    val lbls = List(fst, snd)

    // javap -c -cp target/scala-2.12/classes hcavsc.av.MyFirstTests
        24: iconst_2
        25: anewarray     #56                 // class hcavsc/av/Label
        28: dup
        29: iconst_0
        30: new           #56                 // class hcavsc/av/Label
        33: dup
        34: aload_1
        35: invokespecial #59                 // Method hcavsc/av/Label."<init>":(Ljava/lang/String;)V
        38: aastore
        39: dup
        40: iconst_1
        41: new           #56                 // class hcavsc/av/Label
        44: dup
        45: aload_2
        46: invokespecial #59                 // Method hcavsc/av/Label."<init>":(Ljava/lang/String;)V
        49: aastore
        50: invokevirtual #63                 // Method scala/Predef$.genericWrapArray:(Ljava/lang/Object;)Lscala/collection/mutable/WrappedArray;
        53: invokevirtual #66                 // Method scala/collection/immutable/List$.apply:(Lscala/collection/Seq;)Lscala/collection/immutable/List;

Huh. Didn’t expect those two `new`s to be there. Ah well, maybe now that they’re in the list,

    lbls.map{x => Label(x.str + "Aux")}

    // javap -c -cp target/scala-2.12/classes hcavsc.av.MyFirstTests
    public static final java.lang.Object $anonfun$printLabels$1$adapted(java.lang.Object);
      Code:
         0: new           #61                 // class hcavsc/av/Label
         3: dup
         4: aload_0
         5: checkcast     #61                 // class hcavsc/av/Label
         8: invokevirtual #117                // Method hcavsc/av/Label.str:()Ljava/lang/String;
        11: invokestatic  #119                // Method $anonfun$printLabels$1:(Ljava/lang/String;)Ljava/lang/String;
        14: invokespecial #64                 // Method hcavsc/av/Label."<init>":(Ljava/lang/String;)V
        17: areturn

OK, sure, so you took it out and put it back, so it unboxed and then boxed again. How about a tuple, instead?

    // add to printLabels
    (fst, snd)

    // javap -c -cp target/scala-2.12/classes hcavsc.av.MyFirstTests
        73: new           #103                // class scala/Tuple2
        76: dup
        77: new           #61                 // class hcavsc/av/Label
        80: dup
        81: aload_1
        82: invokespecial #64                 // Method hcavsc/av/Label."<init>":(Ljava/lang/String;)V
        85: new           #61                 // class hcavsc/av/Label
        88: dup
        89: aload_2
        90: invokespecial #64                 // Method hcavsc/av/Label."<init>":(Ljava/lang/String;)V
        93: invokespecial #106                // Method scala/Tuple2."<init>":(Ljava/lang/Object;Ljava/lang/Object;)V

Two more `new`s. Fine. How about the `identity` method?

    // add to printLabels
    identity(fst)

    // javap -c -cp target/scala-2.12/classes hcavsc.av.MyFirstTests
        97: getstatic     #59                 // Field scala/Predef$.MODULE$:Lscala/Predef$;
       100: new           #61                 // class hcavsc/av/Label
       103: dup
       104: aload_1
       105: invokespecial #64                 // Method hcavsc/av/Label."<init>":(Ljava/lang/String;)V
       108: invokevirtual #109                // Method scala/Predef$.identity:(Ljava/lang/Object;)Ljava/lang/Object;

So there seems to be an impressive collection of things that will cause an `AnyVal` subclass to box. You assume there’s a good reason they implemented it this way; we’ll get into that later.
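The boxing seen in the `javap` output can also be observed from ordinary code, without a disassembler. The following is a minimal sketch, not part of the original experiments: `BoxingDemo` and `sameRef` are hypothetical names, and the check relies on the assumption that each value-class argument passed to a type-parametric method gets its own freshly allocated wrapper.

```scala
object BoxingDemo {
  // The value class from the article, nested so the block stands alone.
  final class Label(val str: String) extends AnyVal

  // Hypothetical helper: a type-parametric reference comparison.
  // `A` erases to Object, so a Label argument must be boxed to get in.
  def sameRef[A](x: A, y: A): Boolean =
    x.asInstanceOf[AnyRef] eq y.asInstanceOf[AnyRef]

  val raw: String = "hello"
  val lbl: Label = new Label(raw)

  // Each Label argument is wrapped separately, so the two references differ.
  val labelsShareRef: Boolean = sameRef(lbl, lbl)

  // Plain String references pass through untouched.
  val stringsShareRef: Boolean = sameRef(raw, raw)
}
```

Passing the same `Label` twice yields two distinct wrapper objects, while passing the same `String` twice yields one reference, which is the allocation the bytecode listings show.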
No boxing with type tags
However, you decide to look for an alternative newtype mechanism that doesn’t box, under the theory that `scalac`’s reasons for boxing `AnyVal` subclasses don’t apply to the use cases you have in mind for `Label` and similar things in your codebase.
You have heard that Scalaz’s “type tags” are a kind of newtype with no boxing. You could just pull in `scalaz-core` and see if you can get them to work, but decide to implement `Label` directly using the same technique as Scalaz tags, instead.

    object Labels {
      sealed abstract class LabelImpl {
        type T
        def apply(s: String): T
        def unwrap(lbl: T): String
      }

      // do not forget `: LabelImpl`; it is key
      val Label: LabelImpl = new LabelImpl {
        type T = String
        override def apply(s: String) = s
        override def unwrap(lbl: T) = lbl
      }

      type Label = Label.T
    }

    import Labels._

While regretting that the compiler no longer makes your `Label` type very convenient to define, you press on. First, to confirm, you can’t treat an arbitrary `String` as a `Label`:

    scala> "hi there": Label
    <console>:15: error: type mismatch;
     found   : String("hi there")
     required: hcavsc.subst.Labels.Label
        (which expands to)  hcavsc.subst.Labels.Label.T
           "hi there": Label
           ^

So far, so good. Then, why not retry some of the earlier experiments that caused the `AnyVal`-based label to box?

    // javap -c -cp target/scala-2.12/classes hcavsc.subst.MyFirstTests

    val fst = Label("hello")
    val snd = Label("world")

    identity(fst)
        24: getstatic     #43                 // Field scala/Predef$.MODULE$:Lscala/Predef$;
        27: aload_1
        28: invokevirtual #47                 // Method scala/Predef$.identity:(Ljava/lang/Object;)Ljava/lang/Object;

    (fst, snd)
        32: new           #49                 // class scala/Tuple2
        35: dup
        36: aload_1
        37: aload_2
        38: invokespecial #53                 // Method scala/Tuple2."<init>":(Ljava/lang/Object;Ljava/lang/Object;)V

    val lbls = List(fst, snd)
        48: iconst_2
        49: anewarray     #4                  // class java/lang/Object
        52: dup
        53: iconst_0
        54: aload_1
        55: aastore
        56: dup
        57: iconst_1
        58: aload_2
        59: aastore
        60: invokevirtual #62                 // Method scala/Predef$.genericWrapArray:(Ljava/lang/Object;)Lscala/collection/mutable/WrappedArray;
        63: invokevirtual #65                 // Method scala/collection/immutable/List$.apply:(Lscala/collection/Seq;)Lscala/collection/immutable/List;

    lbls.map{x => Label(Label.unwrap(x) + "Aux")}
    public static final java.lang.Object $anonfun$printLabels$1(java.lang.Object);
      Code:
         0: getstatic     #26                 // Field hcavsc/subst/Labels$.MODULE$:Lhcavsc/subst/Labels$;
         3: invokevirtual #30                 // Method hcavsc/subst/Labels$.Label:()Lhcavsc/subst/Labels$LabelImpl;
         6: new           #104                // class java/lang/StringBuilder
         9: dup
        10: invokespecial #106                // Method java/lang/StringBuilder."<init>":()V
        13: getstatic     #26                 // Field hcavsc/subst/Labels$.MODULE$:Lhcavsc/subst/Labels$;
        16: invokevirtual #30                 // Method hcavsc/subst/Labels$.Label:()Lhcavsc/subst/Labels$LabelImpl;
        19: aload_0
        20: invokevirtual #110                // Method hcavsc/subst/Labels$LabelImpl.unwrap:(Ljava/lang/Object;)Ljava/lang/String;
        23: invokevirtual #114                // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
        26: ldc           #116                // String Aux
        28: invokevirtual #114                // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
        31: invokevirtual #120                // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
        34: invokevirtual #36                 // Method hcavsc/subst/Labels$LabelImpl.apply:(Ljava/lang/String;)Ljava/lang/Object;
        37: areturn

No allocation there. Hmm. Well, maybe our concrete `LabelImpl` instance is doing some secret boxing?

    // javap -c -cp target/scala-2.12/classes 'hcavsc.subst.Labels$$anon$1'
    public java.lang.String apply(java.lang.String);
      Code:
         0: aload_1
         1: areturn

    public java.lang.Object apply(java.lang.String);
      Code:
         0: aload_0
         1: aload_1
         2: invokevirtual #27                 // Method apply:(Ljava/lang/String;)Ljava/lang/String;
         5: areturn

    public java.lang.String unwrap(java.lang.String);
      Code:
         0: aload_1
         1: areturn

    public java.lang.String unwrap(java.lang.Object);
      Code:
         0: aload_0
         1: aload_1
         2: checkcast     #21                 // class java/lang/String
         5: invokevirtual #23                 // Method unwrap:(Ljava/lang/String;)Ljava/lang/String;
         8: areturn

No boxing there. That makes sense; in that context, `Label` is `String`; the fact that our `Label`-using code doesn’t know that is irrelevant, because we hid that information using existential types.
So, it is possible to have a newtype mechanism that doesn’t box. You don’t have to wait for the JVM to deliver its own brand of value types; you can even implement it yourself, in Scala, today. They must have had another reason for all this boxing, because “we have to because JVM” is denied by the behavior of Scala-JVM itself.
You aren’t sure what those reasons are, but you decide to port the rest of your code to use the existential `Label`. Befitting an unboxed newtype, the runtime representation of `List[Label]` is exactly the same as the underlying `List[String]`, as well as every `Option`, `Either`, and whatever else you can think up.
You notice that the erasure for `Label` is different, but this seems significantly less serious than the boxing problem, so leave it for now. (We will dig into related design decisions later.)
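The same claim can be checked at runtime rather than in bytecode. This is a small sketch under assumptions: `TagDemo` and `sameRef` are hypothetical names introduced for illustration, and the `Labels` object is repeated from the article so that the block compiles on its own.

```scala
object Labels {
  sealed abstract class LabelImpl {
    type T
    def apply(s: String): T
    def unwrap(lbl: T): String
  }

  // The `: LabelImpl` ascription hides the fact that T = String.
  val Label: LabelImpl = new LabelImpl {
    type T = String
    override def apply(s: String): T = s
    override def unwrap(lbl: T): String = lbl
  }

  type Label = Label.T
}

object TagDemo {
  import Labels._

  // Hypothetical helper: a type-parametric reference comparison.
  def sameRef[A](x: A, y: A): Boolean =
    x.asInstanceOf[AnyRef] eq y.asInstanceOf[AnyRef]

  val raw: String = "hello"
  val lbl: Label = Label(raw)

  // No wrapper is created on the way into the polymorphic method.
  val labelsShareRef: Boolean = sameRef(lbl, lbl)

  // Unwrapping hands back the very same String object.
  val unwrappedIsRaw: Boolean = Label.unwrap(lbl) eq raw

  // At runtime a Label is an instance of java.lang.String.
  val runtimeClass: Class[_] = lbl.asInstanceOf[AnyRef].getClass
}
```

Compared with an `AnyVal` subclass, the only thing separating `Label` from `String` here is the type checker; nothing is left to allocate.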
What can you do with a box? What can you do without a box?
Let’s start with a quick comparison of boxing `AnyVal` and the “type tagging” mechanism we’ve just seen.
Capability | AnyVal subclass | Type tag |
---|---|---|
Defining methods | normal override; virtual method dispatch available | implicit class enrichment only |
lbl.getClass | Label | String |
Cast Any to Label | checked at runtime | unchecked; no wrapper left at runtime |
isInstanceOf | checked at runtime | unchecked; same reason casting doesn’t work |
Adding type parameters to methods | boxing/unboxing penalty | no boxing penalty |
Wrapping a List | O(n): box every element and reallocate list itself | O(1), with subst: no allocation, output list eq to input list |
Unwrapping a list | O(n): reallocate list, unbox each element | O(1): eq output with subst. Also possible to make unwrapping a <: (free liftable automatic upcast) |
Coinductive type class instances | works; boxing penalty applies | works; no boxing penalty |
Wrapping whole program parts | each function must be wrapped to add per-value wrapping/unwrapping | O(1): just works with subst |
I detect from this matrix a particular theme: `AnyVal` subclasses give up a lot of capability in the type-safe arena. Consider rewriting a loop that uses `Label` as state as a `foldLeft`: you must contend with a new boxing/unboxing penalty, since the state parameter in a `foldLeft` is type-parametric. It’s more fodder for the persistent higher-order function skeptics among us.
While we know that adding type parameters to our functions improves type-safety, the skeptic will note the boxing penalty, and attribute it to parametric polymorphism. But we know the true culprit.
If `AnyVal` subclassing taxes type-safe programming in these ways, what is it spending the money on? Simple: support for `isInstanceOf`, “safe” casting, implementing interfaces, overriding `AnyRef` methods like `toString`, and the like.
As type-safe, parametrically-polymorphic programmers, we avoid these features, as a matter of principle and of practice. Some, like checked casting, are simply not type-safe. Some ruin free theorems, like `toString`, and we would prefer safe mechanisms, like the `Show` typeclass, to actually tell us at compile time if our programs make sense. Yet, if we use `AnyVal` subclasses, we have to pay the price for all the programmers that wish to write type-unsafe code, like `List[Any] => List[Label]`. All is not well in Multiparadigmatic Land.
When will our methods be resolved?
To showcase the relationship of the two approaches to runtime-reflective programming versus statically-proven programming, let’s consider stringification.
Scala provides the `toString` virtual method on `Any`. Calling this method is dynamically resolved on the value itself; it is as if every value must carry around a pointer to a function that, given itself, returns a `String`. We can define this for our original `AnyVal`-based `Label`, and so `toString` on `List` et al will also work.

    // add to class Label
    override def toString = s"Label($str)"

    scala> List(fst, snd).toString
    res0: String = List(Label(hello), Label(world))

    scala> Some(fst).toString
    res1: String = Some(Label(hello))

Moreover, this “works” even for the type `List[Any]`.

    scala> List[Any](fst, "hi").toString
    res2: String = List(Label(hello), hi)

You cannot override `toString` for our fully-erased `Label`. After all, every `Label` is just a `String` at runtime! (Different types, same class.)
However, the type-safe programmer will recognize `List[Any]` as a type that, if it occurs in her program, means “something has gone wrong with this program”. Moreover, because `toString` doesn’t make sense for all types, we use a static mechanism, like the `scalaz.Show` typeclass. And this works fine for `Label`, because it is statically resolved by type, not dependent on an implicit runtime member of every `Label`; in fact, it can only work because it is static!

    // add to object Labels
    import scalaz.Show

    implicit val showLabel: Show[Label] =
      Show shows {lbl => s"Label(${Label.unwrap(lbl)})"}

    scala> import scalaz.syntax.show._, scalaz.std.list._, scalaz.std.option._

    scala> List(fst, snd).shows
    res1: String = [Label(hello),Label(world)]

    scala> some(fst).shows
    res2: String = Some(Label(hello))

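To make the “statically resolved by type” point concrete without pulling in Scalaz, here is a dependency-free sketch. The `Show` trait, `showList`, and the `shows` helper are hand-rolled, hypothetical stand-ins for the Scalaz versions, written only to illustrate the mechanism; they are not the library’s definitions.

```scala
object ShowDemo {
  // The tagged Label, repeated so the block stands alone.
  sealed abstract class LabelImpl {
    type T
    def apply(s: String): T
    def unwrap(lbl: T): String
  }
  val Label: LabelImpl = new LabelImpl {
    type T = String
    override def apply(s: String): T = s
    override def unwrap(lbl: T): String = lbl
  }
  type Label = Label.T

  // Hypothetical stand-in for scalaz.Show.
  trait Show[A] { def shows(a: A): String }

  // Chosen by the static type Label, not by the runtime class String.
  implicit val showLabel: Show[Label] = new Show[Label] {
    def shows(lbl: Label): String = s"Label(${Label.unwrap(lbl)})"
  }

  // The list instance is assembled from the element instance at compile time.
  implicit def showList[A](implicit A: Show[A]): Show[List[A]] =
    new Show[List[A]] {
      def shows(as: List[A]): String = as.map(A.shows).mkString("[", ",", "]")
    }

  def shows[A](a: A)(implicit A: Show[A]): String = A.shows(a)

  val rendered: String = shows(List(Label("hello"), Label("world")))

  // shows(List[Any]("hi")) is rejected by the compiler: no Show[Any] exists.
}
```

Because the instance is selected from the type alone, the fact that every `Label` is a plain `String` at runtime never gets a chance to interfere.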
So if you are doing this kind of programming, it doesn’t matter whether you can’t override `toString`, or type test, &c; you weren’t doing it anyway. But, aside from a little performance bump, what do you gain from unboxed type-tagging?
When is a `Label` a `String`? When is it not?
You notice that `subst` is at the foundation of several Scalaz constructs like `Leibniz` and `Liskov`, and plays a prominent role in the `Tag` API as well. You decide to add this to your `LabelImpl` as well.

    // in LabelImpl
    def subst[F[_]](fs: F[String]): F[T]

    // and in val Label
    override def subst[F[_]](fs: F[String]) = fs

It’s interesting that you can use this to tag a whole `List[String]` in constant time:

    scala> val taggedList = Label.subst(List("hello", "world"))
    taggedList: List[Label.T] = List(hello, world)

It’s also interesting that you can use this to untag a whole list in constant time.

    scala> Label.subst[Lambda[x => List[x] => List[String]]](identity)(taggedList)
    res0: List[String] = List(hello, world)

Functions and typeclass instances can be tagged or untagged, too.

    scala> Label.subst[Lambda[x => (x, Int) => x]](_ substring _)
    res1: (Label.T, Int) => Label.T = $$Lambda$3194/964109489@72557d64

    scala> import scalaz.Monoid, scalaz.std.string._

    scala> Label.subst(Monoid[String])
    res3: scalaz.Monoid[Label.T] = scalaz.std.StringInstances$stringInstance$@252798fe

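For readers without Kind Projector or Scalaz on the classpath, the same moves can be sketched with plain type aliases in place of the `Lambda[...]` syntax. The names `SubstDemo`, `Untag`, and `BinOp` are hypothetical, introduced only for this illustration; on Scala 2.12 the higher-kinded type parameter may additionally produce a feature warning.

```scala
object SubstDemo {
  sealed abstract class LabelImpl {
    type T
    def apply(s: String): T
    def unwrap(lbl: T): String
    def subst[F[_]](fs: F[String]): F[T]
  }
  val Label: LabelImpl = new LabelImpl {
    type T = String
    override def apply(s: String): T = s
    override def unwrap(lbl: T): String = lbl
    // Inside the implementation T is String, so this is just the identity.
    override def subst[F[_]](fs: F[String]): F[T] = fs
  }
  type Label = Label.T

  // Named aliases stand in for type lambdas.
  type Untag[X] = List[X] => List[String]
  type BinOp[X] = (X, X) => X

  val strings: List[String] = List("hello", "world")

  // Tag the whole list at once; no element is visited.
  val tagged: List[Label] = Label.subst[List](strings)

  // Turn the identity on List[String] into an untagging function.
  val untag: List[Label] => List[String] =
    Label.subst[Untag]((xs: List[String]) => xs)

  // Tag a function over Strings so that it operates on Labels.
  val concat: (Label, Label) => Label =
    Label.subst[BinOp]((a: String, b: String) => a + b)
}
```

The tagged list is the input list itself, which is what makes the operation constant-time regardless of length.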
All of this works because `subst` is really evidence that, deep down, `String` and `Label` are the same.

    scala> import scalaz.Leibniz, Leibniz.{===, refl}

    scala> Label.subst[String === ?](refl)
    res4: Leibniz[Nothing,Any,String,Label.T] = scalaz.Leibniz$$anon$2@702af12c

Yet, you ran an experiment earlier to prove that you can’t confuse `String` and `Label`; indeed, this still holds true, despite the presence of `subst`!

    scala> "still a string": Label
    <console>:21: error: type mismatch;
     found   : String("still a string")
     required: hcavsc.subst.Labels.Label
        (which expands to)  hcavsc.subst.Labels.Label.T
           "still a string": Label
           ^

    scala> Label("still a label"): String
    <console>:21: error: type mismatch;
     found   : hcavsc.subst.Labels.Label.T
     required: String
           Label("still a label"): String
                ^

Here’s what’s happening: in a sense, `(new Label(_)): (String => Label)` and `(_.str): (Label => String)` witness that there’s a conversion between the two types. `subst` witnesses that there’s identical runtime representation between its own two types. You get to selectively reveal this evidence when it makes writing your program more convenient; the rest of the time, it is hidden.
But I would like to step one level up: this is a design space, and `subst` as we have seen it isn’t appropriate for all designs. As the author of your own abstract newtypes, you get to choose how much, if any, of this underlying type equality to reveal.
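As one illustration of a different point in that design space, the comparison table mentions that unwrapping can be made a free upcast. A minimal sketch of that idea follows, assuming an upper bound on the abstract type is an acceptable way to encode it; `TranslucentLabels` is a hypothetical name, and this is not necessarily the exact encoding the author has in mind.

```scala
object TranslucentLabels {
  sealed abstract class LabelImpl {
    // The bound reveals one direction only: every T is a String.
    type T <: String
    def apply(s: String): T
  }
  val Label: LabelImpl = new LabelImpl {
    type T = String
    override def apply(s: String): T = s
  }
  type Label = Label.T

  val lbl: Label = Label("hello")

  // Unwrapping is now an ordinary upcast; no unwrap call is needed.
  val asString: String = lbl

  // Covariance lifts the upcast through List for free.
  val lifted: List[String] = List(lbl)

  // val bad: Label = "hello"   // still rejected: String is not a Label
}
```

Here a `Label` can be used wherever a `String` is expected, but an arbitrary `String` still cannot pass as a `Label`.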
If `subst` is the right choice
For various reasons, the above is how Scalaz `Tag` (`@@`) is defined. If you wish these semantics, you might as well throw everything else away and write

    sealed trait LabelTag // no instances
    type Label = String @@ LabelTag
    val Label = Tag.of[LabelTag]

and take advantage of the convenient tools around `subst` defined in `Tag.Of`. But it’s not the only choice! It’s one point in the design space. To do right by your API users, it’s worth exploring that design space a little more.
Type-unsafe code isn’t type-safe
Unboxed existential tagging spreads through your codebase. You feel free to apply it liberally, because you know you aren’t paying the wrapping costs of `AnyVal` subclasses; all these new abstraction layers are pure type-level, and fully erased.
You receive a “bug report” from a fellow developer that this expression never seems to filter out the non-label `String`s.

    (xs: List[Any]).collect{case t: Label => t}

    <console>:16: warning: abstract type pattern hcavsc.translucent.Labels.Label.T (the underlying of hcavsc.translucent.Labels.Label) is unchecked since it is eliminated by erasure
           (xs: List[Any]).collect{case t: Label => t}
                                           ^
    <console>:16: warning: The outer reference in this type test cannot be checked at run time.
           (xs: List[Any]).collect{case t: Label => t}
                                           ^

Your mind on safe pattern matching practice, you add `def unapply(s: String): Option[T]` to `LabelImpl` and counsel preference for the form `case Label(t) => ...`, as well as to not ignore `-unchecked` warnings.
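An extractor of this shape only has something to decide if there is a testable property that separates labels from other strings. The sketch below assumes one: the rule that a label is a non-empty run of letters and digits is purely hypothetical, as is the name `ValidatedLabels`.

```scala
object ValidatedLabels {
  sealed abstract class LabelImpl {
    type T
    def unapply(s: String): Option[T]
    def unwrap(lbl: T): String
  }
  val Label: LabelImpl = new LabelImpl {
    type T = String
    // Hypothetical rule: a label is a non-empty run of letters and digits.
    override def unapply(s: String): Option[T] =
      if (s.nonEmpty && s.forall(_.isLetterOrDigit)) Some(s) else None
    override def unwrap(lbl: T): String = lbl
  }
  type Label = Label.T

  val xs: List[Any] = List("abc", 42, "not a label!", "x1")

  // The pattern first performs a checked String test, then runs the validation.
  val labels: List[Label] = xs.collect { case Label(t) => t }

  val unwrapped: List[String] = labels.map(Label.unwrap)
}
```

Unlike the type pattern `case t: Label`, this form does not depend on any runtime trace of the `Label` type, so erasure cannot silently defeat it.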
You get another bug report that this always seems to succeed.
(s: String).asInstanceOf[Label]
Repeating your advice about warnings, you start to wonder, “where is this kind of code coming from?”
Someone else complains that they want to make `T extends Ordered[T]`, and can’t fathom where the code should go. You advise the static approach of implementing the `Ordering` typeclass instance instead for `T`, wonder how deep the object-orientation hole goes, and forward the link about the typeclass pattern again, too.
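One way the advised `Ordering` instance might look is sketched here, assuming the `subst` variant of `LabelImpl` from earlier in the article; `OrderedLabels` and `labelOrdering` are hypothetical names.

```scala
object OrderedLabels {
  sealed abstract class LabelImpl {
    type T
    def apply(s: String): T
    def unwrap(lbl: T): String
    def subst[F[_]](fs: F[String]): F[T]
  }
  val Label: LabelImpl = new LabelImpl {
    type T = String
    override def apply(s: String): T = s
    override def unwrap(lbl: T): String = lbl
    override def subst[F[_]](fs: F[String]): F[T] = fs
  }
  type Label = Label.T

  // Reuse the standard String ordering as the Label ordering.
  implicit val labelOrdering: Ordering[Label] =
    Label.subst[Ordering](Ordering.String)

  // `sorted` picks up labelOrdering from the enclosing scope.
  val sortedLabels: List[String] =
    List(Label("pear"), Label("apple"), Label("fig")).sorted.map(Label.unwrap)
}
```

The instance lives next to the type rather than inside it, so no subclassing of `T` is needed, and the existing `String` instance is reused as-is.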
Suppose you went back to `AnyVal`
We’ve seen that `AnyVal` subclasses could have been incredibly cheap, but aren’t, so as to support “features” like checked casting. Who’s going to foot the bill?
- Oh, this allocates when passing through polymorphic contexts, but not monomorphic ones? Avoid polymorphic code.
- Oh, this extra type-safety adds all this allocation? Type safety is expensive at runtime; we need to stick to `String`.
- We can’t do any better; the JVM limits the possibilities. You have to pay for runtime class wrapping if you want a wrapper type.
In this article, I have demonstrated that none of these conclusions are correct. However, only a tiny minority of Scala practitioners will ever read this article, and I will not blame the rest for drawing these seemingly straightforward inferences, ultimately faulty as they are.
The real cost of `AnyVal` subclasses is not all the needless memory allocation. The real cost is the damage to the practice of type-safe programming in Scala. It’s in all the curious developers who sought to add a little more type safety to their programs, only to find themselves penalized by the runtime, once bitten. It’s in the reinforcement of this attitude towards abstraction that they’ll continue to carry with them, the next time an opportunity presents itself. It’s a missed opportunity for pure type-level thinking, all so that `asInstanceOf` “works”.
See “…and the glorious `subst` to come” for further development of the ideas in this article.
This article was tested with Scala 2.12.1, Scalaz 7.2.10, and Kind Projector 0.9.3. The code is available in compilable form for your own experiments via Bazaar.
Great post! I released a small project months ago that generates valueclasses and newtypes in compile time according to any given type. I'd love your feedback on that. https://github.com/vitorsvieira/dilate
This is a thoughtful post. I too shuddered when I saw the introduction of AnyVal.
However, do we place too much blame on Scala? Scala is trying to work around the limitations of the JVM; in my understanding, there is not a cheap way to do value classes in the JVM.
It does seem like Scala could be more consistent... IE - always erase the value class, rather than have special rules in different circumstances. And, throw a fat compiler warning when matching on an AnyVal value class? Would this help?
As part of this post, I demonstrate an alternative implementation as you describe, "always erase the value class", and moreover, that this has nothing to do with JVM limitations; there is no need to wait for a JVM "value class" mechanism such as is in the pipeline.
"The real cost is the damage to the practice of type-safe programming in Scala."
Is this "cost" hidden to the developer using Value Classes? If not, i.e. it's a direct downside, could you please demonstrate the loss of type safety with AnyVal?
The loss is in that people are discouraged from using type-safe mechanisms. Folks are more likely to do "stringly-typed" programming if it appears that type-safe alternatives have significant runtime costs.
There is a downside to tagged types though: they will *always* box value types. AnyVal can avoid boxing of value types in many situations. Unfortunately there does not seem to be a way to get the best of both worlds.
See section "Boxing Ints" of the followup article, https://failex.blogspot.com/2017/04/and-glorious-subst-to-come.html . In short, you can get a tagged Int to erase to int, &c.
The need for an upper bound is not ideal, as discussed in section "T = String translucency" of the same article.
Thanks for a very informative explanation about the pros/cons of AnyVal vs. type tags and of some of the pitfalls involved. Have you considered exploring applying type tags to Spark?
I haven't considered any specific library, since they are a very general-purpose tool, particularly in codebases that prefer type-safety.
I think it's an interesting exercise to pick out one informality (something represented by String or Int or some such that really deserves its own type), refine it with a type tag to formalize the abstraction, and see what happens when you try to get it to compile. Because "be careful" is just a joke strategy, you'll likely find several bugs where unverified strings/ints/&c "leaked" into the informal stringly "type". Figuring out what to do then can be a real challenge, I think.
So if you're a fan of Spark, you're far better placed to experiment with type tags therein than I am. Let me know how it goes :)
Ok; I will keep you posted.
Excellent article; I hope you can answer a follow up. Unfortunately `toString` is not going away. I'd like to use type tags to protect sensitive information (most importantly, making it hard to log that info accidentally). That means overriding `toString`. Is there any way type tags can help or am I stuck with `AnyVal`? (I don't have much hope but ...)
Thanks!
Not as long as you accept the premise that you must use `toString`. Its presence doesn't imply a requirement to use it; Scalaz's `Show` typeclass is one example of an alternative, type-safe approach, but is not any more a one-size-fits-all than toString is.
As long as you recognize that dynamic, unchecked stringification isn't some law of physics, but a deliberate choice that cuts you off from other possibilities (e.g. a logging system may certainly reject printing values that don't meet some particular typeclass constraint, and that includes wrappers you may write for unsafe logging systems), that's half the battle.
I didn't recall this while writing the article, but there's a good chance I internalized the basic idea of how to do this from this scala-user post by Julian Michael (@julianmichael on github).
So, to be safe, assume I invented none of this, only poked around with the bytecode/JS output for a while, and got all the real ideas from OCaml and/or Julian's post.
Thanks for the article, I like it. I only have a few small nits:
1. I'm not sure I get this part:
> Your mind on safe pattern matching practice, you add def unapply(s: String): Option[T] to LabelImpl and counsel preference for the form case Label(t) => ..., as well as to not ignore -unchecked warnings.
unless you take "this code can't be fixed" as so obvious that it's not worth mentioning.
2. Also, unless I miss something, a small correction is needed.
I can write `def unapply(s: String): Some[T] = Some(apply(s))` so that it always succeeds (so I prefer `Some[T]` to `Option[T]` as return value, but either works), but `Label.unapply` should be `def unapply(l: Label): Some[String] = Some(unwrap(l))`. Luckily, inside `val Label` the calls to `apply` and `unwrap` can be elided.
3. Would `@inline` work on the accessors (and on `subst`) to remove the remaining (small) runtime overhead? I fear not, and I suspect it might be worth optimizing.
Thanks for your notes, Paolo; I'll try to address them.
Regarding #1:
Right, it can't be fixed, because the behavior of type patterns cannot be overridden.
Regarding #2:
The goal of the user in this case is to distinguish between non-Label strings and Labels in a list of values of unknown "type". So we can't use your suggested signature for two reasons: (1) the premise 'l: Label' is not satisfied, and we cannot get the pattern matcher to do the right thing, because type patterns cannot be customized; (2) the intention, indeed, is to return 'None' for certain arguments, so using 'Some' would be inappropriate.
This confusion is understandable, and I apologize for causing it. It assumes somewhat the context of "unidirectional 'subst'" in the followup article "...and the glorious 'subst' to come", which changes the relationship between Label and String in two ways: (1) 'apply' is non-total, i.e. validating; (2) the thing that makes a String a Label is no longer 'it passed through an apply/subst barrier', it is 'the characters fit some particular pattern that can be tested'. In this situation, it's not meaningful to declare 'Some' as the return type, and this is the only way the 'unapply' implementation can make a sensible determination about whether to answer 'Some' or 'None'.
Regarding #3:
I optimistically '@inline'd several appropriate points in the new Scalaz 7.3 type tag implementation, but as of Scala 2.12.1 it doesn't help. So your fears are well-founded. Scala.js users are in luck, though: my test in "...and the glorious subst to come" shows that 'subst' is elided beautifully even in 'fastOptJS' with no '@inline' markers at all.
I have not tested it, but given the stated semantics of 'inline' from "inline-meta", I expect that inlining mechanism to do as well as Scala.js already does. But we'll have to wait and see.
On #3: Dotty-linker should detect interfaces with single implementations, so it should be able to apply inlining. I suspect Scala.js does something similar. In fact, if you seal the interface, detecting there's a single implementation should become easier (though I guess Scalac/Dotty might not do it yet).
Beware: when I say "should", I only mean "it seems possible to implement this optimization in the current architecture AFAICS".
For the rest, I'll take a look at the followup post. Thanks for your answers!
Related Scala issue: "Universal traits shouldn't add allocation to a value class"
See: https://github.com/scala/bug/issues/9731
Suppose I have:
sealed trait Shape
sealed trait Square extends Shape
sealed trait Circle extends Shape
sealed trait Line
sealed trait Spline extends Line
sealed trait Strait extends Line
val q: String @@ Square = ...
val c: String @@ Circle = ...
val s: String @@ Spline = ...
The following compiles; worse, it may even succeed if their String value happens to be equal!
if (q == c) ...
Is there a way to prevent this?
Nicolas,
You have a few tools at your disposal.
First, your code example does compile, but gives a warning, "AnyRef{type Tag = shapes.Square; type Self = String} and AnyRef{type Tag = shapes.Circle; type Self = String} are unrelated: they will most likely never compare equal". Turning on fatal warnings may be called for. Also, this warning may be hidden behind -Xlint; I'm not sure.
More generally, == is not safe because it is not type-safe. That's why various alternatives have sprung up.
First, Wartremover lets you ban == from your code, and suggests a simple alternative, which will give you a compiler error for your example. This is a good starting point for any of the below.
Going a little bit further with safety is univeq, which lets you limit which types can be passed to an equality check, but still uses the Object#equals Java method for implementing the function.
For even more safety and control, Scalaz, Cats, and Spire define full-powered equality typeclasses, which bypass the incomplete Object#equals method entirely.
Any typeclass-based approach is easy to extend to your tagged types by calling subst on the underlying typeclass instance.
It's up to you how far you want to go. But out-of-the-box == is simply incompatible with safety considerations.