The purpose of this post is to discuss how to reconcile expressivity and efficiency in a language with parametric polymorphism, without losing implementation simplicity. Feedback is most welcome. However, I take it for granted that you appreciate the benefits of parametric polymorphism and static type safety in general. They can be legitimately questioned – somewhere else.

The simplest implementation strategy for parametric polymorphism is type erasure: eliminate all type tags and only then translate the resulting untyped program. Its main disadvantage is that it requires a uniform representation for values of all types, which is inefficient when it’s not strictly necessary.

The simplest implementation strategy for parametric polymorphism that doesn’t require a uniform representation for values of all types is monomorphization: specialize all polymorphic definitions to monotypes and only then translate the resulting simply typed program. Its main disadvantage is that, to the best of my knowledge, it only works for programs with rank-1 types.

There exist other implementation strategies, such as type passing, but those are considerably more complex and I won’t discuss them here.

My claim is that programming languages can benefit from an explicit distinction in their syntax and semantics between two kinds of parametric polymorphism: one that guarantees monomorphization and another that supports impredicative type variable instantiation. From now onwards I will refer to these as “ML-style” and “F-style” polymorphism, respectively.

ML-style polymorphism is, as its name suggests, what vanilla ML gives you: Let-bound variables have type schemas, which are implicitly instantiated when the variable is used. Lambda-bound variables have monotypes. For example:

datatype 'a list = nil | :: of 'a * 'a list fun len (n, nil) = n | len (n, _ :: xs) = len (n + 1, xs) fun length xs = len (0, xs)

F-style polymorphism is, again as its name suggests, what System F gives you: Polymorphic definitions are type functions, which are explicitly instantiated by applying them to monotypes. The type of a type function is itself a monotype. For example:

fun stack (a : type) = let (* The constructor for reference cells has type: * * val ref : pi (a : type) -> a -> a ref * * Note that “a” is *not* a unification variable. *) val cell = ref (a list) nil fun push x = cell := x :: !cell fun pop () = case !cell of | x :: xs => cell := xs; x | _ => raise Empty in { push = push, pop = pop } end

In my experience, ML-style polymorphism is what you want between 80% (if you’re a hardcore Haskeller) and 99% of the time (if you just want generic containers and algorithms). Designing a language construct for this common case is a no-brainer: the efficiency benefits alone justify it, and the better type inference seals the deal. On the other hand, sometimes you want to express fancier things like lenses, free applicatives and reflection-without-remorse free monads. In these cases, you do need F-style polymorphism, and I do find myself sorely missing it when I use Standard ML.

Does monomorphisation also require whole-program compilation?

Monomorphization doesn’t require whole-program compilation, but it gets in the way of separate compilation to some extent. Under this scheme, parametrically polymorphic definitions can’t be directly translated to machine code. Instead, they are translated to an intermediate representation that keeps enough type information to support specialization.

One way to think about it is that monomorphization is compatible with modularity for type-checking purposes, but less so for translation purposes.