Re: Label Names Space - philosophy or implementation?

Xavier Leroy (xleroy@pauillac.inria.fr)
Wed, 24 Jul 1996 11:40:27 +0200 (MET DST)

From: Xavier Leroy <xleroy@pauillac.inria.fr>
Message-Id: <199607240940.LAA22489@pauillac.inria.fr>
Subject: Re: Label Names Space - philosophy or implementation?
To: johnm@vlibs.com (John Gerard Malecki)
Date: Wed, 24 Jul 1996 11:40:27 +0200 (MET DST)
In-Reply-To: <199607230040.RAA11401@owl.vlibs.com> from "John Gerard Malecki" at Jul 22, 96 05:40:16 pm

> A co-worker has (vehemently) pointed out to me that record label names
> cannot be shared. Are there good reasons for this "restriction"?

There's one: it simplifies type inference greatly. More formally, it
guarantees the principal type property. Apart from this, I agree it's
a major pain.

Labels that belong to several record types cause no problems as long
as all labels of a record are given. This is always the case at record
construction time { lbl1 = val1; ... lblN = valN }. Then, it's not
hard to determine the type of the record from the labels lbl1 ... lblN.

The problem is with access to a record field x.lbl and matching with
record patterns { lbl = pat }. Consider for instance

let f x = x.lbl

What type should be inferred for x if there are several record types
with a label "lbl"? There are at least three options:

a- Consider that lbl belongs to at most one record type t (the one
declared last) and infer that x : t. That's the Caml Light and
Objective Caml approach.

b- Try to gather more information on how f is used in the remainder of
the program to resolve the ambiguity. If this is not possible, fail
with a type error, requesting the user to put a type annotation

let f (x:t) = x.lbl

This is essentially identical to resolving overloading on arithmetic
operators, e.g. + which is both integer addition and floating-point
addition. Both Caml V3.1 and SML/NJ allow overloading on operators and
on record labels, with more or less clever overloading resolution
strategies. The bad news is that type inference in the presence of
overloading is costly and rejects as ambiguous many perfectly
well-typed definitions, such as

let f x = x.lbl
or
let g x = x + x

c- Introduce record polymorphism, i.e. the function fun x -> x.lbl
is given a type that says "I accept as arguments all records with a
label lbl of type T, and I return a value of type T". On the typing
side, record polymorphism is well understood, reasonably efficient,
and preserves the principal type property. (The typing of objects in
Objective Caml relies on similar techniques.)

The challenge is on the compilation side. With conventional records,
all record accesses are at fixed, statically-known offsets inside the
memory blocks that represent the records. With record polymorphism,
this is no longer true. For instance, in fun x -> x.lbl, the position
of lbl in x cannot be determined statically and has to be computed at
run-time in a way similar to method lookup for objects. This entails a
big performance hit, at least if done naively. It is still unclear how
much of this overhead can be reduced through clever compile-time
analyses.

- Xavier Leroy