Version française
Home     About     Download     Resources     Contact us    
Browse thread
[newbie] Define and use records in sum types
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: -- (:)
From: Frank Atanassow <franka@c...>
Subject: Re: [newbie] Define and use records in sum types
Markus Mottl writes:
 > > Can you be more specific about "capturing logical invariants"? I don't
 > > know quite what you mean.
 > 
 > If you have axioms that specify invariant properties about the types and
 > functions in question. E.g. that removing an element from a set results in
 > a set that does not contain it and that this is true for all sets.
 > 
 > Exporting auxiliary functions from the module might allow the programmer to
 > violate this condition. The same is true for not restricting the types of
 > the functions. Therefore, you need both means of information hiding to
 > guarantee correctness (i.e. the properties in question).

I don't think that has anything to do with hiding values, or if it does, it
needn't. Suppose I want to define a type of sets, but I don't want to reveal
the fact that I'm using lists to do it.

  module type SET =
    sig
      type 'a set
      val union : 'a set -> 'a set -> 'a set
    end

  module ListSet =
    struct
      type 'a set = 'a list
      let removeDup m = ...
      let union m n = removeDup (m
      @ n)
    end

  module Set = ListSet : SET

(removeDup doesn't break the abstraction since it's the identity on sets, but
for the sake of brevity suppose it did.)

Here is my rendition of what the signature restriction to SET is doing:

  1) It existentially introduces a new type 'a Set.set, whose witness is 'a
     list, and at the same time introduces a new value Set.union, with
     occurrences of 'a ListSet.list replaced by 'a Set.set in its type, whose
     witness is ListSet.union.

  2) It hides all the names defined in ListSet. This includes ListSet.union as
     well as ListSet.removeDup (and ListSet.set).

So the type-theoretic role of the signature restriction is to do existential
quantification (or something close to it), and delimit the scope of the
quantification on types and values. However, there is no reason why it needed
to hide, e.g., ListSet.removeDup. It could have also added removeDup to Set
also, without breaking abstraction, as long as it weren't included in the
scope of the quantification. Then Set.removeDup would have type 'a list -> 'a
list, so you couldn't apply it to sets, so there is no problem.

I think that this is close to what Ocaml does, though I wouldn't swear to it.

BTW, if you are familiar with category theory, then you will know that all
this stuff, even the binding syntax for quantification and abstraction, can be
translated into a purely semantic, variable-free form. Ipso facto, you don't
need syntactic notions like names or name-hiding to express it.

 > I guess you mean that it is only the structure of the module that counts
 > and not the naming? But how about ambiguities then? You need a way to refer
 > to the right function, whatever, if it has the same signature as another
 > one.

Nono, identity is important. I'm just saying identifier visibility and
renaming should have nothing to do with semantics. (Scope is important,
though. By "visibility doesn't matter", I guess I mean you should never need
to _remove_ an identifier from the set of free variables in a subterm.)

 > >  > I'd rather use different namings like "r1_name" and "r2_name" for the
 > >  > field: although this requires me to always include the name of the
 > >  > type, one cannot get confused either: it makes the sources easier to
 > >  > understand.
 > > 
 > > I don't find that argument very convincing; Ocaml _already_ allows type,
 > > value and field label definitions to shadow each other, which many people
 > > consider a cognitive burden. I don't have an opinion on that issue; I'm
 > > just trying to be consistent with the way Ocaml treats bindings
 > > currently, and add a little bit more flexibility.
 > 
 > There's not much you can do about shadowing of value bindings unless you
 > prohibit it which would cause other problems: you'd have to invent a new
 > name each time even if you just want to "transform" a value using the same
 > name - this could cause horror if you wanted to add such a transformation
 > in the middle of a function body...

Well, I disagree that nothing can be done about it. Personally, I wouldn't
mind so much if shadowing were outlawed entirely, but a less drastic solution
is the one adopted by Haskell and Mercury, in which one distinguishes
top-level bindings from local ones, and "lazily" disallow shadowing at the
top level. This means that you can import distinct values with the same name,
but you only get an error if the name is actually used somewhere in the
module.

But that's just FYI; I'm not suggesting that Ocaml's shadowing rules be
changed in any way.

Frankly, I think Ocaml's perspective, that top-level value bindings are
treated essentially the same way as local bindings, is nicely consistent in
its own way, though. (However, it arguably causes problems too, since
side-effects at the top level makes module-linking order significant.)

 > Now, accessing fields more flexibly is surely possible. The question is
 > whether adding this feature is really of so much help: the only case where
 > it pays off is when you have large records which share a great deal of
 > names.

It pays off when you have records which share names, period.

In that case it improves readability, in the sense that "r.label" is easier to
read than "r.prefix_label".

 > Then it might be a bit more convenient to have an initial declaration that
 > tells the compiler what kind of record you are talking about instead of
 > inventing new names for records. A rather statistical question...

What sort of declaration?

 > > In fact, I don't see how this makes programs any less readable.
 > 
 > It adds another construct that people have to learn. Reading constructs
 > that you are not familiar with is confusing. I bet that there are plenty of
 > people who do not even know about the "with"-construct for records...

But this construct is closely analogous to the existing dot-notation for
modules. For all we know, it may turn out that some programmers actually
mistakenly assume it already exists.

However, it occurs to me now that my syntax would bring type names into the
same namespace as field labels. The reason is because of the ambiguity: does
"m.t.x" mean "the field x of the t-record m", or "the field x of the field t
of the record m"? I guess having type names and labels in the same namespace
is bad, so you would want a different syntax, maybe "m:t.x" for selection and

  { x = ... } : t

for building. I guess that makes it essentially the same as SML's records.

-- 
Frank Atanassow, Dept. of Computer Science, Utrecht University Padualaan 14,
PO Box 80.089, 3508 TB Utrecht, Netherlands Tel +31 (030) 253-1012, Fax +31
(030) 251-3791