Re: Proposal for study: Add a categorical Initial type to ocaml

From: John Prevost (prevost@maya.com)
Date: Mon Oct 11 1999 - 14:40:08 MET DST


To: skaller <skaller@maxtal.com.au>
Subject: Re: Proposal for study: Add a categorical Initial type to ocaml
From: John Prevost <prevost@maya.com>
Date: 11 Oct 1999 08:40:08 -0400
In-Reply-To: skaller's message of "Sun, 10 Oct 1999 16:14:23 +1000"

skaller <skaller@maxtal.com.au> writes:

> > I have a question--how is this different (except for the efficiency of
> > using null) from using a 'a option array? The array still must be
> > initialized--in this case to all null, or in the unsafe case, well,
> > it's unsafe.
>
> It isn't, in theory, it is exactly the same (and therefore sound).
> The difference lies in the way it is represented: using the initial
> the 'a option wrapping is done by the system, not the client programmer
> (sort of like a Haskell monad) and therefore is more robust, more
> efficient, and leaves the client code cleaner. This is much the
> same argument as for exceptions.

Hiding implementation details from the programmer does not always make
code cleaner. You can have the monad comparison, though. I'll use it
in a second.

> > If anything, I would argue that this points at representing 'a option
> > specially for already boxed values by using a null pointer for None,
> > rather than having a special strange value that comes out of arrays
> > and magically creates exceptions.
>
> That would make 'a option faster to use, but still require the
> overhead of writing code that needless does matches where it is
> known (dynamically) that only one of the cases can be used (such as
> in a variable length array).

Okay--so here's what I would do:

"Inefficient" safe array implementation:

type 'a exparray = { mutable ea_size : int;
                     mutable ea_length : int;
                     mutable ea_array : 'a array }

let length a = a.ea_length

let get a i =
  if i < a.ea_length
    then a.ea_array.(i)
    else raise (Invalid_argument "Earray.get")

let set a i v =
  if i < a.ea_length
    then a.ea_array.(i) <- v
    else raise (Invalid_argument "Earray.set")

let expand a n v =
  begin
    (if a.ea_length + n >= a.ea_size then
       let new_size =
         if n > a.ea_size then a.ea_size + n else a.ea_size * 2 in
       let new_arr = unsafe_create new_size in
       begin
         Array.blit a.ea_array 0 new_arr 0 a.ea_length;
         a.ea_array <- new_arr;
         a.ea_size <- new_size;
       end);
    Array.fill a.ea_array a.ea_length n v;
    a.ea_length <- a.ea_length + n
  end

let create i =
  { ea_size = i;
    ea_length = 0;
    ea_array = unsafe_create i }

let make i v =
  { ea_size = i;
    ea_length = i;
    ea_array = Array.make i v }

Something like that. We can probably define unsafe_create like this:

let unsafe_create i = (Obj.magic (Array.make i 0) : 'a array)

(The above works, I've tested it. I highly recommend not trying to
print out values of type 'a earray from the toplevel, though.)

So this is the pretty unsafe version. There can't be a really unsafe
version because not initializing the memory at all would play havoc
with the GC.

Now. What can we gather from this? First: by abusing Obj.magic, we
can do what we want. Obviously, it's not strictly a nice thing to do,
but if we can prove to ourselves that the code works right, then from
the point of view of the API's client, everything's great.

What else can we gather? Well, what would be different if this were
implemented with options? There'd be the boxing inefficiency, of
course. Anything else? Not really. We'd be guaranteeing something
we can show to be true anyway, since that's why we were okay with
using Obj.magic up there.

So, is there a problem with my solution of working around the type
system under the API level?

> > Again, rather than introduce new features into the language, add a new
> > unsafe optimization and make the option type more efficient for boxed
> > values.
>
> The initial is not 'strange',
> rather it is fundamental, the dual of the 'unit' type.

No--it's strange. The reason it's strange is that it's something that
can happen to a user that they don't expect. Like null in Java.

There's nothing unsound about null (well, unless you consider that
trying to call a method on null fails to be unsound, which I guess I
probably do), but that doesn't mean it's not a pain in the ass.

Why is it a pain in the ass? There's no way to turn it off. Like
I've been saying, with option you can turn it off, with Obj.magic, the
implementor ought to be damned sure he's doing things right. But with
these special "uninitialized value" sorts of things, people like me
who've gotten used to good type systems keep looking over their
shoulders because they're afraid it might turn around and bite them in
the butt.

It's appropriate for exceptions to be thrown when out of bounds array
accesses happen--although I'd rather it weren't. It's reasonable for
the programmer to check and make sure this doesn't happen. It's not
reasonable for the programmer to check that index i of an array that
was passed in has actually been assigned before now. Much like it's
not reasonable for the programmer to have to check that the argument
is not null on a function that doesn't claim to accept null as an
argument (my Java pet peeve).

I really think adding anything that can cause more runtime errors is
likely to be a wart, and no language needs more warts.

Hrm. That last bit is too rambly, but I'll send this anyway.

John.



This archive was generated by hypermail 2b29 : Sun Jan 02 2000 - 11:58:27 MET