Re: Internals details for cmmgen.ml

From: John Prevost (prevost@maya.com)
Date: Sat Dec 18 1999 - 01:51:02 MET


To: Xavier Leroy <Xavier.Leroy@inria.fr>
In-Reply-To: Xavier Leroy's message of "Sat, 11 Dec 1999 19:09:08 +0100"

Xavier Leroy <Xavier.Leroy@inria.fr> writes:

> Using a "checkbound" is perhaps the simplest solution. Otherwise,
> some system-wide exceptions such as Invalid_argument are assigned
> global symbols and you don't need to guess their integer index inside
> their defining module: just emit the C-- code corresponding to
>
> (raise (symbol "Invalid_argument") (string "my message"))

I'll try this.

> If you're really into high-performance stuff, you could fold the
> permission check and the bounds check in one "checkbound" instruction.
> Just arrange the "write enable" flag to be (the Caml integer) 0 if
> write is allowed, and -1 if it is not. Then, generate something like
>
> (checkbound (or index (write_enable_flag region)) (size region))

I don't think this is necessary--especially since both reading and
writing are things that could fail (so I need more than one bit).
When you're going for extreme speed, you'll probably use the unsafe
versions.
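
For reference, the folded check suggested above can be imitated in plain ML. This is only a rough sketch: the record type, its field names, and check_write are made up for illustration, and it assumes "checkbound" compares the index against the size as unsigned numbers, so that any negative value fails the test.

```ocaml
(* Hypothetical region descriptor; the field names are illustrative only. *)
type region = {
  size : int;          (* number of addressable bytes *)
  write_enable : int;  (* 0 if writing is allowed, -1 if it is not *)
}

(* Folded check: [idx lor write_enable] is idx itself when the flag is 0
   and is -1 when the flag is -1.  An unsigned "index < size" test rejects
   every negative value, so one test covers both conditions. *)
let check_write r idx =
  let folded = idx lor r.write_enable in
  (* Emulate the single unsigned comparison with two signed ones. *)
  if folded < 0 || folded >= r.size then
    raise (Invalid_argument "Region.check_write")
```

With write_enable = 0 this behaves as an ordinary bounds check; with write_enable = -1 every index is rejected, including valid ones.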

(How does -unsafe work, by the way? Does it make the C-- "checkbound"
stuff work differently?)

> Although hacking cmmgen.ml is fun, you could get a more portable
> implementation by writing it in ML using unsafe string accesses.
> Those will happily work on any char *, not necessarily on well-formed
> Caml strings. Something like:
>
> external mmap : ... -> string
> type t = { data: string; length: int }
>
> let read_char reg idx =
>   if idx < 0 || idx >= reg.length
>   then raise (Invalid_argument "Region.read_char")
>   else String.unsafe_get reg.data idx
>
> It will be a bit slower, but maybe not too much.
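
A self-contained variant of the quoted sketch, for experimenting without the mmap binding. The assumptions are loud ones: an ordinary Bytes buffer stands in for the mapped memory, of_string is a hypothetical constructor replacing the elided mmap external, and write_char is added by analogy with read_char. Bytes is used because strings are immutable in current OCaml releases.

```ocaml
(* Sketch only: [data] is an ordinary Bytes buffer standing in for the
   mmap'd memory; the record mirrors the one in the quoted message. *)
type t = { data : Bytes.t; length : int }

(* Hypothetical constructor, used here in place of the elided mmap. *)
let of_string s = { data = Bytes.of_string s; length = String.length s }

(* Explicit bounds check against the stored length, then an unchecked read. *)
let read_char reg idx =
  if idx < 0 || idx >= reg.length
  then raise (Invalid_argument "Region.read_char")
  else Bytes.unsafe_get reg.data idx

(* Same check, followed by an unchecked write. *)
let write_char reg idx c =
  if idx < 0 || idx >= reg.length
  then raise (Invalid_argument "Region.write_char")
  else Bytes.unsafe_set reg.data idx c
```

The bounds check relies only on the length field stored in the record, not on the length of the underlying buffer.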

Hmm. What do you mean by "a more portable implementation"? One which
doesn't require compiler modifications, or one which works with
bytecode? I believe that with bytecode, the C functions are
sufficient.

As for unsafe string access: doesn't the pointer have to point to an
O'Caml block, which includes a tag and length information?

John.


