Internals details for cmmgen.ml
From: John Prevost <prevost@m...>
Subject: Re: Internals details for cmmgen.ml
Xavier Leroy <Xavier.Leroy@inria.fr> writes:

> Using a "checkbound" is perhaps the simplest solution.  Otherwise,
> some system-wide exceptions such as Invalid_argument are assigned
> global symbols and you don't need to guess their integer index inside
> their defining module: just emit the C-- code corresponding to
> 
>         (raise (symbol "Invalid_argument") (string "my message"))

I'll try this.

> If you're really into high-performance stuff, you could fold the
> permission check and the bounds check in one "checkbound" instruction.
> Just arrange the "write enable" flag to be (the Caml integer) 0 if
> write is allowed, and -1 if it is not.  Then, generate something like
> 
>         (checkbound (or index (write_enable_flag region)) (size region))
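
For readers following along, the folded check can be modelled in plain OCaml. This is only a sketch of the idea, not the C-- that cmmgen.ml would emit: the `region` record, `unsigned_lt`, and `write_allowed` are hypothetical names, and it assumes the bounds check behaves like an unsigned "index < size" comparison.

```ocaml
(* Hypothetical model of the folded check; all names are illustrative. *)

(* Unsigned "a < b" on native ints: flipping the sign bit maps the
   unsigned order onto the signed order. *)
let unsigned_lt (a : int) (b : int) : bool =
  (a lxor min_int) < (b lxor min_int)

type region = {
  size : int;               (* number of valid indices *)
  write_enable_flag : int;  (* 0 = write allowed, -1 = write denied *)
}

(* One comparison covers both checks: [idx lor flag] is [idx] when the
   flag is 0, and -1 (all bits set) when the flag is -1.  Seen as an
   unsigned value, -1 is the largest possible index, so it always fails
   the bound test.  Negative indices fail for the same reason. *)
let write_allowed (r : region) (idx : int) : bool =
  unsigned_lt (idx lor r.write_enable_flag) r.size
```

With a writable region of size 10, indices 0 through 9 pass and everything else fails; with the flag set to -1, every index fails.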

I don't think this is necessary--especially since both reading and
writing are things that could fail (so I need more than one bit).
When you're going for extreme speed, you'll probably use the unsafe
versions.

(How does -unsafe work, by the way?  Does it make the C-- "checkbound"
stuff work differently?)

> Although hacking cmmgen.ml is fun, you could get a more portable
> implementation by writing it in ML using unsafe string accesses.
> Those will happily work on any char *, not necessarily on well-formed
> Caml strings.  Something like:
> 
>         external mmap : ... -> string
>         type t = { data: string; length: int }
> 
>         let read_char reg idx =
>           if idx < 0 || idx >= reg.length
>           then raise (Invalid_argument "Region.read_char")
>           else String.unsafe_get reg.data idx
> 
> It will be a bit slower, but maybe not too much.
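
A self-contained variant of the quoted sketch, extended with separate read and write permissions, might look like the following. It rests on several assumptions: `Bytes.t` stands in for the mmap'ed memory (the real `mmap` external is elided above and is not reproduced here), and the `readable` and `writable` fields, `make`, and `write_char` are hypothetical additions for illustration.

```ocaml
(* Hypothetical Region sketch; Bytes stands in for the mapped memory. *)
type t = {
  data : Bytes.t;   (* stand-in for the mmap'ed area *)
  length : int;     (* number of accessible bytes *)
  readable : bool;  (* assumed permission flag *)
  writable : bool;  (* assumed permission flag *)
}

(* Build a zero-filled region; both permissions default to true. *)
let make ?(readable = true) ?(writable = true) length =
  { data = Bytes.make length '\000'; length; readable; writable }

(* Permission and bounds are checked by hand, then the unchecked
   accessor performs the access itself. *)
let read_char reg idx =
  if not reg.readable || idx < 0 || idx >= reg.length
  then raise (Invalid_argument "Region.read_char")
  else Bytes.unsafe_get reg.data idx

let write_char reg idx c =
  if not reg.writable || idx < 0 || idx >= reg.length
  then raise (Invalid_argument "Region.write_char")
  else Bytes.unsafe_set reg.data idx c
```

For example, `write_char (make ~writable:false 4) 0 'y'` raises `Invalid_argument "Region.write_char"`, while the same call on `make 4` succeeds.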

Hmm.  What do you mean by "a more portable implementation"?  One which
doesn't require compiler modifications, or one which works with
bytecode?  I believe that with bytecode, the C functions are
sufficient.

As for unsafe string access: doesn't the pointer need to point to an
O'Caml block, which includes a tag and length information?

John.