Re: lexing__get_next_char ?

Xavier Leroy (Xavier.Leroy@inria.fr)
Mon, 21 Oct 1996 16:30:50 +0200 (MET DST)

From: Xavier Leroy <Xavier.Leroy@inria.fr>
Message-Id: <199610211430.QAA11843@pauillac.inria.fr>
Subject: Re: lexing__get_next_char ?
To: itz@rahul.net (Ian T Zimmerman)
Date: Mon, 21 Oct 1996 16:30:50 +0200 (MET DST)
In-Reply-To: <199610180340.UAA12614@kronstadt.rahul.net> from "Ian T Zimmerman" at Oct 17, 96 08:40:55 pm

> In the caml-light sources, in src/runtime/lexing.c, the primitive
> get_next_char is defined as follows:
>
> struct lexer_buffer {
> value refill_buff;
> value lex_buffer;
> value lex_abs_pos;
> value lex_start_pos;
> value lex_curr_pos;
> value lex_last_pos;
> value lex_last_action;
> };
>
> value get_next_char(lexbuf) /* ML */
> struct lexer_buffer * lexbuf;
> {
> mlsize_t buffer_len, curr_pos;
>
> buffer_len = string_length(lexbuf->lex_buffer);
> ...
>
> How can this work, when lexer buffers are ML records on the heap, as
> the following piece of src/lib/lexing.ml seems to show

Viewed from C, Caml records are arrays of elements of type "value".
So, we're basically casting a pointer to a "value" array to a pointer
to a struct with all fields having type "value".

This is probably not guaranteed to work by the ANSI C standard, but I
doubt there's any C compiler around that does not represent both types
identically.

(There are several other assumptions not guaranteed by ANSI C in the
Caml runtime, in particular that any pointer type can be cast to and
from the type "long". I don't think it is even possible to write a
memory manager and runtime system such as Caml's in strictly
conformant ANSI C.)

The function could be rewritten to use Field(lexbuf, ...) as you
suggested, but having a "struct" declaration in the C code that
reflects the Caml record declaration makes it easier to keep both C
and Caml code in sync.

- Xavier Leroy