Re: lexing__get_next_char ?
From: Xavier Leroy <Xavier.Leroy@i...>
Subject: Re: lexing__get_next_char ?

> In the caml-light sources, in src/runtime/lexing.c, the primitive
> get_next_char is defined as follows:
> 
> struct lexer_buffer {
>   value refill_buff;
>   value lex_buffer;
>   value lex_abs_pos;
>   value lex_start_pos;
>   value lex_curr_pos;
>   value lex_last_pos;
>   value lex_last_action;
> };
> 
> value get_next_char(lexbuf)     /* ML */
>      struct lexer_buffer * lexbuf;
> {
>   mlsize_t buffer_len, curr_pos;
>   
>   buffer_len = string_length(lexbuf->lex_buffer);
>      ...
> 
> How can this work, when lexer buffers are ML records on the heap, as
> the following piece of src/lib/lexing.ml seems to show

Viewed from C, Caml records are arrays of elements of type "value".
So, we're basically casting a pointer to a "value" array to a pointer 
to a struct with all fields having type "value". 

This is probably not guaranteed to work by the ANSI C standard, but I
doubt there's any C compiler around that does not represent both types
identically.
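
A minimal sketch of the cast described above, for illustration only. It does not use the Caml runtime headers: the "value" typedef is a stand-in (assumed to be one machine word), and "demo_record", "second_via_struct" and "layouts_match" are hypothetical names. The layout check spells out the assumption that a struct of "value" fields has no padding and so matches an array of "value".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the runtime's "value" type (assumption: one machine word). */
typedef intptr_t value;

/* Hypothetical three-field record, mirroring a Caml record declaration. */
struct demo_record {
  value first;
  value second;
  value third;
};

/* Read the second field through the struct view of a "value" array.
   This relies on the layout assumption; ANSI C does not promise it. */
value second_via_struct(value *block)
{
  struct demo_record *r = (struct demo_record *) block;
  return r->second;
}

/* Return 1 if the struct is laid out exactly like an array of 3 values. */
int layouts_match(void)
{
  return sizeof(struct demo_record) == 3 * sizeof(value)
      && offsetof(struct demo_record, first) == 0
      && offsetof(struct demo_record, second) == 1 * sizeof(value)
      && offsetof(struct demo_record, third) == 2 * sizeof(value);
}
```

If layouts_match() ever returned 0 on some compiler, the struct view and the array view would disagree, and the cast would read the wrong word.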

(There are several other assumptions not guaranteed by ANSI C in the
Caml runtime, in particular that any pointer type can be cast to and
from the type "long". I don't think it is even possible to write a
memory manager and runtime system such as Caml's in strictly
conformant ANSI C.)

The function could be rewritten to use Field(lexbuf, ...) as you
suggested, but having a "struct" declaration in the C code that
reflects the Caml record declaration makes it easier to keep both C
and Caml code in sync.
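
To make the trade-off concrete, here is a sketch comparing the two access styles on the lexer_buffer struct quoted above. It is self-contained rather than built against the runtime: the "value" typedef and the Field macro are stand-ins modelled on the runtime's definitions (an assumption), and the LEX_* index names and the two accessor functions are hypothetical. It is not the actual get_next_char.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins (assumptions), not the real runtime definitions. */
typedef intptr_t value;
#define Field(x, i) (((value *)(x))[i])

/* Same fields, same order, as the struct quoted in the question. */
struct lexer_buffer {
  value refill_buff;
  value lex_buffer;
  value lex_abs_pos;
  value lex_start_pos;
  value lex_curr_pos;
  value lex_last_pos;
  value lex_last_action;
};

/* Hypothetical field indices; they must be kept in sync by hand
   with the order of the fields in the Caml record declaration. */
enum {
  LEX_REFILL_BUFF, LEX_BUFFER, LEX_ABS_POS, LEX_START_POS,
  LEX_CURR_POS, LEX_LAST_POS, LEX_LAST_ACTION
};

/* Struct style: the field name documents which record field is read. */
value curr_pos_via_struct(value lexbuf)
{
  return ((struct lexer_buffer *) lexbuf)->lex_curr_pos;
}

/* Field style: plain array indexing, no assumption about struct layout. */
value curr_pos_via_field(value lexbuf)
{
  return Field(lexbuf, LEX_CURR_POS);
}
```

Both functions read the fifth word of the block. The struct version keeps the field names next to the C code, while the Field version avoids the struct cast at the cost of maintaining the indices separately.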

- Xavier Leroy