Native executable symtable
Date: -- (:)
From: Xavier Leroy <Xavier.Leroy@i...>
Subject: Re: [Caml-list] Native executable symtable
> We are writing a library implementing a binary client-server protocol 
> based on the Marshal module. In order to guarantee some degree of 
> type-safety, during the handshaking phase of the communication we need 
> the client to send the server the MD5 checksums of all relevant modules. 
> [...]

> We have found that the md5sum can be fetched directly from the executable 
> file associated to the process. This technique is documented nowhere, as 
> far as I can see, but the source code of Symtable.init_toplevel is very 
> informative as to how to do this for bytecode executables. What I would 
> like to know is how to implement this technique for native code 
> executables. Essentially, how am I supposed to parse the binary 
> executable to extract the symtable information.

You cannot, because the checksums you mention (the digests of the
interfaces of the modules) are not included in ocamlopt-generated
executables.  They are included only in ocamlc-generated executables (in
the CRCS section) for use with Dynlink and the toplevel.

At any rate, I think you're on the wrong track: the checksums you'll
find in the CRCS section of bytecode executables are those of module
interfaces, not of module implementations.  To establish type
agreement between two processes communicating via
output_value/input_value, you really want the latter, not the former.
(Think of an abstract type implemented differently in the two
processes.  For more details, see e.g. the ICFP'03 paper by Leifer,
Peskine, Sewell and Wansbrough.)
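
To make the abstract-type problem concrete, here is a minimal sketch.
The POINT signature and the Sender and Receiver modules are invented
for illustration; they stand for the same module as compiled into two
different programs.  Both have exactly the same interface, so an
interface checksum cannot tell them apart, yet they disagree on the
representation:

```ocaml
(* Hypothetical example: one interface, two implementations. *)
module type POINT = sig
  type t
  val make : int -> int -> t
  val x : t -> int
end

(* Implementation linked into the sending process: stores (x, y). *)
module Sender : POINT = struct
  type t = int * int
  let make x y = (x, y)
  let x (x, _) = x
end

(* Implementation linked into the receiving process: stores (y, x). *)
module Receiver : POINT = struct
  type t = int * int
  let make x y = (y, x)
  let x (_, x) = x
end

(* The sender marshals a point whose x coordinate is 1. *)
let wire : string = Marshal.to_string (Sender.make 1 2) []

(* The receiver reads the bytes back at "the same" abstract type. *)
let received : Receiver.t = Marshal.from_string wire 0

(* The receiver's accessor reads the other component of the pair. *)
let observed_x = Receiver.x received
```

The value comes back well-formed but with the wrong meaning.  With
representations of different shapes the outcome could just as well be
a crash, which is why the implementations need to be compared.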

So, you're looking for convenient ways to collect checksums for module
implementations.  An insight that might simplify your build process is
that while these checksums must be computed at compile-time (e.g. by
running md5sum on the source .ml files), they can be collected
together at run-time.  For instance, you could preprocess the .ml
sources of interest so as to insert at the beginning

        let _ = Registry.record_module "Modulename" "checksum"

where "Modulename" is the module name and "checksum" the outcome of
md5sum on the source file.  The Registry.record_module function just
accumulates its arguments in a hashtable or association list, which
can then be consulted during the agreement phase of your protocol.
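
A minimal sketch of such a registry follows.  Only the name
Registry.record_module comes from the description above; the functions
all and agrees_with, and the placeholder checksum strings, are
assumptions made for the example.

```ocaml
(* Hypothetical Registry module; everything except the name
   record_module is an assumption of this sketch. *)
module Registry = struct
  (* Maps a module name to the checksum of its source file. *)
  let table : (string, string) Hashtbl.t = Hashtbl.create 16

  (* Called once per module, at module initialization time. *)
  let record_module name checksum = Hashtbl.replace table name checksum

  (* All recorded pairs, sorted so that both peers list them in the
     same order regardless of link order. *)
  let all () =
    List.sort compare
      (Hashtbl.fold (fun name sum acc -> (name, sum) :: acc) table [])

  (* True when the peer's list matches ours exactly. *)
  let agrees_with remote = (all () = List.sort compare remote)
end

(* Lines of this form would be inserted at the top of each source
   file; the checksums here are placeholders, not real MD5 sums. *)
let _ = Registry.record_module "Protocol" "placeholder-checksum-2"
let _ = Registry.record_module "Codec" "placeholder-checksum-1"
```

During the handshake, the client would send the result of
Registry.all () and the server would check it with
Registry.agrees_with.  Since every instrumented module refers to
Registry, it has to come before them on the link line.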

There are probably many other ways to do it.  But I think your initial
idea (compute checksums of source files at compile-time) is the
correct one; it's just a question of implementing it in a way that
doesn't complicate your build process too much.
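
As one possible way to keep the build simple, the inserted line can be
generated by a small OCaml function instead of an external md5sum
call, since the standard Digest module computes MD5.  This is only a
sketch: registration_line and instrument are hypothetical names, and
deriving the module name from the file name is an assumption.

```ocaml
(* Builds the line to insert, given the path of a source file and its
   contents.  The module name is derived from the file name. *)
let registration_line path source =
  let modname =
    String.capitalize_ascii
      (Filename.remove_extension (Filename.basename path)) in
  (* Digest.string computes the MD5 digest of the source text. *)
  let checksum = Digest.to_hex (Digest.string source) in
  Printf.sprintf "let _ = Registry.record_module %S %S" modname checksum

(* Returns the instrumented source: the registration line followed by
   the original text.  The checksum covers the original text only. *)
let instrument path source =
  registration_line path source ^ "\n" ^ source
```

A build rule could apply instrument to each .ml file of interest and
compile the result in place of the original.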

- Xavier Leroy