Version française
Home     About     Download     Resources     Contact us    
Browse thread
Marshal, closures, bytecode and native compilers
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: -- (:)
From: Jacques Garrigue <garrigue@m...>
Subject: Re: [Caml-list] Marshal, closures, bytecode and native compilers
From: Damien Pous <Damien.Pous@ens-lyon.fr>

> I found some strange difference between the native and bytecode
> compilers, when Marshaling functional values:
> 
> [damien@mostha]$ cat lift.ml
> let r = ref 0
> let f =
>   fun () -> incr r; print_int !r; print_newline()
> let () = match Sys.argv.(1) with
>   | "w" ->  Marshal.to_channel stdout f [Marshal.Closures]
>   | "r" ->
>       let g = (Marshal.from_channel stdin: unit -> unit) in
>         g (); f ()
>   | _ -> assert false
> 
> [damien@mostha]$ ocamlc lift.ml; ( ./a.out w | ./a.out r )
> 1
> 1
> [damien@mostha]$ ocamlopt lift.ml; ( ./a.out w | ./a.out r )
> 1
> 2
> [damien@mostha]$ ocamlc -version
> 3.09.2
> 
> In the bytecode version, the reference [r] gets marshaled along with
> [f] so that the calls [f()] and [g()] respectively affect the initial
> reference of the reader, and the (fresh) marshaled reference.
> 
> On the contrary in the native version, it seems that [f] is not
> `closed': its code address is directly sent, and the call [g()]
> affects the initial reference of the reader.

Interesting phenomenon. According to the usual definition of closure,
the correct solution is probably the bytecode one. But this definition
seems hardly applicable in practice, since it would also mean bringing
all dependencies with you. This is not the case even with the bytecode
version. For instance if you move "let r = ref 0" to r.ml, and replace
the first line of your program by "open R", you get the same behaviour
as for native code.

So as a first approximation, the real specification is: local
variables are transmitted with the closure, but global ones are not.
The trouble being that the definition of global is different for
bytecode and native code. With bytecode, definitions from the same
module are local, while they are global for native code.

Moreover, I believe that, through optimizations, variables that look
local may turn up to be global.

I'm not sure what would be the right fix.
A more complete specification would be a good idea.
A flag to disable optimizations would be rather costly.

For now, a rule of the thumb would be:

* if you want your variable to be handled as global, even in bytecode,
  either receive it as parameter (after marshalling) or put it in
  another compilation unit.

* if you want your variable to be handle as local, even in native
  code, then define or redefine it locally inside your function.
    let r = ref 0
    let f =
      let r = r in
      fun () -> incr !r; print_int !r; print_newline()
   For the time being this seems to work.

Maybe it is better just to assume that you should not mix closure
marshalling with mutable variables. In either case, the semantics
seems fishy. It seems more reasonable to make such functions receive
their mutable state explicitly, and choose either to send it
(obtaining a "fork" behaviour) or not.

Jacques Garrigue