Version française
Home     About     Download     Resources     Contact us    
Browse thread
[Caml-list] Alternative Bytecodes for OCaml
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: -- (:)
From: Basile Starynkevitch [local] <basile.starynkevitch@i...>
Subject: [Caml-list] Re: (GC issues) Alternative Bytecodes for OCaml
On Sat, Aug 28, 2004 at 07:03:44PM +0200, Nicolas Cannasse wrote:

Citing me (Basile S.)

> > Unfortunately, Parrot http://www.parrotcode.org/docs/intro.html
> > implement all structured objects as PMCs, which are opaque data chunks
> > on which all operations goes thru a table of functions (similar to C++
> > vtables). So, using PMCs for tuple implementations would require an
> > indirect call (even with Parrot JIT technology) for every basic tuple
> > operations (ie allocation, tag fetching, field fetching). Also, I am
> > not sure that Parrot has terminal (tail recursive) calls which are
> > essential to Ocaml (and other functional languages).
> 
> That's the kind of things I was thinking when talking about object oriented
> VM . In the OO world, everything is an object and people tends to abstract
> everything - even fields, so pure efficient data structures are not
> available anymore, only through expensive getter/setter and/or field lookup.
> A functionnal VM typicaly need both : efficient direct access data
> structures and objects with runtime lookup.

I'm not sure a functionnal VM need objects, except for making more
easy hooks to the external world (like the Custom block of Ocaml
provides). BTW, Ocaml objects (in the Ocaml source sense) are not
implemented as objects with C vtables in the runtime (the equivalent
of the vtable is for Ocaml only -it is not useful to the GC- and
contains Ocaml closures, not C function pointers).


> [...]
> > Alos, note that Ocaml (and other functional languages) need a
> > performant garbage collector (because immutable tuples and closures
> > are allocated and accessed a lot), and that conservative GCs (à la
> > Boehm) have lots of interesting features (in particular,
> > conservativeness which makes then C-friendly, and multithreading), but
> > are slower than naive generational copying precise GC (see
> > http://starynkevitch.net/Basile/qishintro.html for some benches).
> 
> I would like to get your thoughs about the following, since you already
> wrote a GC as well as OCamlJIT.
> IMHO, having an any way to interface with C is a key feature for any
> language. Because the OCaml GC is not scanning the C stack, it is needed to
> use a lot of macros to take care that newly allocated blocks are not
> garbaged.
> 
> What would be the cost of extending the OCaml GC with a C stack scanning
> pass ?

[Damien Doligez would probably answer better than I do]

Ocaml does scan *explicitly* the C stack. Telling *explicitly* what
are the local GC roots of C routines is the task of CAML_param &
CAML_local macros (FWIW, my Qish GC use very similar tricks,and every
exact GC has to do similarily.).

The difference between Ocaml GC's (or Qish) and Boehm's GC is that
Boehm's GC is conservative while Ocaml GC is exact (or precise): it
uses a non-copying scheme (some kind of incremental mark&sweep IIRC)
and scan every word of the C call stack, assuming it could be a GC-ed
pointer (so when you have a bit pattern - say a floating point
constant like pi - which happens to be a valid address it is followed
by Boehm's GC - which may leak this way). Some variants (eg Bartlett'
mostly copying GC) of conservative GC scan conservatively the C stack
and handle exactly the heap.

The Ocaml GC has to work quickly for young GC-ed values (which,
notably in functional programs, tend to die quickly, because a
functional programming style -with no mutable state- favors quick
allocation of many short-lived temporary values). Hence, a copying
scheme is preferable for the young generation. But a GC which work by
copying should move (ie update) pointers, so cannot be conservative:
it has to know which words are GC-ed pointers, and which are not
(otherwise, your floating point pi constant on the stack would change
mysteriously to a bitpattern of a newly forwared pointer).

> Would it be costly/unsafe to move the pointers on the C stack ?

It would be usafe, because a copying GC scheme is preferable (it is
faster for young objects), hence pointers have to be moved on the C
stack

> What kind of speed factor can you loose when using for example Boehm
> GC in OCaml VM instead of OCaml GC ?

I don't know, but I would expect something like several programs
running 2 or 3 times slower at least with a bytecode interpreter using
Boehm's GC instead of the current Ocaml bytecode VM & GC. In my simple
experiments (with Qish), a Boehm allocation is about 10 times slower
than a C malloc, which is at least 10 times slower than the usual
pointer increment of a copying generational GC (like in Ocaml or
Qish). So I would guess (but I didn't test) that a Boehm allocation
might be nearly 100 times slower than an Ocaml tuple allocation.


> Which would be better of theses two ?
>     * a conservative GC everywhere (Boehm GC for example)

I believe it would slow down significantly the OCaml VM.

>     * OCaml GC but with all C allocations made directly into the old
> generation (for conservativness)

You still has to tell the (exact) minor collector about these
allocations (ie something like putting them on the store list) - so
you won't gain much in C coding style, and you'll lose in
performance. The result of C primitives is usually a temporary value
(e.g. because C primitives tend to be wrapped by Ocaml code doing
checking and changing C errors into Ocaml exceptions), so you'll
probably better put then in the young generation.

> I'm also of course interested in Damien thoughts about theses questions.

Of course, Damien Doligez is much more expert than I am. I'll be
delighted to read his thoughts on this.


All my thoughts above are just thoughts, I did not take the time to
experiment it (and putting Boehm's GC inside Ocaml VM is not a trivial
task), so I may be grossly wrong.

This interesting discussion triggers another interesting question:
would Ocaml coders still use Ocaml if its implementation was (say) 3
or 10 times slower than the current implementation? Alternatively, do
people use my OcamlJitRun program which could provide (on several
programs) a speedup of nearly 2 on their bytecode program (which don't
have to be changed neither in their C primitives nor in the byteocde,
hence the Ocaml source code).

-- 
Basile STARYNKEVITCH -- basile dot starynkevitch at inria dot fr
other email : basile at starynkevitch dot net
Project cristal.inria.fr - temporarily
http://cristal.inria.fr/~starynke --- all opinions are only mine 

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners