[Caml-list] Freeing dynamically loaded code
Date: 2003-12-15 (11:34)
From: Nuutti Kotivuori <naked+caml@n...>
Subject: Re: [Caml-list] Freeing dynamically loaded code
Basile Starynkevitch wrote:
> On Fri, Dec 12, 2003 at 09:04:24PM +0200, Nuutti Kotivuori wrote:
>> So, I went down the dirty depths of Dynlink and friends, and
>> bytecomp and byterun, and all that - and I think I have a rather
>> good image of the problem.
>> I would wish for Dynlinkable code to be garbage collected.
>> ==========================================================
>> When a file is loaded with Dynlink.loadfile, most of what's read
>> and executed and handled is stored in the normal OCaml heap - and
>> hence garbage collected properly when there are no references
>> anymore.
>> But, the actual executable code isn't. Basically what is done to
>> that is that the buffer is allocated by Meta.static_alloc, the code
>> is read there, and then Meta.reify_bytecode is invoked on it -
>> which merely wraps the pointer in a Closure_tag block it allocates.

Thanks for your input on this matter.

> Please note that for code to be GC-ed, the garbage collector has to
> handle specially every closure, looking into the code pointed by the
> GC, etc. This might slow down the GC significantly (given that the
> pointer to code is usually in the body of the code chunk, not to its
> beginning).

Not necessarily at all - if code pointers are handled as a pointer to
the start of a block plus an offset, the GC can work exactly as it
does now. This of course introduces runtime overhead in calling
closures. If the GC is instead augmented to handle blocks of type
Closure_tag specially, so that they keep the direct code pointer but
also contain a pointer to the start of the memory block, then there
will be no runtime overhead in calling closures, minimal overhead in
garbage collection (update of two pointers instead of one when moving
a block), and only one extra word of storage for each closure.
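
As a rough sketch of that second variant (all names here are
hypothetical, and this is not the actual layout of a Closure_tag block
in the OCaml runtime), a closure could carry both words, and a
collector that moves a code block would fix up both of them:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical closure layout, for illustration only. */
typedef struct {
    unsigned char *code;  /* direct pointer into the code, used for calls */
    unsigned char *base;  /* start of the memory block holding that code */
} direct_closure;

/* To be called by a (hypothetical) collector after it has moved a code
   block from old_base to new_base: both pointers are updated, and the
   distance between them is preserved. */
static void direct_closure_relocate(direct_closure *c,
                                    unsigned char *old_base,
                                    unsigned char *new_base)
{
    assert(c->base == old_base);
    ptrdiff_t offset = c->code - c->base;
    c->base = new_base;
    c->code = new_base + offset;
}

/* Calling the closure only needs the first word: a single load. */
static unsigned char *direct_closure_entry(const direct_closure *c)
{
    return c->code;
}
```

The extra word is only ever read by the collector, which is why the
call path stays as cheap as it is with a single code pointer.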

> Also both the bytecode and the native compiler should share a common
> behavior on this.

Um, I believe this concerns only interpreting bytecode. We are talking
about dynamic linking - loading code at runtime. If the code is
statically loaded, there is hardly any need to garbage collect it. It
could result in small gains in code size for code parts that are
executed only once and then never again, but that's minimal. Same
thing for the native compiler - there's no need to free actual code
pages from the executable if we are running native code. And
dynamically linking native code - that is, taking a .o file into a
running executable, resolving its unresolved symbol references at
runtime, and being able to access it - doesn't seem feasible at
all. And dynamically linking native code to a bytecode executable
seems even less feasible. Dynamically loading bytecode from an
executable compiled natively with ocamlopt doesn't seem to work
directly, but Asmdynlink (part of CDK) seems to do it with some
hacking, so it shouldn't be impossible - but this should not break it.

So, for this thing we are *only* talking about dynamically linking
bytecode - and being able to free the code when it is no longer
needed.

> Besides, most of the runtime (including the bytecode interpreter)
> rely upon the fact that code is not moved (ie remains at a fixed
> address). If it was moved, the GC would have to update the code
> pointer referenced in closures which would be difficult. (Also, for
> moving machine code in ocamlopt, you'll have to flush -in a system
> dependent way- the instruction cache, which is expensive). Not
> moving code means that you cannot copy or compact it, as the GC does
> for most values.

The GC would only have to update the code pointer in closures if that
is a plain pointer to the actual code address to jump to. If, for
example, _every_ closure had a pointer to the start of an allocated
block, and an offset from it - the GC would not have to be modified at
all. The runtime cost in this example would be two loads and an add,
compared to one load.
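
A minimal sketch of that base-plus-offset representation, with
hypothetical names and no claim to match the real runtime's closure
layout, makes the "two loads and an add" explicit:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical closure layout, for illustration only. */
typedef struct {
    unsigned char *base;  /* start of the allocated code block; an
                             ordinary block pointer as far as the GC
                             is concerned */
    size_t offset;        /* entry point, relative to base */
} offset_closure;

/* Computing the address to jump to: load base, load offset, add. */
static unsigned char *offset_closure_entry(const offset_closure *c)
{
    return c->base + c->offset;
}
```

If the block is ever moved, only base has to change, which is the
update a moving collector already performs for any pointer to the
start of a block; offset stays valid as it is.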

And as already mentioned, moving machine code is not meaningful as the
code pages are not linked or allocated by us, but by the OS (or
libc/ELF runtime).

> I understand your wish (and in an ideal world I do share it) but I
> also think that making code garbage collected would impact a big
> part of the ocaml runtime system (and would, for example, probably
> make ocaml's GC slower, even for most of the applications which
> don't load any code).

If we do it only for dynamically linked bytecode, and accept a runtime
performance cost for that code compared to the rest, I believe it is
doable without impacting other performance at all. And the changes
could be rather well localized.

> FWIW, some old versions of SML/NJ (maybe 0.93...) did actually
> garbage-collect and move machine code, so this is in principle
> doable, but it is difficult and involves a series of tradeoffs
> (different from those of Ocaml3). Also, most of Common Lisp
> implementations probably collect code (even in machine code form).

I believe Common Lisp is a bit different in this respect, because it
actually generates code at runtime - something that I don't think
OCaml does at all.

> So I would believe that to make code GC-able would require a big lot
> of work, and I understand that it is not currently a top priority of
> the Cristal team.

I understand. However, even if the Cristal team is not interested in
spending effort towards making this happen, for the moment, I am.

Alain Frisch and I have been discussing the possibility of
implementing this with a rather minimal change to the runtime system,
without touching garbage collection at all - or rather, Alain has been
describing his idea, and I have been trying to understand it.

If we can figure out the final implementation plan on this - I will
probably start implementing this rather soon.

> Making code garbageable is a big design decision which should be
> done when starting to implement a language. It cannot be made
> afterwards without a lot of pain.

Yes, achieving true garbage collection on all parts of the code
without incurring a runtime cost or making the garbage collector
exceedingly inefficient is indeed a hard thing to get right. However,
adding limited garbage collection for some parts of the code might
still be doable.

Best wishes,
-- Naked
