How to re-implement the GC?
| Date: | -- (:) |
| From: | Jon Harrop <jonathandeanharrop@g...> |
| Subject: | RE: [Caml-list] Re: How to re-implement the GC? |
Hi Eray,

Retrofitting a new multicore-friendly GC onto OCaml is difficult for two main reasons:

1. You want plug-and-play GCs but the OCaml compiler is tightly coupled to the old GC (although OC4MC has decoupled them!).

2. Recovering similar performance whilst reusing the same data representation is extremely difficult because the current design relies so heavily on lightweight allocation.

You really want to change the data representation to avoid unnecessary boxing (e.g. never box or tag ints, floats or tuples) in order to reduce the allocation rate and, consequently, the stress on the garbage collector. However, OCaml cannot express value types, and its ability to represent polymorphic recursion makes this extremely difficult to implement.

As Sylvain has said, OC4MC is your best bet if you want to try to write a new GC for OCaml. However, making more extensive changes has the potential to address many more problems (e.g. conveying run-time type information for generic printing), so you might consider alternatives like trying to compile OCaml's bytecode into HLVM's IR for JIT compilation, because HLVM already has a multicore-friendly GC and it is much easier to develop.

> Ah, that's interesting. I wonder if it provides any real speedup on new
> architectures compared to storing the pointer in RAM.

For a multicore GC, using a register to refer to thread-local data is a huge win because accessing thread-local data is very slow. I made that mistake in HLVM's first multicore-capable GC and it was basically useless as a consequence, because all of the time was spent looking up thread-local data. When I started passing the thread-local data around as an extra argument to every function (not necessarily consuming a register all the time, because LLVM is free to spill it), performance improved enormously.

Cheers,
Jon.
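To make the thread-local point above concrete, here is a minimal sketch in C of the two calling styles. It is not HLVM's code: the `thread_state` struct, the bump-allocating nursery and all function names are hypothetical, and C11 `_Thread_local` is only assumed as a stand-in for whatever lookup mechanism the first GC used. The sketch also ignores alignment and never collects.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-thread allocator state (illustrative names only). */
typedef struct {
    char  *next;            /* bump pointer into this thread's nursery */
    char  *limit;           /* one past the end of the nursery */
    size_t allocated_bytes; /* total bytes handed out so far */
} thread_state;

/* Style 1 keeps a pointer to the state in thread-local storage. */
static _Thread_local thread_state *current_state = NULL;

/* Point a state record at a caller-supplied buffer. */
void init_thread_state(thread_state *ts, char *buffer, size_t size) {
    ts->next = buffer;
    ts->limit = buffer + size;
    ts->allocated_bytes = 0;
}

/* Record the calling thread's state so that alloc_lookup can find it. */
void bind_thread_state(thread_state *ts) {
    current_state = ts;
}

/* Style 2: the state arrives as an explicit argument, so the compiler
   can keep it in a register or spill it as it sees fit. */
void *alloc_explicit(thread_state *ts, size_t n) {
    assert(ts != NULL);
    if ((size_t)(ts->limit - ts->next) < n)
        return NULL; /* nursery full; a real GC would collect here */
    void *p = ts->next;
    ts->next += n;
    ts->allocated_bytes += n;
    return p;
}

/* Style 1: every allocation first fetches the state from thread-local
   storage. bind_thread_state must have been called on this thread. */
void *alloc_lookup(size_t n) {
    return alloc_explicit(current_state, n);
}
```

With the explicit style, a caller obtains the state once (for example at thread start) and threads it through every call, as in `alloc_explicit(ts, 16)`, whereas `alloc_lookup(16)` repeats the thread-local fetch on each allocation. How much the fetch costs depends on the platform and on how thread-local storage is implemented there.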