Version française
Home     About     Download     Resources     Contact us    

This site is updated infrequently. For up-to-date information, please visit the new OCaml website at

Browse thread
[Caml-list] Q: automatic forgetting cache, module Weak, Gc control
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: 2004-06-30 (11:06)
From: Jan Kybic <kybic@f...>
Subject: [Caml-list] Q: automatic forgetting cache, module Weak, Gc control

        I am implementing a complicated numerical computation
algorithm in OCaml. (Basically, it is a Fast Mulipole Method
accceleration for a Boundary Element Method.) There is a lot of
intermediate results of several different types that are expensive to
calculate but that are used repeatedly in subsequent
calculations. Therefore, I am caching (memoizing) these intermediate
results using a hash table, like this:

  val f : ('a -> 'b) -> ('a -> 'b)

  let memo f = 
    let h = Hashtbl.create 11 in
    fun x -> try 
      Hashtbl.find h x 
    with Not_found -> 
      let r = f x in ( Hashtbl.add h x r; r )

This works fine for smaller problems and the calculation is much
accelerated. However, when the problem gets big and the amount of
memory needed to store the intermediate results exceeds the amount of
available RAM, the operating system (Linux) starts to swap the cache
memory to disk. At this point, it is no longer advantageous to cache
the values, it is actually faster to recalculate them then to retrieve
them from the disk. 

The solution seems to be to implement a forgetting LRU
(last-recently-used) cache of a fixed maximum size. Here come my

1) Has anyone implemented that in Ocaml already? 
   I have found
   Do you have any experience with it?

2) I have several different types of objects (functions) to cache and
   it is difficult to find out a priori, how much memory should be
   devoted to each type, if each cache is implemented separetely.  It
   would be nice to be able to make all caches share the same pool but
   storing values of different type (and size) in the same structure
   conflicts with the type safety. Should I use the Obj module to
   circumvent this, or is there a more elegant way around?

3) Many (but not all) of the precomputed values are floats. Does it
   pay off to treat them separately? Is there a way to avoid storing
   them boxed, say in a hash-table?

4) An ideal way would seem to be using the Weak module and the weak hash
   table implemented there. Will that work as expected? From reading
   the documentation I have the impression that the garbage collection
   will probably collect the weak values too early, at first minor
   collection. Is it correct? Is there a way to tune the garbage
   collector to leave the weak values alone as long as the total
   memory usage is below a certain threshold?

Thanks very much in advance for all your thoughts and ideas.


Jan Kybic <>                  tel. +420 2 2435 7264
       or <>,

To unsubscribe, mail Archives:
Bug reports: FAQ:
Beginner's list: