English version
Accueil     À propos     Téléchargement     Ressources     Contactez-nous    

Ce site est rarement mis à jour. Pour les informations les plus récentes, rendez-vous sur le nouveau site OCaml à l'adresse ocaml.org.

Browse thread
[Caml-list] Q: automatic forgetting cache, module Weak, Gc control
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: 2004-07-01 (16:26)
From: Jan Kybic <kybic@f...>
Subject: [Caml-list] Re: Q: automatic forgetting cache, module Weak, Gc control
[This thread is about how to implement caching (memoizing) of function
values of different type with limited memory.]

> A more labor intensive idea would be to manage the
> caches yourself. This has the advantage -- and disadvantage --
> that you're totally in control.

I tried the weak hashtable from the Weak module. It did not work well,
apparently the values got garbage collected almost immediately.
I do not think there is a way to avoid it.

I also tried the LRU module of Dustin Sallings but it is too slow for me.

> One idea -- just use a fixed sized cyclic buffer,
> keep usage stats, and manually tune the buffer sizes.

I think the right idea is to insert each cached values into two
structures: a weak hashtable so that the value can be found fast, and 
another global FIFO type structure that will start to drop oldest values when
there is not enough memory. For efficiency, the FIFO structure will 
hold blocks (arrays). As the FIFO structure is global and will have to
hold different types of data, storing Obj.t seems to be apropriate.

I also have to deal with the fact that I want to cache different
fucntions that do not take the same effort to evaluate. I plan to
implement a self tuning strategy - when evaluating the function for
the first time, we will measure how much time it takes and then adjust
the block size, so that each block represents about the same
computational effort. 

I am also thinking about other strategies taking into account the
sizes of the produced results. In this case the global structure would
be an ordered set.

The disadvantage of putting the values into two structures is the
memory overhead. Maybe I should avoid using Weak hashtables
alltogether and store the values in a set of normal hashtables,
dropping some of them, if necessary... It is difficult to get the
memory/speed tradeoff right.

> Another more flexible and potentially expensive
> technique is to cache everything in a thread-safe
> manner, and run a separate 'reaper' thread to purge
> the most useless values. This has the advantage that
> the cache control is lexically isolated and so 
> can be worked on independently. Obviously needing
> to use a thread-lock to put values in the cache
> will cost big time. 

Yes, but what is the most useless value and how to find it


Jan Kybic <kybic@ieee.org>                  tel. +420 2 2435 7264
       or <kybic@fel.cvut.cz>,     http://cmp.felk.cvut.cz/~kybic

To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners