[Caml-list] swapping large data structures from/to files
Date: 2004-04-08 (15:58)
From: Basile Starynkevitch <basile.starynkevitch@i...>
Subject: Re: [Caml-list] swapping large data structures from/to files
On Thu, Apr 08, 2004 at 02:51:11PM +0000, Sebastien Ferre wrote:

> I am interested in handling data structures so large
> that they don't fit in main memory.

I am curious - what are your huge DAGs? (bio-informatics

> I need 2 things:

> 1. Persistency of the data structure, preferably in
> a file (similarly to NDBM, say).

Did you look into Persil on my home page (see my sig)? It provides
persistence either in small segmented files (which works reasonably
for small data, since the whole file gets copied at the end of the
process) or in MySQL4. If you need it, I could add another persistent
store (but I think that using a transactional database with Persil is
much better for big persistent data).

The most important issue is: do you need some kind of transaction
mechanism? I could write a better file-based persistent store,
provided you don't need [nested] transactions (with commit & abort
ability)!

You might also use Bigarrays, which can be mapped to files.
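
A minimal sketch of the Bigarray idea, assuming a recent OCaml (4.06 or
later) where the mapping function is Unix.map_file, and assuming a POSIX
system. The helper name map_floats and the file layout are illustrative
only. Bigarray data lives outside the OCaml heap, so only numeric cells
can be stored this way; DAG nodes would have to be encoded as numbers
(for instance, children as integer indices).

```ocaml
(* Requires the unix library (e.g. ocamlfind ocamlopt -package unix ...). *)

(* Map [n] float64 cells of [path] into memory. Because the mapping is
   shared and the dimension is given, the file is grown if it is too small.
   [map_floats] is a hypothetical helper name. *)
let map_floats path n =
  let fd = Unix.openfile path [Unix.O_RDWR; Unix.O_CREAT] 0o600 in
  let ga =
    try Unix.map_file fd Bigarray.float64 Bigarray.c_layout true [| n |]
    with e -> Unix.close fd; raise e
  in
  (* On POSIX systems the mapping stays valid after the descriptor is closed. *)
  Unix.close fd;
  Bigarray.array1_of_genarray ga

let () =
  let path = Filename.temp_file "dag" ".bin" in
  let a = map_floats path 1_000 in
  a.{42} <- 3.14;                  (* the write goes to the shared mapping *)
  let b = map_floats path 1_000 in (* a second view of the same file *)
  Printf.printf "%g\n" b.{42};
  Sys.remove path
```

With this approach the operating system decides which pages stay in
memory, so it gives persistence but not a customized swapping strategy.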

> 2. Customized swapping strategy of elements of the data
> structure, which should be more efficient than the
> virtual memory.

I'm not sure I fully understand your point. Persil does give the
ability to unload & reload persistent values on (explicit) demand.

Are you willing to state explicitly in your application (through
appropriate calls) "I won't need these particular values any more"?
Or do you want the system to guess that by itself?
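
To make the "explicit calls" option concrete, here is a small sketch
using only the standard Marshal module, with one file per value. It is
not Persil's interface; the names slot, handle, make, unload and get are
hypothetical and chosen for illustration (Fun.protect needs OCaml 4.08 or
later).

```ocaml
(* Hypothetical store: each value is either in memory or in its own file,
   and the application decides explicitly when to unload it.
   This is NOT Persil's API, only an illustration. *)
type 'a slot =
  | Loaded of 'a
  | On_disk

type 'a handle = { file : string; mutable slot : 'a slot }

(* Register a value; it starts in memory, backed by a fresh temp file. *)
let make v = { file = Filename.temp_file "node" ".bin"; slot = Loaded v }

(* "I won't need this value any more": write it out and drop the
   reference so that the GC can reclaim it. *)
let unload h =
  match h.slot with
  | On_disk -> ()
  | Loaded v ->
    let oc = open_out_bin h.file in
    Fun.protect ~finally:(fun () -> close_out oc)
      (fun () -> Marshal.to_channel oc v []);
    h.slot <- On_disk

(* Reload on demand and keep the value in memory until the next unload. *)
let get (h : 'a handle) : 'a =
  match h.slot with
  | Loaded v -> v
  | On_disk ->
    let ic = open_in_bin h.file in
    let v : 'a =
      Fun.protect ~finally:(fun () -> close_in ic)
        (fun () -> Marshal.from_channel ic)
    in
    h.slot <- Loaded v;
    v
```

Note that Marshal writes everything reachable from the value, so for a
DAG the nodes would have to refer to their children by identifier rather
than by pointer; otherwise unloading one node copies its whole subgraph.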

(For completeness: you can give hints to the VM system with the madvise
system call, but that won't work with OCaml, because values may be
moved by the GC.)

> Typically, my data structure is a DAG, and I wish to
> keep in memory only a limited amount of nodes at a time.

Is schema evolution a concern for you? I.e., if you change the types
implementing your DAG, how do you deal with the huge persistent data
in that case? (Persil does not handle this issue, since it uses the
Marshal module.)
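
To illustrate the concern: Marshal stores no type information, so data
written under an old type definition and read back under a new one is not
checked and may crash the program. One possible mitigation (an
assumption on my side, not something Persil does) is to write a
hand-maintained version number in front of the marshaled data. The node
type and the names schema_version, save and load below are hypothetical.

```ocaml
(* Hypothetical schema version, bumped by hand whenever [node] changes. *)
let schema_version = 2

(* Illustrative node type; children are referenced by identifier. *)
type node = { id : int; label : string; children : int list }

(* Write the version tag first, then the marshaled nodes. *)
let save path (nodes : node list) =
  let oc = open_out_bin path in
  Fun.protect ~finally:(fun () -> close_out oc) (fun () ->
    output_binary_int oc schema_version;
    Marshal.to_channel oc nodes [])

(* Return [Error v] when the file was written under schema version [v],
   instead of unmarshaling data whose layout may no longer match. *)
let load path : (node list, int) result =
  let ic = open_in_bin path in
  Fun.protect ~finally:(fun () -> close_in ic) (fun () ->
    let v = input_binary_int ic in
    if v <> schema_version then Error v
    else Ok (Marshal.from_channel ic : node list))
```

This only detects a mismatch; converting the old data still needs the old
type definition and a migration function written by hand.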

> Hence the necessity for swapping. It is also important
> to have as much as possible in memory, and not merely
> accessing the file, for efficiency reasons.

> Has anything been done in this direction?
> The library Dbm is fine to me for the persistency,
> but it does not work on every platform :-(.
> ( Would Dbm be difficult to rewrite in OCaml ?)

I think that there are quite portable versions of Dbm (or BSD DB).

> Sébastien Ferré

(You can answer me in French if you wish; if you CC the list, let's
continue in English.)

Basile STARYNKEVITCH -- basile dot starynkevitch at inria dot fr
Project - phone +33 1 3963 5197 - mobile 6 8501 2359 --- all opinions are only mine 
