mboxlib reloaded ;-)

Date: 2007-04-27 (23:12)
From: Oliver Bandel <oliver@f...>
Subject: Re: [Caml-list] mboxlib reloaded ;-)

Only a short note, because I will not explore it in detail tonight...

On Fri, Apr 27, 2007 at 05:29:11PM +0100, Richard Jones wrote:
> On Fri, Apr 27, 2007 at 03:54:25PM +0200, Oliver Bandel wrote:
> > Hello,
> > 
> > after two years of doing nothing on it,
> > I today found my mboxlib, I started to
> > write in 2005.
> > 
> > I have put the mli-file on the web and
> > maybe the library itself will follow
> > during the next time.
> > 
> > Any feedback, questions and suggestions are welcome.
> > 
> >
> The source for COCANWIKI[1] contains extensive support for threading
> of mail messages, based on JWZ's algorithm:

Nice... you speak of an optimized algorithm for threading.
I have not explored your solution or your paper in detail yet
(I think I will have time for that tomorrow),
but IMHO the best way to handle message threads
is to use a trie data structure with message IDs
as identifiers (instead of chars, as are normally used).

So: did you reimplement the trie data structure
as an abstraction over message IDs, or did you
do it differently?
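
To make the idea concrete, here is a minimal sketch of a trie whose edges
are labelled with message IDs rather than chars. It assumes each message is
filed under the chain of IDs leading from the thread root to the message
itself (roughly its References header plus its own Message-ID). All names
(SMap, insert, find, replies) are hypothetical; this is neither mboxlib's
nor COCANWIKI's code, and it is not JWZ's algorithm.

```ocaml
(* Edges are keyed by message ID strings instead of single chars. *)
module SMap = Map.Make (String)

(* A node holds an optional payload and its children, one per reply. *)
type 'a trie = Node of 'a option * 'a trie SMap.t

let empty = Node (None, SMap.empty)

(* Insert [value] under [path], the chain of message IDs from the
   thread root down to the message itself. *)
let rec insert path value (Node (v, children)) =
  match path with
  | [] -> Node (Some value, children)
  | id :: rest ->
    let child =
      match SMap.find_opt id children with
      | Some c -> c
      | None -> empty
    in
    Node (v, SMap.add id (insert rest value child) children)

(* Walk down [path] and return the node found there, if any. *)
let rec descend path (Node (_, children) as node) =
  match path with
  | [] -> Some node
  | id :: rest ->
    (match SMap.find_opt id children with
     | Some c -> descend rest c
     | None -> None)

(* Payload stored at [path], if any. *)
let find path t =
  match descend path t with
  | Some (Node (v, _)) -> v
  | None -> None

(* Message IDs of the direct replies to the message at [path]. *)
let replies path t =
  match descend path t with
  | Some (Node (_, children)) -> List.map fst (SMap.bindings children)
  | None -> []
```

With this layout, a whole subthread is just the subtree below a path, and a
message whose parent has not been seen yet still gets a place, because the
intermediate nodes are created with an empty payload.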

> You are of course welcome to copy this.  If there are any license
> issues let me know & I can fix them.
> I'd also like to point you to another useful JWZ doc:

Well, the same here: tomorrow I can look at it in more detail.
But today I also found that fast mbox handling is a problem,
when I used a test mbox of about 100 MB for the first time.
Normally I would test with files of a few MB, because I think
this is the normal size; but I had some discussions on the
Berlin Linux user group, and some people were annoyed that
mutt needs some seconds to read in mbox files of about
80 MB.

So I then checked my mboxlib and saw that it is quite slow
compared to what I expected (expected! I did not try it
on my development machine, because I have no mutt installed there),
and even if native code is much faster, it is nevertheless slow...
I thought I would have to redesign my scanner stage.
(I use the Str module and ocamllex mixed together; maybe
 a plain hand-written OCaml scanner would be better here.)
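
As an illustration of what a plain hand-written scanner could look like,
here is a small sketch that finds message boundaries in an mbox already
loaded into a string, without Str or ocamllex. It treats every line that
begins with "From " as the start of a message, which is a simplification:
real mbox variants differ in whether a blank line must precede it and in
how such lines are quoted inside bodies. The function names are
hypothetical and this is not how mboxlib does it.

```ocaml
(* True if the text at offset [i] starts with "From ".
   Compares chars directly to avoid allocating substrings. *)
let is_from_line s i =
  i + 5 <= String.length s
  && s.[i] = 'F' && s.[i + 1] = 'r' && s.[i + 2] = 'o'
  && s.[i + 3] = 'm' && s.[i + 4] = ' '

(* Offsets of all lines that begin with "From ", in ascending order.
   The scan visits each line start once and jumps to the next newline. *)
let message_offsets s =
  let len = String.length s in
  let rec loop i acc =
    if i >= len then List.rev acc
    else
      let acc = if is_from_line s i then i :: acc else acc in
      match String.index_from_opt s i '\n' with
      | Some nl -> loop (nl + 1) acc
      | None -> List.rev acc
  in
  loop 0 []

(* Cut the mbox into one string per message using those offsets.
   Anything before the first "From " line is dropped. *)
let split_messages s =
  let len = String.length s in
  let rec cut = function
    | [] -> []
    | [start] -> [String.sub s start (len - start)]
    | start :: (next :: _ as rest) ->
      String.sub s start (next - start) :: cut rest
  in
  cut (message_offsets s)
```

The point of the sketch is that the boundary search touches only line
starts; header parsing can then run per message, and only for the
messages that are actually needed.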


P.S.: 12 seconds for 100 MB seems to be quite slow...
      I call the lexer very often, and that might be done
      Maybe your pages will show some useful attempts.