mboxlib reloaded ;-)
Date: -- (:)
From: Oliver Bandel <oliver@f...>
Subject: Re: [Caml-list] mboxlib reloaded ;-)
Hi,

Only a short note, because I will not explore it in detail tonight...

On Fri, Apr 27, 2007 at 05:29:11PM +0100, Richard Jones wrote:
> On Fri, Apr 27, 2007 at 03:54:25PM +0200, Oliver Bandel wrote:
> > Hello,
> > 
> > after two years of doing nothing on it,
> > I today found my mboxlib, I started to
> > write in 2005.
> > 
> > I have put the mli-file on the web and
> > maybe the library itself will follow
> > during the next time.
> > 
> > Any feedback, questions and suggestions are welcome.
> > 
> >   http://me.in-berlin.de/~first/software/libraries/mboxlib/
> 
> The source for COCANWIKI[1] contains extensive support for threading
> of mail messages, based on JWZ's algorithm:
> 
> http://www.jwz.org/doc/threading.html

Nice... you speak of an optimized algorithm for threading.
I haven't explored your solution or the paper in detail
yet (tomorrow I think I will have the time to do it),
but IMHO the best thing for handling message threads
is to use a trie data structure with message IDs
as identifiers (instead of chars, as they are used normally).

So: did you reimplement the trie data structure
as an abstraction over message IDs, or did you
do it differently?
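
To make the idea concrete, here is a minimal sketch of such a trie.
It is not taken from mboxlib or from COCANWIKI; the names (thread,
insert, replies, size) are made up for illustration, and it assumes
each message comes with its References chain as a list of message
IDs, oldest ancestor first, followed by its own Message-ID.

```ocaml
(* Sketch: a trie whose edges are labelled with message IDs instead of chars.
   Assumption: a message is described by the path of message IDs leading
   from the thread root down to the message itself. *)
module SMap = Map.Make (String)

type thread = Node of thread SMap.t

let empty = Node SMap.empty

(* Insert a path of message IDs, creating missing nodes on the way. *)
let rec insert path (Node children) =
  match path with
  | [] -> Node children
  | id :: rest ->
    let child =
      match SMap.find_opt id children with
      | Some c -> c
      | None -> empty
    in
    Node (SMap.add id (insert rest child) children)

(* Number of messages stored in the trie (all nodes below the root). *)
let rec size (Node children) =
  SMap.fold (fun _ child acc -> acc + 1 + size child) children 0

(* Message IDs of the direct replies to the message reached via [path];
   the empty list if the path is unknown. *)
let rec replies path (Node children) =
  match path with
  | [] -> List.map fst (SMap.bindings children)
  | id :: rest ->
    (match SMap.find_opt id children with
     | Some child -> replies rest child
     | None -> [])
```

Inserting the paths ["a@x"], ["a@x"; "b@x"] and ["a@x"; "c@x"] gives
one thread with root a@x and two replies. Messages whose ancestors
are missing from the mbox still end up in the right place, because
the intermediate nodes are created on the way down.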


> 
> You are of course welcome to copy this.  If there are any license
> issues let me know & I can fix them.
> 
> I'd also like to point you to another useful JWZ doc:
> 
> http://www.jwz.org/doc/mailsum.html

Well, the same here: tomorrow I can look at it in more detail;
but fast mbox handling is a problem I also ran into today,
when I used a test mbox of about 100 MB for the first time.
Normally I would use files of a few MB, because I think
this is the normal size; but I had some discussions on the
Berlin Linux user group, and some people were annoyed that
mutt needs some seconds to read in mbox files of about
80 MB.

So I then checked my mboxlib and saw that it is quite slow
compared to what I expected (expected! I did not try it
on my development machine because I have no mutt installed there),
and even if native code is much faster, it's nevertheless slow...
...so I thought I have to redesign my scanner stage.
(I use the Str module and ocamllex mixed together; maybe
 using a plain self-written OCaml scanner might be better here.)
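
As a rough idea of what a plain hand-written scanner could look
like, here is a minimal sketch. It is not mboxlib code; the function
names are invented, and it makes the simplifying assumption that
every line beginning with "From " starts a new message and that the
whole mbox is already in memory as a string.

```ocaml
(* Sketch: find the offsets at which messages start in an mbox string.
   Assumption: every line beginning with "From " is a message separator. *)

(* Compare characters in place, so no substring is allocated per line. *)
let is_from_line s pos =
  pos + 5 <= String.length s
  && s.[pos] = 'F' && s.[pos + 1] = 'r' && s.[pos + 2] = 'o'
  && s.[pos + 3] = 'm' && s.[pos + 4] = ' '

(* Walk the string line by line and collect the start offsets. *)
let message_offsets s =
  let len = String.length s in
  let rec loop pos acc =
    if pos >= len then List.rev acc
    else
      let acc = if is_from_line s pos then pos :: acc else acc in
      match String.index_from_opt s pos '\n' with
      | Some nl -> loop (nl + 1) acc
      | None -> List.rev acc
  in
  loop 0 []
```

This touches each line only once and only looks at its first five
characters; header parsing could then be limited to the region
between two consecutive offsets.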


Ciao,
   Oliver

P.S.: 12 seconds for 100 MB seems to be quite slow...
      I call the lexer very often, and that might be done
      smarter.
      Maybe your pages will show some useful approaches.