[Caml-list] Searching large lists
From: Jerome Vouillon <jerome.vouillon@i...>
Subject: Re: [Caml-list] Searching large lists
On Thu, Nov 08, 2001 at 06:06:57AM -0800, Andrew Lawson wrote:
>      I have a list containing up to 100,000 strings
> between 10 and 200 characters in length. I want to
> produce a list of those that match a regular
> expression. It seems that the obvious way is to
> List.filter with a predicate returning true if the
> string matches, however in my case this can take up to
> 15 seconds. Has anyone got any ideas for speeding this
> up?

The Str library is really slow.

For Unison (http://www.cis.upenn.edu/~bcpierce/unison/), we wrote our
own regular expression library to get acceptable performance.

You should try PCRE (http://sourceforge.net/projects/pcre-ocaml/) or maybe RE
(http://sourceforge.net/projects/libre/).

If you compile to native code, the RE library should be the fastest in
your case (probably about 5 to 10 times faster than PCRE).  It is
still under development though, so some features are missing.
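
Whichever library is used, it is worth checking that the regular
expression is compiled once, outside the predicate, rather than on every
call. Below is a minimal sketch of that pattern. It uses Str only because
Str ships with the standard distribution, so it can serve as a baseline to
time against. The helper names contains and filter_matching are made up
for illustration, and the equivalent PCRE or RE calls are not shown here.

```ocaml
(* Requires the Str library from the standard OCaml distribution
   (link with str.cma / str.cmxa, or use the "str" findlib package). *)

(* [contains re s] is true when [re] matches somewhere in [s].
   Str.search_forward raises Not_found when there is no match. *)
let contains re s =
  try ignore (Str.search_forward re s 0); true
  with Not_found -> false

(* Hypothetical helper: compile [pattern] once, then filter [lines]. *)
let filter_matching pattern lines =
  let re = Str.regexp pattern in   (* compiled a single time *)
  List.filter (contains re) lines

let () =
  let lines = ["alpha-123"; "beta"; "gamma-7"; "delta"] in
  let hits = filter_matching "-[0-9]+" lines in
  List.iter print_endline hits
```

Partially applying contains to the compiled expression keeps the
compilation out of the per-string work, so any remaining cost comes from
the matching engine itself.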

-- Jerome