Version française
Home     About     Download     Resources     Contact us    

This site is updated infrequently. For up-to-date information, please visit the new OCaml website at

Browse thread
substring match like "strstr"
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: -- (:)
From: Julian Assange <proff@i...>
Subject: Re: substring match like "strstr"
Gerd Stolpmann <> writes:

> >The results (execution time in seconds) were as follows.
> >
> >  strstr   55.74
> >  regexp  154.37
> >  OCamlA  302.57
> >  OCamlB  129.23
> I did not expect that OCamlB performs so well; so I suggested OcamlA and
> regexp (which are both fast for the bytecode interpreter, too). I think it
> depends also on the problem size (especially on the length of the substring).
> Gerd

If you are really calling strstr 400,000,000 times, I strongly suggest
the use of Boyer-Moore. Is your alphabet amino-acids or base-pairs? If
the latter, I have developed parallel hashing version of Boyer-More
(lets call it Assange-Boyer-More :), which beats the pants off
Knuth-Moris-Pratt. 500 long substrings searches are only about 50%
slower than 1, although for this application, if you have the memory,
you might also want to look at suffix trees or suffix arrays.

 Julian Assange        |If you want to build a ship, don't drum up people
                       |together to collect wood or assign them tasks          |and work, but rather teach them to long for the endless  |immensity of the sea. -- Antoine de Saint Exupery