[
Home
]
[ Index:
by date
|
by threads
]
[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
| Date: | -- (:) |
| From: | Julian Assange <proff@i...> |
| Subject: | Re: substring match like "strstr" |
Gerd Stolpmann <gerd@gerd-stolpmann.de> writes:
> >The results (execution time in seconds) were as follows.
> >
> > strstr 55.74
> > regexp 154.37
> > OCamlA 302.57
> > OCamlB 129.23
>
> I did not expect that OCamlB performs so well; so I suggested OcamlA and
> regexp (which are both fast for the bytecode interpreter, too). I think it
> depends also on the problem size (especially on the length of the substring).
>
> Gerd
If you are really calling strstr 400,000,000 times, I strongly suggest
the use of Boyer-Moore. Is your alphabet amino-acids or base-pairs? If
the latter, I have developed parallel hashing version of Boyer-More
(lets call it Assange-Boyer-More :), which beats the pants off
Knuth-Moris-Pratt. 500 long substrings searches are only about 50%
slower than 1, although for this application, if you have the memory,
you might also want to look at suffix trees or suffix arrays.
--
Julian Assange |If you want to build a ship, don't drum up people
|together to collect wood or assign them tasks
proff@iq.org |and work, but rather teach them to long for the endless
proff@gnu.ai.mit.edu |immensity of the sea. -- Antoine de Saint Exupery