Version française
Home     About     Download     Resources     Contact us    
Browse thread
substring match like "strstr"
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: -- (:)
From: eijiro_sumii@a...
Subject: Re: substring match like "strstr"
Hi everyone,

Thank you very much for a lot of kind advice!

Jean-Christophe Filliatre <Jean-Christophe.Filliatre@lri.fr> wrote:
> I   missed  the   beginning  of   the  discussion,   but  implementing
> Knuth-Morris-Pratt is quite easy. The  code is given below (26 lines),

Actually, I found the (almost) same code on the web and tried it.  The
results (again, execution time in seconds) were:

		SPARC	Pentium
	strstr	52.68	57.050
	kmp	111.52	143.490

The SPARC machine is the same as before.  The Pentium machine is
running Linux 2.2.17 inside VMWare 2.0.3 (with 48MB virtual main
memory) on Windows 2000 with a Pentium II 400MHz processor and 128MB
main memory.

Since my program calls the function more than 3000 times for _each_
pattern, I did partial application, of course.  Even so, kmp in OCaml
still performed much worse than strstr in libc (both Sun's and GNU's),
at least in this program.

If you'd like to check it yourself, the source code is available at
the same URL as before
(http://www.yl.is.s.u-tokyo.ac.jp/~sumii/tmp/hc.tar.gz).

Julian Assange <proff@iq.org> wrote:
> I'm interested in this library. Can you tell me a little more? What's
> your ETA?

It is explained a little at:

  http://www.hypothesiscreator.net/

It has been developed in C++, but we began to port it to OCaml for
clarity and convenience (and fun:-).

Julian Assange <proff@iq.org> wrote:
> If you are really calling strstr 400,000,000 times, I strongly suggest
> the use of Boyer-Moore. Is your alphabet amino-acids or base-pairs? If

Can Boyer-Moore (in OCaml) be _much_ faster than KMP, so that it beats
strstr in libc?

John Prevost <jmp@arsdigita.com> wrote:
> I just grabbed the glibc version, and found the following comment at
> the head of it:

That was exactly what I found too (and mentioned a bit in my previous
message).  Maybe I should also disassemble and check Sun's strstr...?

Regards,

Eijiro