Version franēaise
Home     About     Download     Resources     Contact us    
Browse thread
[Caml-list] The invert Benchmark
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: -- (:)
From: Jean-Christophe Filliatre <Jean-Christophe.Filliatre@l...>
Subject: Re: [Caml-list] The invert Benchmark

Christophe TROESTLER writes:
 > Dear Caml riders,
 > 
 > I found by chance the "The invert Benchmark"
 > (http://www.lib.uchicago.edu/keith/crisis/benchmarks/invert/).  As you
 > will notice the Caml code (even compiled) performs poorly.  I guess
 > part of the problem is due to using Map when Hashtbl is more suited.
 > So I tried to rewrite the code using Hashtbl (attached to this mail).
 > What I got some trouble to figure out is how to get a list of the keys
 > where each of the keys appears only once.  I eventually went the easy
 > way.  Anybody got better ideas to improve efficiency?  Could a "keys"
 > function be an interesting addition to Hashtbl???

As  suggested  by  Xavier,  String.concat  substituted  for  the  join
function really helps.

A simple profiling  with gprof also showed that both  in your code and
in  the  original one,  a  lot of  time  was  spent building  regexps!
Factorizing  out Str.{regexp,regexp_string} "\t"  (i.e. doing  it only
once in a global variable) really improves performances.

Anyway, using Str.split to cut a  string in two is a bit costly; doing
it directly as Markus did also improves performances.

-- 
Jean-Christophe Filliātre (http://www.lri.fr/~filliatr)

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners