Version française
Home     About     Download     Resources     Contact us    

This site is updated infrequently. For up-to-date information, please visit the new OCaml website at

Browse thread
strange timing between search trees
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: 2008-01-25 (02:34)
From: Jon Harrop <jon@f...>
Subject: Re: [Caml-list] strange timing between search trees
On Friday 25 January 2008 01:32:49 Christophe Raffalli wrote:
> Dear list members,
> I wanted to compare 2-3-4 trees (look in wikipedia if you do not know
> them) with the balanced trees of the standard library.
> I expected the 2-3-4 to be much faster for search because they use much
> less indirections. However, I thought
> that construction should make little difference ...
> I was wrong :
> Construction is arround 20% faster for 2-3-4 trees
> Search is slower arround 5-10% for 2-3-4 trees (the diff gets smaller
> when the trees are larger which is expected)
> I wonder if the difference in code size is the explanation (the search
> function for balanced trees is really small and fits better
> in cache) ?

I doubt it. You can make the built-in Set module much faster by increasing the 
code size.

> I attach my code with two files (the code and the test, compile with
> ocamlopt unix.cmxa
> Any remarks or comments ?

I get quantitatively similar performance results. You're missing a lot of 
information in your benchmarking though:

What happens if the sets are constructed in-order (affects locality)?

What happens if you iterate the benchmark over an array of preallocated sets?

I did lots of benchmarking along these lines for the optimization chapter in 
OCaml for Scientists (including benchmarking sets). I found that there are 
several alternative set implementations out there but they all give almost 
identical performance (as you're observing). However, the AVL trees of the 
built-in sets provide asymptotically more efficient set-theoretic operations 
(union, diff, inter) than most other implementations.

I also found that the benchmarking strategy more representative of the real 
code that I had was to iterate over preallocate sets, i.e. more cache misses.

Ultimately, I found it much more productive to simply optimize the Set module 
that comes with OCaml. I'll describe how in a later OCaml Journal article but 
the main ideas are to get the comparison function inlined (OCaml doesn't 
inline across functor boundaries) and add a new node type for leaves. The 
latter greatly reduces the stress on the GC, which is the main performance 
bottleneck of the current implementation, and I found it to be up to 30% 

We've also discussed this in detail before here so you might find more 
information in the caml-list archives.


Dr Jon D Harrop, Flying Frog Consultancy Ltd.