Browse thread
strange timing between search trees
[
Home
]
[ Index:
by date
|
by threads
]
[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
| Date: | -- (:) |
| From: | Jon Harrop <jon@f...> |
| Subject: | Re: [Caml-list] strange timing between search trees |
On Friday 25 January 2008 01:32:49 Christophe Raffalli wrote: > Dear list members, > > I wanted to compare 2-3-4 trees (look in wikipedia if you do not know > them) with the balanced trees of the standard library. > I expected the 2-3-4 to be much faster for search because they use much > less indirections. However, I thought > that construction should make little difference ... > > I was wrong : > Construction is arround 20% faster for 2-3-4 trees > Search is slower arround 5-10% for 2-3-4 trees (the diff gets smaller > when the trees are larger which is expected) > > I wonder if the difference in code size is the explanation (the search > function for balanced trees is really small and fits better > in cache) ? I doubt it. You can make the built-in Set module much faster by increasing the code size. > I attach my code with two files (the code and the test, compile with > ocamlopt unix.cmxa set234.ml test234tree.ml) > > Any remarks or comments ? I get quantitatively similar performance results. You're missing a lot of information in your benchmarking though: What happens if the sets are constructed in-order (affects locality)? What happens if you iterate the benchmark over an array of preallocated sets? I did lots of benchmarking along these lines for the optimization chapter in OCaml for Scientists (including benchmarking sets). I found that there are several alternative set implementations out there but they all give almost identical performance (as you're observing). However, the AVL trees of the built-in sets provide asymptotically more efficient set-theoretic operations (union, diff, inter) than most other implementations. I also found that the benchmarking strategy more representative of the real code that I had was to iterate over preallocate sets, i.e. more cache misses. Ultimately, I found it much more productive to simply optimize the Set module that comes with OCaml. I'll describe how in a later OCaml Journal article but the main ideas are to get the comparison function inlined (OCaml doesn't inline across functor boundaries) and add a new node type for leaves. The latter greatly reduces the stress on the GC, which is the main performance bottleneck of the current implementation, and I found it to be up to 30% faster. We've also discussed this in detail before here so you might find more information in the caml-list archives. HTH. -- Dr Jon D Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/products/?e