Date: 2009-03-05 (02:27)
From: Stefan Monnier <monnier@i...>
Subject: Re: stl?
> Less than it might seem.  First of all, you have the allocation costs to
> hold the new vector- there's 3-5 clocks right there.  Let's say the vectors
> are 3 elements long.  So you have six loads and three stores- 
> assume stores are free, and loads cost 1 clock apiece- they're in L1 cache,
> and they're getting preloaded.  Plus the 3 clocks for the floating point
> adds.  So that's 12-15 clock cycles right there.  Say I'm being pessimistic
> by a factor of 2, and it's really only 6-8 clock cycles.  Versus 10 clock
> cycles or so for the call via functor.  Now we're down into the realm of
> only doubling the cost of the operation, not an order of magnitude increase.
> A single mispredicted branch or L1 cache miss that has to go out to L2
> cache- not even main memory, just L2!- would blow the overhead of calling
> via functor out of the water.

This presumes that the indirect function call only happens once
per vector.  But it may very well be once per vector plus once
per element (think of parameterizing list traversal over `map': since
it's abstracted, not only is the call to `map' indirect, but map's own
calls to the loop body are indirect as well).
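A minimal OCaml sketch of the distinction (function names here are illustrative, not from the thread): a monomorphic loop compiles the per-element work to direct instructions, while a fold that takes the loop body as an argument pays an indirect call on every element, not just once per traversal.

```ocaml
(* Direct version: the addition is known statically, so the loop body
   compiles to straight-line code with no calls. *)
let sum_direct (a : float array) : float =
  let s = ref 0.0 in
  for i = 0 to Array.length a - 1 do
    s := !s +. a.(i)
  done;
  !s

(* Abstracted version: [f] is a parameter, so not only is the call to
   [fold_indirect] potentially indirect, but each of its calls to [f]
   is an indirect call too - one per element. *)
let fold_indirect (f : float -> float -> float) (init : float)
    (a : float array) : float =
  let acc = ref init in
  for i = 0 to Array.length a - 1 do
    acc := f !acc a.(i)
  done;
  !acc

let sum_hof (a : float array) : float = fold_indirect ( +. ) 0.0 a
```

Both compute the same result; the point is only where the indirect calls land. With a 3-element vector, `sum_hof` incurs three indirect calls to the body on top of any call overhead for the traversal itself, which is the "once per vector plus once per element" case described above.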