English version
Accueil     À propos     Téléchargement     Ressources     Contactez-nous    

Ce site est rarement mis à jour. Pour les informations les plus récentes, rendez-vous sur le nouveau site OCaml à l'adresse ocaml.org.

Browse thread
Ocaml sums the harmonic series -- four ways, four benchmarks: floating point performance
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: 2005-01-13 (19:01)
From: Will M. Farr <farr@M...>
Subject: Re: [Caml-list] Ocaml sums the harmonic series -- four ways, four benchmarks: floating point performance
Is the PowerPC ocamlopt back-end less optimized than the x86?  I didn't 
realize that ocamlopt did enough optimizations that the backend would 
be substantially different on the different architectures (in the 
manual they say that it compiles the code essentially as written -- no 
loop unrolling, etc).  Are you sure that there isn't just a built-in 
instruction on the x86 that adds an int to a float?


On 13 Jan 2005, at 12:29 PM, John Prevost wrote:

> On Thu, 13 Jan 2005 10:53:16 -0500, Will M. Farr <farr@mit.edu> wrote:
>> Each invocation was compiled with "ocamlopt -unsafe -noassert
>> -o harmonic harmonic.ml".  It looks like using references and
>> loops is *by far* the fastest (and also that my PowerBook is
>> pretty slow to convert int->float, but I don't think this is
>> related to ocaml, since the C version does the same thing).
> Note that this is dependent on what CPU you're using.  On my test
> system (700MHz AMD Athlon with 256MB of memory), I saw this behavior:
> time ./harmonic 1000000000:
> harmonic:
>   you: 2m01.590s .. 0m00.790s
>    me: 0m30.811s .. 0m00.120s
> harmonic2:
>   you: 2m00.340s .. 0m00.440s
>    me: 0m30.847s .. 0m00.140s
> harmonic3:
>   you: 1m44.350s .. 0m00.740s
>    me: 0m38.002s .. 0m00.130s
> harmonic4:
>   you: 1m12.680s .. 0m00.430s
>    me: 1m14.603s .. 0m00.220s
> So on this system, harmonic4 is by far the slowest, and the fastest
> version is the one that uses float_of_int and tail recursion.  It's
> unclear to me how much of this is that the Intel compiler is simply
> better optimized than the PPC compiler.
> John.