Ocaml sums the harmonic series -- four ways, four benchmarks: floating point performance
Date: 2005-01-13 (19:01)
From: Will M. Farr <farr@M...>
Subject: Re: [Caml-list] Ocaml sums the harmonic series -- four ways, four benchmarks: floating point performance
Is the PowerPC ocamlopt back-end less optimized than the x86? I didn't realize that ocamlopt did enough optimizations that the backend would be substantially different on the different architectures (in the manual they say that it compiles the code essentially as written -- no loop unrolling, etc). Are you sure that there isn't just a built-in instruction on the x86 that adds an int to a float?

Will

On 13 Jan 2005, at 12:29 PM, John Prevost wrote:

> On Thu, 13 Jan 2005 10:53:16 -0500, Will M. Farr <farr@mit.edu> wrote:
>> Each invocation was compiled with "ocamlopt -unsafe -noassert
>> -o harmonic harmonic.ml". It looks like using references and
>> loops is *by far* the fastest (and also that my PowerBook is
>> pretty slow to convert int->float, but I don't think this is
>> related to ocaml, since the C version does the same thing).
>
> Note that this is dependent on what CPU you're using. On my test
> system (700MHz AMD Athlon with 256MB of memory), I saw this behavior:
>
> time ./harmonic 1000000000:
>
> harmonic:
> you: 2m01.590s .. 0m00.790s
> me: 0m30.811s .. 0m00.120s
>
> harmonic2:
> you: 2m00.340s .. 0m00.440s
> me: 0m30.847s .. 0m00.140s
>
> harmonic3:
> you: 1m44.350s .. 0m00.740s
> me: 0m38.002s .. 0m00.130s
>
> harmonic4:
> you: 1m12.680s .. 0m00.430s
> me: 1m14.603s .. 0m00.220s
>
> So on this system, harmonic4 is by far the slowest, and the fastest
> version is the one that uses float_of_int and tail recursion. It's
> unclear to me how much of this is that the Intel compiler is simply
> better optimized than the PPC compiler.
>
> John.
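
For readers without the earlier messages: the original harmonic.ml is not reproduced in this message, so the following is only a rough sketch of the kinds of variants being compared, not the code that was actually benchmarked. The function names are hypothetical, and which of them (if any) corresponds to harmonic through harmonic4 is an assumption. It shows a loop with references, a tail-recursive version using float_of_int, and a loop that keeps a float counter to avoid the int->float conversion altogether.

```ocaml
(* Hypothetical reconstructions; not the benchmarked harmonic.ml. *)

(* References and a for loop, converting the int counter on each step. *)
let harmonic_loop n =
  let sum = ref 0.0 in
  for i = 1 to n do
    sum := !sum +. 1.0 /. float_of_int i
  done;
  !sum

(* Tail recursion with float_of_int and an accumulator. *)
let harmonic_rec n =
  let rec go i acc =
    if i > n then acc
    else go (i + 1) (acc +. 1.0 /. float_of_int i)
  in
  go 1 0.0

(* Loop with a float counter: no int->float conversion in the body. *)
let harmonic_float_counter n =
  let sum = ref 0.0 in
  let x = ref 1.0 in
  for _i = 1 to n do
    sum := !sum +. 1.0 /. !x;
    x := !x +. 1.0
  done;
  !sum
```

All three add the terms in the same order, so they should return the same value for a given n; any timing difference would then come from loop structure and the cost of the conversion on a particular CPU and back-end.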