[Caml-list] speed
Date: -- (:)
From: Brian Hurt <brian.hurt@q...>
Subject: Re: Coyote Gulch test in Caml (was Re: [Caml-list] speed )
On Tue, 21 Jan 2003, Nickolay Semyonov-Kolchin wrote:

> On Tuesday 21 January 2003 02:23, David Chase wrote:
> 
> Speed and accuracy are different things. Matlab-class software needs
> accuracy; most computer games need speed. This is the reason for the
> "-ffast-math" option in gcc. OCaml lacks such an option, and always
> produces inefficient floating-point code.
> 

From gcc's info:

`-ffast-math'
     This option allows GCC to violate some ANSI or IEEE rules and/or
     specifications in the interest of optimizing code for speed.  For
     example, it allows the compiler to assume arguments to the `sqrt'
     function are non-negative numbers and that no floating-point values
     are NaNs.

     This option should never be turned on by any `-O' option since it
     can result in incorrect output for programs which depend on an
     exact implementation of IEEE or ANSI rules/specifications for math
     functions.

Which raises a couple of questions.  The first question is whether 
-ffast-math mainly violates ANSI or IEEE rules.  If ANSI, we're OK: we 
just define the OCaml rules so we don't have to violate them.
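
To make the IEEE side of that question concrete, here is a minimal Python sketch (Python floats are IEEE doubles) of two properties of IEEE arithmetic that an optimizer would have to ignore in order to reorder sums or assume NaNs away. The function names are made up for illustration, and the sketch does not show which transformations any particular gcc version actually performs.

```python
def sum_left(a, b, c):
    # Evaluate as (a + b) + c, rounding after each addition.
    return (a + b) + c

def sum_right(a, b, c):
    # Evaluate as a + (b + c); algebraically equal, but rounded differently.
    return a + (b + c)

def is_nan(x):
    # Under IEEE rules NaN is the only value that is not equal to itself.
    # A compiler allowed to assume "no NaNs" could fold this to False.
    return x != x

# Reassociating the sum changes the answer: 1.0 is lost when it is
# first added to -1e16, because doubles near 1e16 are spaced 2 apart.
print(sum_left(1e16, -1e16, 1.0))   # 1.0
print(sum_right(1e16, -1e16, 1.0))  # 0.0

print(is_nan(float("nan")))  # True
print(is_nan(1.0))           # False
```

A program that relies on either behaviour (a carefully ordered summation, or a NaN check written as `x != x`) is the kind of program the gcc text warns about.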

But then this brings up the issue of conformity vs. performance.  For
example, the x86 has 80-bit FP registers in 8087-legacy mode, but
64-bit double precision if you're using SSE2.  And PowerPC and PA-RISC both 
have extended precision fused multiply-adds (that keep higher precision, 
i.e. don't round, between the multiply and the add).  For that matter, 
could a "conforming" implementation of OCaml use 32-bit single precision 
SSE-1 arithmetic?
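
The fused multiply-add point can be illustrated without special hardware. The sketch below is an emulation, not how any compiler or CPU implements it: it assumes that "fused" means computing a*b + c exactly and rounding once, and models that with Python's exact `fractions.Fraction` arithmetic. The function names are hypothetical.

```python
from fractions import Fraction

def mul_then_add(a, b, c):
    # Ordinary double arithmetic: the product is rounded to a double,
    # then the sum is rounded again (two roundings).
    return a * b + c

def fused_mul_add(a, b, c):
    # Emulated fused multiply-add: compute a*b + c exactly as a rational
    # number, then round to a double once at the end.
    return float(Fraction(a) * Fraction(b) + Fraction(c))

a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0

# Exactly, a*b = 1 - 2**-60, which is too close to 1.0 to survive
# rounding to a double, so the unfused version loses it entirely.
print(mul_then_add(a, b, c))   # 0.0
print(fused_mul_add(a, b, c))  # -2**-60, about -8.67e-19
```

Both answers are "correct" for their own rounding rules, which is why a language definition has to say whether a compiler may fuse the two operations.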

http://www.cs.berkeley.edu/~wkahan/JAVAhurt.pdf
http://www.cs.berkeley.edu/~wkahan/Curmudge.pdf

As a general rule, I prefer the higher precision when it doesn't hurt 
enormously.  Basically, keeping things at at least 64-bit IEEE FP is a 
good idea: except in special cases, the speed advantage of going down to 
single precision isn't worth the lost accuracy.
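
As a rough illustration of what going down to single precision costs, the sketch below rounds a double to the nearest 32-bit IEEE single by packing it with Python's `struct` module and reading it back. The helper name `to_single` is made up for this example.

```python
import struct

def to_single(x):
    # Pack as a 32-bit IEEE single, then unpack back into a double.
    # The result is x rounded to the nearest single-precision value.
    return struct.unpack("f", struct.pack("f", x))[0]

# A single carries a 24-bit significand, so it cannot tell
# 2**24 + 1 apart from 2**24; a double can.
print(to_single(16777217.0))  # 16777216.0

# 0.1 is inexact in both formats, but the single is much further off.
print(to_single(0.1))                # 0.10000000149011612
print(abs(to_single(0.1) - 0.1))     # error introduced by the narrowing
```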

Oh, and if we're talking about performance, memory behavior is much more 
important than precision of floating point primitives (so long as FP is in 
hardware).  A complex FP operation may take tens of clock cycles, but a 
cache miss now takes hundreds.  The most important paper about numeric 
performance of OCaml might be this one:
http://www.cs.princeton.edu/~mjrg/fpca95.ps.Z
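
A back-of-the-envelope version of the "tens versus hundreds" argument, as a sketch. The specific numbers (20 cycles per FP operation, 300 cycles per cache miss, 100 misses per 1000 operations) are assumptions picked only to match those orders of magnitude; they are not measurements of any machine.

```python
def total_cycles(n_ops, fp_cycles, n_misses, miss_cycles):
    # Simple additive cost model: every operation pays the FP cost,
    # and each cache miss adds a fixed stall on top.
    return n_ops * fp_cycles + n_misses * miss_cycles

# Assumed baseline: 1000 FP ops at 20 cycles, 100 misses at 300 cycles.
baseline = total_cycles(1000, 20, 100, 300)

# Option A: make the FP arithmetic twice as fast.
faster_fp = total_cycles(1000, 10, 100, 300)

# Option B: keep the arithmetic, halve the number of cache misses.
fewer_misses = total_cycles(1000, 20, 50, 300)

print(baseline)      # 50000
print(faster_fp)     # 40000
print(fewer_misses)  # 35000
```

Under these assumed numbers, halving the misses saves more than doubling the speed of the arithmetic, even though only one operation in ten misses.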

Brian

