Ocamlopt code generator question
From: Jon Harrop <jon@f...>
Subject: Re: [Caml-list] Ocamlopt x86-32 and SSE2
On Monday 11 May 2009 00:12:49 Matteo Frigo wrote:
> Do you guys have any sort of empirical evidence that scalar SSE2 math is
> faster than plain old x87?

I believe the motivation is to make good performance tractable in ocamlopt, so
it is more about ease of code generation than about the inherent performance
characteristics of the two approaches.

> I ask because every time I tried compiling FFTW with gcc -m32
> -mfpmath=sse, the result has been invariably slower than the vanilla x87
> compilation.  (I am talking about scalar arithmetic here.  FFTW also
> supports SSE2 2-way vector arithmetic, which is of course faster.)
>
> I also remember trying similar experiments with other numerical code in
> the Pentium 4 dark ages, with similar results.  I don't see any reason
> why this should be the case, and maybe this is just a problem of gcc,
> but I don't think you should automatically assume that SSE2 math is
> faster without running a few experiments first.

As I understand it, this is very much a problem with ocamlopt and not with
gcc. Specifically, floating-point code compiled by ocamlopt on x86 gives
mediocre performance for unknown reasons. Hence there is a desire to use more
modern solutions that simplify the generation of performant code.
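
For anyone wanting to run the kind of experiment suggested above on the OCaml
side, a minimal sketch of a scalar floating-point micro-benchmark might look
like the following. The names (dot, time_it), the array size and the
repetition count are illustrative assumptions and are not taken from any
existing benchmark. The idea is to build the same file with ocamlopt for
x86-32 and for x86-64 (where ocamlopt emits SSE2 for floats) and compare the
timings.

```ocaml
(* Hypothetical micro-benchmark sketch: a scalar floating-point kernel
   timed with Sys.time. All names and sizes here are assumptions chosen
   for illustration only. *)

(* Dot product over two float arrays of equal length: a tight loop of
   scalar multiply-add operations on floats. *)
let dot (a : float array) (b : float array) : float =
  let acc = ref 0.0 in
  for i = 0 to Array.length a - 1 do
    acc := !acc +. a.(i) *. b.(i)
  done;
  !acc

(* Call [f] [reps] times (at least once); return the last result together
   with the elapsed processor time in seconds. *)
let time_it (reps : int) (f : unit -> 'a) : 'a * float =
  let t0 = Sys.time () in
  let r = ref (f ()) in
  for _i = 2 to reps do
    r := f ()
  done;
  (!r, Sys.time () -. t0)

let () =
  let n = 100_000 in
  let a = Array.init n (fun i -> float_of_int (i mod 7)) in
  let b = Array.make n 0.5 in
  let result, secs = time_it 200 (fun () -> dot a b) in
  Printf.printf "dot = %.1f, cpu time = %.3f s\n" result secs
```

A single kernel like this says little on its own. Any conclusion would need
several kernels, repeated runs, and the same machine for both builds.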

-- 
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e