Version française
Home     About     Download     Resources     Contact us    
Browse thread
Ocamlopt x86-32 and SSE2
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: -- (:)
From: Pascal Cuoq <Pascal.Cuoq@c...>
Subject: Ocamlopt x86-32 and SSE2
Here's an idea, I don't know if it is relevant, but it looks that
it could be a good compromise (option 2.5, if you will): how about
implementing floating-point operations as function calls
(the functions could be written in C and be part of the runtime library)
when the SSE2 instructions are not available? Is that simpler than
option 3?

Matteo Frigo <athena@fftw.org> wrote:
> Do you guys have any sort of empirical evidence that scalar SSE2  
> math is
> faster than plain old x87?

It's not speed I am after personally, but a correct implementation
of IEEE 754's round-to-nearest mode for doubles.
Also, the satisfying knowledge that the code of the compiler I use
is as tight is it can be and that I could understand it if I had to
some day.

Jon Harrop <jon@ffconsultancy.com> wrote:
> Note that you can use the same argument to justify not optimizing  
> the x86
> backend because power users should be using the (much more  
> performant) x64
> code gen.

I don't know where you get "much more performant" from.
For what I do, speed of floating-point operations is irrelevant, but
not the speed of the whole application. The whole application is
slightly slower (~10%) with the larger data words despite the improved
instruction set. Plus, memory is also a concern, and for users who
have less than 6GiB of memory, there are actually more addressable
data words in x86 mode.

Pascal