Ocamlopt x86-32 and SSE2
Date: | 2009-05-11 (07:11) |
From: | Pascal Cuoq <Pascal.Cuoq@c...> |
Subject: | Ocamlopt x86-32 and SSE2 |
Here's an idea. I don't know if it is relevant, but it looks like it could be a good compromise (option 2.5, if you will): how about implementing floating-point operations as function calls (the functions could be written in C and be part of the runtime library) when the SSE2 instructions are not available? Is that simpler than option 3?

Matteo Frigo <athena@fftw.org> wrote:
> Do you guys have any sort of empirical evidence that scalar SSE2
> math is faster than plain old x87?

It's not speed I am after personally, but a correct implementation of IEEE 754's round-to-nearest mode for doubles. Also, the satisfying knowledge that the code of the compiler I use is as tight as it can be and that I could understand it if I had to some day.

Jon Harrop <jon@ffconsultancy.com> wrote:
> Note that you can use the same argument to justify not optimizing
> the x86 backend because power users should be using the (much more
> performant) x64 code gen.

I don't know where you get "much more performant" from. For what I do, speed of floating-point operations is irrelevant, but not the speed of the whole application. The whole application is slightly slower (~10%) with the larger data words despite the improved instruction set. Plus, memory is also a concern, and for users who have less than 6GiB of memory, there are actually more addressable data words in x86 mode.

Pascal
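The 6GiB figure in the last paragraph can be checked with a little arithmetic. The sketch below is only an illustration and rests on an assumption the message does not state: that a 32-bit process can address roughly 3GiB of user space. The function names are hypothetical and are not part of any OCaml runtime.

```c
#include <stdint.h>

#define GIB (UINT64_C(1) << 30)

/* Number of machine words that fit in `bytes` of addressable memory,
   given the word size in bytes (4 on x86-32, 8 on x86-64). */
static uint64_t addressable_words(uint64_t bytes, uint64_t word_size)
{
    return bytes / word_size;
}

/* ASSUMPTION (not from the message): a 32-bit process gets about
   3 GiB of usable address space, whatever the installed RAM. */
static uint64_t words_x86_32(void)
{
    return addressable_words(3 * GIB, 4);
}

/* ASSUMPTION: in 64-bit mode the practical limit is the installed
   memory, given here in GiB. */
static uint64_t words_x86_64(uint64_t ram_gib)
{
    return addressable_words(ram_gib * GIB, 8);
}
```

Under these assumptions, `words_x86_32()` and `words_x86_64(6)` return the same count, and any machine with less than 6GiB holds fewer 8-byte words in 64-bit mode than a 32-bit process can address with 4-byte words.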