[Caml-list] speed
Date: 2003-01-18 (23:40)
From: Shawn Wagner <shawnw@s...>
Subject: Re: Coyote Gulch test in Caml (was Re: [Caml-list] speed )
On Sat, Jan 18, 2003 at 05:49:47PM -0500, Oleg wrote:
> On Saturday 04 January 2003 01:31 pm, Xavier Leroy wrote:
> > Apparently, the ocamlopt-generated code
> > offers less instruction-level parallelism than the g++-generated code
> > for the float computations.  Still, I haven't really understood where
> > the factor of 2 comes from.  
> It's been a couple of weeks. I'm wondering if you got any new insights into 
> this?
> Just as wild guess: the code contains calls to "sin" and "cos" on the same 
> value. Perhaps GCC manages to optimize those into one call to "sincos"

It doesn't. I tried making a C++ version that does when I was fooling around
with it, and it didn't really help. The single greatest speed increase I got
(something like cutting the runtime in half) came from -ffast-math, which cuts
out the trig function calls in favor of direct use of the proper x86
instructions. But the inlined sincos (__sincos()) in glibc caused a segfault
on my Athlon when I tried using it. :P
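
For anyone who wants to repeat the experiment, here is a minimal sketch of
the kind of change meant by "a C++ version that does". It is not the code
from the benchmark: the SinCos struct and the three function names are made
up for illustration, and it assumes glibc's sincos() extension is visible
(g++ normally defines _GNU_SOURCE on glibc systems); elsewhere it falls back
to two separate calls.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper type for this sketch; not taken from the benchmark.
struct SinCos {
    double s;  // sine of the argument
    double c;  // cosine of the argument
};

// Baseline: two independent libm calls on the same value.
inline SinCos sin_cos_separate(double x) {
    SinCos r;
    r.s = std::sin(x);
    r.c = std::cos(x);
    return r;
}

// One combined call where glibc's sincos() extension is available;
// on other platforms this simply reuses the two-call baseline.
inline SinCos sin_cos_combined(double x) {
#if defined(__GLIBC__) && defined(_GNU_SOURCE)
    SinCos r;
    ::sincos(x, &r.s, &r.c);
    return r;
#else
    return sin_cos_separate(x);
#endif
}

// Sanity check that both variants agree to within a small tolerance.
inline void check_agreement(double x) {
    const SinCos a = sin_cos_separate(x);
    const SinCos b = sin_cos_combined(x);
    assert(std::fabs(a.s - b.s) < 1e-12);
    assert(std::fabs(a.c - b.c) < 1e-12);
}
```

Calling check_agreement(0.5) before timing anything confirms the two
variants return the same values, so any difference in runtime comes from how
the calls are made and not from what they compute.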

Something like gcc's -ffast-math for ocamlopt would be nice, but improving
the scheduler so that it can target code for specific CPUs, as gcc does with
good results, is probably of more general use. Targeting Athlon instead of
i386 cut the almabench time by 30 seconds, for example.

I don't know anything about the code-generation bits of ocamlopt, though, so
I have no idea how big of a project that would be.

Shawn Wagner