ocaml doesn't need to optimize on amd64??
Date: -- (:)
From: Jon Harrop <jon@f...>
Subject: Re: [Caml-list] ocaml doesn't need to optimize on amd64??
On Wednesday 09 January 2008 17:14:52 Jon Harrop wrote:
> On Wednesday 09 January 2008 14:22:00 Kuba Ober wrote:
> > Jon & al,
> >
> > why do you think that OCaml doesn't need to do certain
> > optimizations on amd64?
>
> OCaml does a (much) better job of code generation on AMD64.

I was in a bit of a rush when I wrote that and I'd like to explain what I 
meant in more detail.

OCaml's AMD64 code gen is so good that I have never heard of a reason to drop 
to C for high performance code. That is simply no longer an issue, so we are 
free to stay in the land of safety and high-level concepts which is 
enormously valuable in practice because it saves so much developer time.

In particular, this is more important than having a compiler that implements 
the high-level optimizations we discussed because such a compiler can never 
match the performance of C if its AMD64 code gen is not as good as OCaml's.

This is one of the reasons why I continue to choose OCaml over SML/NJ, MLton, 
GHC6.8, SBCL, Bigloo and almost all other FPL compilers: they have worse 
AMD64 code gens and so impose a significantly lower speed limit upon their 
users.

If you look at the ray tracer benchmark:

  http://www.ffconsultancy.com/languages/ray_tracer/results.html

Three of the five OCaml programs are faster than any of the SML programs 
compiled with MLton, even though MLton implements lots of high-level 
optimizations that OCaml does not. You can draw two crucial conclusions from 
this:

1. Even though OCaml lacks some high-level optimizations, you don't have to 
put much effort in before OCaml beats MLton-compiled SML because OCaml's 
AMD64 code gen is so good. In particular, two of those three OCaml 
implementations don't even bother implementing any of the low-level 
optimizations that we've discussed at all!

2. OCaml lets you choose how much you want to optimize your code right up to 
the performance of C but MLton imposes quite a low speed limit: 55% slower 
than the front runners. Once you hit that limit with MLton your only option 
is to drop to C (or OCaml ;-). However, dropping into another language 
imposes its own performance hits and is even likely to undermine the 
compiler's optimization efforts.

So I agree that it would be very nice if OCaml implemented more optimizations 
along these lines, but I would choose OCaml with its excellent code gen over 
a more heavily optimizing compiler with a weaker code gen every day of the 
week and twice on Sundays.

AFAIK, these high-level optimizations are never likely to get implemented in 
OCaml. My first vote would actually go to a different (and more fundamentally 
important) optimization anyway: arbitrary unboxing. One of the nice things 
about F# and GHC is that you can specify in your type declaration that the 
type is to be stored unboxed. This can have profound performance 
implications.
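
To make the boxing point concrete, here is a minimal sketch of the usual 
manual workaround in OCaml, assuming the standard library's Complex.t record. 
A Complex.t array holds pointers to separately allocated records, whereas a 
float array is stored flat, so the components can be interleaved by hand. The 
function names are made up for illustration, and the sketch makes no claim 
about timings:

```ocaml
(* Boxed layout: each array slot points to its own heap-allocated
   {re; im} record. *)
let boxed_of_fun (n : int) (f : int -> Complex.t) : Complex.t array =
  Array.init n f

(* Manually unboxed layout: re and im interleaved in one flat float array,
   element i living at indices 2*i and 2*i+1. *)
let unboxed_of_boxed (a : Complex.t array) : float array =
  let flat = Array.make (2 * Array.length a) 0.0 in
  Array.iteri
    (fun i z ->
      flat.(2 * i) <- z.Complex.re;
      flat.(2 * i + 1) <- z.Complex.im)
    a;
  flat

(* Sum of squared magnitudes over the boxed representation. *)
let energy_boxed (a : Complex.t array) : float =
  Array.fold_left
    (fun acc z ->
      acc +. z.Complex.re *. z.Complex.re +. z.Complex.im *. z.Complex.im)
    0.0 a

(* The same sum over the flat representation: no pointer chasing. *)
let energy_unboxed (flat : float array) : float =
  let acc = ref 0.0 in
  for i = 0 to Array.length flat / 2 - 1 do
    let re = flat.(2 * i) and im = flat.(2 * i + 1) in
    acc := !acc +. re *. re +. im *. im
  done;
  !acc
```

The cost of doing this by hand is that the index arithmetic leaks into every 
function that touches the data, which is exactly what a declared unboxed type 
would hide.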

Even in standalone code, unboxing arrays of complex numbers (which OCaml does 
not do) makes FFTs 5x faster. In the context of FFIs, the performance 
improvement can be much bigger because you can completely avoid the cost of 
copying huge quantities of data (e.g. color/texcoord/normal/position struct 
arrays in vertex buffer objects for OpenGL). In contrast, the overhead of 
lambda abstraction in numerical code is "only" a factor of 2.
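
For readers unsure what "lambda abstraction in numerical code" refers to, the 
following sketch contrasts a generic integrator that takes the integrand as a 
function argument with a copy specialised by hand for x*x. Both functions are 
hypothetical examples, not code from the benchmark, and the sketch does not 
measure anything; in the generic version ocamlopt typically has to make an 
indirect call and box the returned float on every iteration, which is the 
kind of overhead meant here:

```ocaml
(* Generic midpoint-rule integrator: the integrand f is an arbitrary
   function value, called once per step. *)
let integrate (f : float -> float) (a : float) (b : float) (n : int) : float =
  let h = (b -. a) /. float_of_int n in
  let acc = ref 0.0 in
  for i = 0 to n - 1 do
    let x = a +. (float_of_int i +. 0.5) *. h in
    acc := !acc +. f x
  done;
  !acc *. h

(* The same loop with the integrand x*x written inline, so there is no
   function call inside the loop body. *)
let integrate_square (a : float) (b : float) (n : int) : float =
  let h = (b -. a) /. float_of_int n in
  let acc = ref 0.0 in
  for i = 0 to n - 1 do
    let x = a +. (float_of_int i +. 0.5) *. h in
    acc := !acc +. x *. x
  done;
  !acc *. h
```

Both compute the same approximation, e.g. integrate (fun x -> x *. x) 0.0 1.0 
1000 and integrate_square 0.0 1.0 1000 both approximate 1/3.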

PS: Kuba, your C code will most likely run significantly faster in 64-bit as 
well.
-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e