Version française
Home     About     Download     Resources     Contact us    
Browse thread
ocaml doesn't need to optimize on amd64??
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: -- (:)
From: Peng Zang <peng.zang@g...>
Subject: Re: [Caml-list] ocaml doesn't need to optimize on amd64??
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hey Jon,

I was wondering if you've had any experience with OCaml's speed and quality of 
code generate on Intel 64 platforms.  Eg. a Core 2 Duo or one of the more 
recent Xeons.  Thanks,

Peng

On Thursday 10 January 2008 05:56:42 pm Jon Harrop wrote:
> On Wednesday 09 January 2008 17:14:52 Jon Harrop wrote:
> > On Wednesday 09 January 2008 14:22:00 Kuba Ober wrote:
> > > Jon & al,
> > >
> > > why do you think that OCaml doesn't need to do certain
> > > optimizations on amd64?
> >
> > OCaml does a (much) better job of code generation on AMD64.
>
> I was in a bit of a rush when I wrote that and I'd like to explain what I
> meant in more detail.
>
> OCaml's AMD64 code gen is so good that I have never heard of a reason to
> drop to C for high performance code. That is simply no longer an issue, so
> we are free to stay in the land of safety and high-level concepts which is
> enormously valuable in practice because it saves so much developer time.
>
> In particular, this is more important than having a compiler that
> implements the high-level optimizations we discussed because such a
> compiler can never match the performance of C if its AMD64 code gen is not
> as good as OCaml's.
>
> This is one of the reasons why I continue to choose OCaml over SML/NJ,
> MLton, GHC6.8, SBCL, Bigloo and almost all other FPL compilers: they have
> worse AMD64 code gens and imposes a significantly lower speed limit upon
> their users.
>
> If you look at the ray tracer benchmark:
>
>   http://www.ffconsultancy.com/languages/ray_tracer/results.html
>
> Three of the five OCaml programs are faster than any SML compiled with
> MLton even though MLton implements lots of high-level optimizations that
> OCaml does not. You can draw two crucial conclusions from this:
>
> 1. Even though OCaml lacks some high-level optimizations, you don't have to
> put much effort in before OCaml beats MLton-compiled SML because OCaml's
> AMD64 code gen is so good. In particular, two of those three OCaml
> implementations don't even bother implementing any of the low-level
> optimizations that we've discussed at all!
>
> 2. OCaml lets you choose how much you want to optimize your code right up
> to the performance of C but MLton imposes quite a low speed limit: 55%
> slower than the front runners. Once you hit that limit with MLton your only
> option is to drop to C (or OCaml ;-). However, dropping into another
> language imposes its own performance hits and is even likely to undermine
> the compiler's optimization efforts.
>
> So I agree that it would be very nice if OCaml implemented more
> optimizations along these lines but I choose OCaml with its excellent code
> gen over a more optimizing compiler that didn't have such a good code gen
> every day of the week and twice on Sundays.
>
> AFAIK, these high-level optimizations are never likely to get implemented
> in OCaml. My first vote would actually go to a different (and more
> fundamentally important) optimization anyway: arbitrary unboxing. One of
> the nice things about F# and GHC is that you can specify in your type
> declaration that the type is to be stored unboxed. This can have profound
> performance
> implications.
>
> Even in standalone code, unboxing arrays of complex numbers (which OCaml
> does not do) makes FFTs 5x faster. In the context of FFIs, the performance
> improvement can be much bigger because you can completely avoid the cost of
> copying huge quantities of data (e.g. color/texcoord/normal/position struct
> arrays in vertex buffer objects for OpenGL). In contrast, the overhead of
> lambda abstraction in numerical code is "only" a factor of 2.
>
> PS: Kuba, your C code will most likely run a significantly faster in 64-bit
> as well.


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.7 (GNU/Linux)

iD8DBQFHhrtgfIRcEFL/JewRAqgFAKCm1Px3uMP8nEq5lrIp/vHdbIvSsgCgiDSJ
NpRFs4RKR+W5b0S0n9SIkAo=
=Y81S
-----END PGP SIGNATURE-----