From: Jon Harrop <jon@f...>
Subject: Re: [Caml-list] Odd performance result with HLVM
On Wednesday 04 March 2009 16:17:55 Mikkel Fahnøe Jørgensen wrote:
> When looking at the benchmark game and other benchmarks I have seen, I
> noticed that Haskell is almost as fast as OCaml and sometimes faster.
> Some Lisp implementations are also pretty fast.

In my ray tracer language comparison, my OCaml code is ~50% faster than the Haskell 
written by Lennart Augustsson and the Lisp written by Juho Snellman, both of whom 
have extensive experience writing optimizing compilers for those languages, 
whereas I did not:

  http://www.ffconsultancy.com/languages/ray_tracer/results.html

Moreover, I received dozens of implementations in Haskell and Lisp, and these 
were the only vaguely competitive ones: most programmers are unable to 
write fast code in Haskell or Lisp, primarily because their performance is so 
wildly unpredictable.

The Burrows-Wheeler block-sorting data compression algorithm, implemented in 
Haskell and discussed extensively for weeks in the context of performance, is 
a good example of this: they never got within a factor of 10,000 of the 
performance of C. There are many other examples where nobody was able to get 
within orders of magnitude of the performance of common languages.
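For reference, the transform itself is simple to state; a naive OCaml sketch (my own illustration, not any of the implementations from that thread) just sorts all rotations of the input and takes the last column. Making this run at C-like speed on large blocks is the hard part that was being discussed:

```ocaml
(* Naive Burrows-Wheeler transform: build every rotation of the input,
   sort them lexicographically and read off the last column.  This is
   O(n^2 log n) in time and O(n^2) in space -- fine for illustration,
   hopeless for real block-sorting compression. *)
let bwt s =
  let n = String.length s in
  (* Rotation starting at index i: s.[i..] followed by s.[..i-1]. *)
  let rot i = String.sub s i (n - i) ^ String.sub s 0 i in
  let rots = Array.init n rot in
  Array.sort compare rots;
  (* Last character of each sorted rotation. *)
  String.init n (fun i -> rots.(i).[n - 1])
```

For example, `bwt "banana"` gives `"nnbaaa"` (without the end-of-string sentinel a real compressor would add).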

GHC does have rudimentary support for parallelism, and that makes it much 
easier to leverage 2-6 cores in Haskell than in OCaml. However, that is 
merely a deficiency in the current OCaml implementation and is something that 
can be addressed. Moreover, the current Haskell implementation scales very 
poorly and is easily maxed out even on today's 8-core computers. For example, 
on a recent Mandelbrot benchmark from comp.lang.functional, the Haskell is 
faster for 1-6 cores but stops seeing improvements beyond that, whereas OCaml 
with process forking continues to see improvements up to all 8 cores and is 
consequently faster overall on this machine.
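The fork-based approach looks roughly like the following sketch: split the image into horizontal bands, fork one worker per band, and have each worker stream its iteration counts back to the parent through a pipe. The dimensions, the `mandel` helper and the pipe protocol here are my own illustration, not the benchmark code from the thread:

```ocaml
(* Parallel Mandelbrot via Unix.fork: one child process per band of
   rows, results streamed back to the parent over pipes.  Assumes
   [height] is divisible by the worker count. *)

let width = 256 and height = 256
let max_iter = 50

(* Escape-time iteration count for one pixel. *)
let mandel x y =
  let cr = 2.0 *. float x /. float width -. 1.5 in
  let ci = 2.0 *. float y /. float height -. 1.0 in
  let rec go zr zi i =
    if i >= max_iter then max_iter
    else if zr *. zr +. zi *. zi > 4.0 then i
    else go (zr *. zr -. zi *. zi +. cr) (2.0 *. zr *. zi +. ci) (i + 1)
  in
  go 0.0 0.0 0

let parallel_rows n =
  let band = height / n in
  (* Fork [n] workers; each computes its band and writes the counts
     into its pipe, then exits. *)
  let pipes =
    Array.init n (fun k ->
      let rd, wr = Unix.pipe () in
      match Unix.fork () with
      | 0 ->
        Unix.close rd;
        let oc = Unix.out_channel_of_descr wr in
        for y = k * band to (k + 1) * band - 1 do
          for x = 0 to width - 1 do
            output_binary_int oc (mandel x y)
          done
        done;
        close_out oc;
        exit 0
      | _pid -> Unix.close wr; rd)
  in
  (* Parent: drain each pipe in band order into the image, then reap. *)
  let image = Array.make_matrix height width 0 in
  Array.iteri (fun k rd ->
    let ic = Unix.in_channel_of_descr rd in
    for y = k * band to (k + 1) * band - 1 do
      for x = 0 to width - 1 do
        image.(y).(x) <- input_binary_int ic
      done
    done;
    close_in ic) pipes;
  for _ = 1 to n do ignore (Unix.wait ()) done;
  image
```

Because each child gets its own copy-on-write heap, the workers never contend on the GC, which is exactly why this style keeps scaling where the shared-heap Haskell runtime stalls.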

Although efficient concurrent garbage collectors are hard to write, parallel 
ones like the one in GHC are comparatively easy to write and still very 
useful.

> However, when you look at memory consumption OCaml uses considerably
> less memory, except for languages in the C family.
>
> I suspect that many real world performance scenarios, such as heavily
> loaded web servers and complex simulations, depend very much on memory
> consumption. This is both because of GC overhead and because of the
> slower memory pipeline the more cache levels are involved.
>
> So in case of a new JIT solution for OCaml, I believe it is important
> to observe this aspect as well.

OCaml's memory efficiency is certainly extremely good, and it may be 
theoretically possible to preserve it in a new implementation that supports 
parallelism. That is absolutely not the goal of my work, though: I intend 
only to get the simplest possible parallel GC working, because I am interested 
primarily in high-performance numerics, string processing and visualization, 
not web servers.

However, I will endeavour to make the implementation as extensible as possible 
so that other people can create drop-in replacements that provide this kind 
of functionality. Improving upon my GC should be quite easy for anyone versed 
in the subject. Interestingly, my GC is written entirely in my own 
intermediate representation.

-- 
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e