Re: HLVM ray tracer performance
Date: 2010-01-11 (00:47)
From: Jeff Shaw <shawjef3@m...>
Subject: Re: [Caml-list] Re: HLVM ray tracer performance

> Are you running x64 or on Intel hardware? What results do you get for 12, 13
> or 14 instead of 9?
I am running an AMD Phenom 9950, but my Ubuntu install is only 
32-bit. I tried 5/ray.hs with level=12 instead of 9, but it ran into a 
stack overflow problem. When I increased the stack size it completed but 
it also took more time than 1/ray.hs, which required no stack size 
increase. I made sure that the other arguments I fed it were the same. I 
think there is some problem that needs to be worked out in 5/ray.hs, or 
maybe the problem is in ghc; I'm not sure. Below, ./ray5 is 5/ray.hs 
and ./ray is 1/ray.hs:

jeff@ubuntu:~/Desktop$ time ./ray 12 512 > /dev/null

real    0m21.479s
user    0m21.093s
sys    0m0.180s
jeff@ubuntu:~/Desktop$ time ./ray5 12 512 +RTS -K2000000000 > /dev/null

real    0m28.366s
user    0m25.674s
sys    0m2.608s
jeff@ubuntu:~/Desktop$ time ./ray 14 512 > /dev/null

real    0m23.544s
user    0m23.021s
sys    0m0.500s
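
For scale, the level=12 "real" times above can be turned into a ratio. This is only a minimal sketch: it assumes awk is available, and `ratio` is a hypothetical helper name, not part of either program.

```shell
# Hypothetical helper: divide one wall-clock time (in seconds) by another
# and print the result to two decimal places.
ratio() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# "real" times from the level=12 runs above:
# 0m28.366s for ./ray5 (5/ray.hs) and 0m21.479s for ./ray (1/ray.hs).
ratio 28.366 21.479
```

On those numbers, ./ray5 took roughly 1.3 times as long as ./ray at level=12, even with the larger stack.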

I also tried level=14, but both 5/ray.ml and 5/ray.hs ran out of memory.

I considered that maybe I had saved the files from your website wrong, 
or mixed them up during compilation. So I ran the timer again with 
level=9 and level=12 and got all the same results. That is, level=9 is 
faster on 5/ray.hs but level=12 is faster with 1/ray.hs. So I don't 
think I'm making a simple clerical error.

It seems that 5/ray.ml and 5/ray.hs aren't quite equivalent in some 
important way since 1/ray.ml is faster than 5/ray.ml for both level=9 
and level=12. Whether it's a code problem or compiler problem, I cannot say.

The stack size problem does not go away when I remove all the extra 
optimization arguments to ghc.