From: Mike Lin <mikelin@m...>
Subject: Re: [Caml-list] thousands of CPU cores
On Mon, Jul 14, 2008 at 8:08 AM, Jon Harrop <jon@ffconsultancy.com> wrote:

> Perhaps the parallel GC could enable support for things like OpenMP but I
> personally would rather see a shift to similar functionality to that of
> Microsoft's TPL because (I assume) it is better for parallel tree
> operations that are themselves more common in languages like OCaml.


OpenMP is really great for parallelizing tight loops in numerical code,
which is one scenario in which I'd agree shared memory is much better than
message passing, at least as far as it scales. I wish I had this for my
OCaml CRF and M^3 network code!

But for higher-level, map/reduce-style work, I really think message
passing tends to get you there. In such applications I am usually
interested in distributing across a compute farm anyway, for both CPU and
memory requirements. I started with a lame home-rolled fork+Marshal library,
then moved on to Gerd's RPC stuff, and now I'm finally playing with ocamlp3l...
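
To make the fork+Marshal approach concrete, here is a minimal sketch of a process-based map. It is an illustration only, not the library mentioned above: the name parallel_map is hypothetical, it assumes the unix library is linked and OCaml 4.12 or later (for Unix._exit), and it naively forks one child per list element.

```ocaml
(* Sketch of a fork+Marshal parallel map (hypothetical helper, needs the
   unix library). Each element is processed in its own child process and
   the result is sent back to the parent over a pipe. *)
let parallel_map (f : 'a -> 'b) (xs : 'a list) : 'b list =
  let spawn x =
    let (rd, wr) = Unix.pipe () in
    match Unix.fork () with
    | 0 ->
      (* Child: compute, marshal the result into the pipe, then exit
         without running the parent's at_exit handlers. *)
      Unix.close rd;
      let status =
        try
          let oc = Unix.out_channel_of_descr wr in
          Marshal.to_channel oc (f x) [];
          close_out oc;
          0
        with _ -> 1
      in
      Unix._exit status
    | pid ->
      (* Parent: keep only the read end of the pipe. *)
      Unix.close wr;
      (pid, Unix.in_channel_of_descr rd)
  in
  (* Fork all children first so they run concurrently. *)
  let children = List.map spawn xs in
  let collect (pid, ic) =
    (* Raises End_of_file if the child failed before writing a result. *)
    let (y : 'b) = Marshal.from_channel ic in
    close_in ic;
    ignore (Unix.waitpid [] pid);
    y
  in
  List.map collect children
```

For example, parallel_map (fun n -> n * n) [1; 2; 3; 4] evaluates to [1; 4; 9; 16]. Results must be marshalable (no closures with the default flags), and a realistic version would split the input into one chunk per core rather than one process per element.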

Incidentally, it occurs to me that when one is optimizing the kind of tight
numerical loops that can really benefit from shared memory, the FIRST step,
before parallelizing, is to do away with any heap allocations in the loop.
The following is not a serious proposal, just an idea to kick around:
how feasible would it be to release the global runtime lock (OCaml's master
lock) for segments of code that perform no heap allocations? That is, what
besides the GC is stopping us?
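
As a small illustration of what "doing away with heap allocations in the loop" can mean, here is a sketch that contrasts two dot-product implementations. The function names are hypothetical, and the claim that the second loop does not allocate assumes native compilation with ocamlopt, which normally keeps a local float ref unboxed; bytecode behaves differently.

```ocaml
(* Allocating style: Array.map2 builds an intermediate array of products
   on the heap before the fold sums it. *)
let dot_alloc (a : float array) (b : float array) : float =
  Array.fold_left ( +. ) 0.0 (Array.map2 ( *. ) a b)

(* Loop style: no intermediate array. Under ocamlopt the accumulator is
   normally kept unboxed, so the loop body itself should not allocate. *)
let dot_loop (a : float array) (b : float array) : float =
  if Array.length a <> Array.length b then invalid_arg "dot_loop";
  let acc = ref 0.0 in
  for i = 0 to Array.length a - 1 do
    acc := !acc +. a.(i) *. b.(i)
  done;
  !acc
```

Both functions return 32.0 for [|1.; 2.; 3.|] and [|4.; 5.; 6.|]. Only a loop of the second kind would be a candidate for the idea above, since it never needs to enter the allocator or the GC.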

Mike