From: Xavier Leroy <Xavier.Leroy@i...>
Subject: Re: [Caml-list] thousands of CPU cores
J C wrote:
> I know that the Caml team wanted to see if many-core shared-memory systems
> were going to stick around before bothering with Caml development that
> takes advantage of them.
> Well, it looks like they are here to stay, after all:
> http://news.cnet.com/8301-13924_3-9981760-64.html

As others mentioned already, nothing in this news item talks about
shared memory parallelism.  There are good reasons to think that the
illusion of shared memory cannot be maintained in the presence of
hundreds of computing elements, even using cc-NUMA techniques
(i.e. hardware emulation of shared memory on top of high-speed
point-to-point links).  Look at GPUs, which are the closest thing we
have today to a manycore system: 128 cores are already available, with
more in preparation, but the programming model is definitely not SMP.

I had the opportunity to discuss this with Anwar Ghuloum, the Intel
principal engineer quoted in the Cnet article (and the driving force
behind Intel's joining the Caml consortium, by the way).  His group is
definitely not committed to shared-memory approaches, but instead
investigates high-level data-parallel programming models with a strong
functional flavor that can be mapped both to shared memory and
non-shared memory systems.
(http://techresearch.intel.com/articles/Tera-Scale/1514.htm)
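
To give a flavor of what such a data-parallel, functional-style model
looks like, here is a toy sketch in plain OCaml (the dot-product
example and the names are mine, not Intel's).  The per-element work is
a pure function, so a data-parallel runtime would be free to split the
array across cores, or even across machines, behind the same interface
as the ordinary sequential Array functions used here:

  let dot_product v1 v2 =
    (* Pure per-element work: can be evaluated in any order, on any core. *)
    let products = Array.mapi (fun i x -> x *. v2.(i)) v1 in
    (* The reduction could likewise be split and recombined by a
       parallel runtime (modulo floating-point reassociation). *)
    Array.fold_left (+.) 0.0 products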

So, I am as convinced as ever that shared mutable data is a terrible
programming model for parallelism, because 1- it's awfully low-level,
error-prone and non-compositional, and 2- interesting parallel
hardware doesn't implement it anyway.  (Examples: clusters, GPUs, the
Cell processor.)  At best, shared memory is a low-level implementation
technique which, like manual memory management, pointer arithmetic and
self-modifying code, is best hidden in the OS and language runtime
system, but should never be exposed to the programmer.

The interesting question that this community should focus on
(rather than throwing fits about concurrent GC and the like) is coming
up with good programming models for parallelism.  I'm quite fond of
message passing myself, but agree that more constrained data-parallel
models have value as well.  As Gerd Stolpmann mentioned, various forms
of message passing can be exploited from OCaml today, but there is
certainly room for improvement there.
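
For the record, here is the kind of thing that already works with
nothing but the Unix and Marshal modules from the standard
distribution.  This is only a hand-rolled sketch (the worker function
and the one-shot protocol are invented for the example); OCamlMPI and
other message-passing libraries provide far more comfortable
interfaces:

  (* Compile with: ocamlopt unix.cmxa worker.ml -o worker *)

  let square x = x * x

  let () =
    let (r_req, w_req) = Unix.pipe ()       (* parent -> child *)
    and (r_ans, w_ans) = Unix.pipe () in    (* child -> parent *)
    match Unix.fork () with
    | 0 ->
        (* Child: receive a request, compute, send the answer back. *)
        Unix.close w_req; Unix.close r_ans;
        let ic = Unix.in_channel_of_descr r_req
        and oc = Unix.out_channel_of_descr w_ans in
        let (n : int) = Marshal.from_channel ic in
        Marshal.to_channel oc (square n) [];
        flush oc;
        exit 0
    | _pid ->
        (* Parent: send a request, wait for the answer, reap the child. *)
        Unix.close r_req; Unix.close w_ans;
        let oc = Unix.out_channel_of_descr w_req
        and ic = Unix.in_channel_of_descr r_ans in
        Marshal.to_channel oc 21 [];
        flush oc;
        let (result : int) = Marshal.from_channel ic in
        Printf.printf "the child computed %d\n" result;
        ignore (Unix.wait ())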

Jon Harrop wrote:
> Today's biggest shared-memory supercomputers already have thousands
> of cores.

??? All these supercomputers are clusters.

Sylvain Le Gall wrote:

> There is also the fact that using multiple processes allows you to go beyond
> the memory limit per process (3 GB for Linux / 1 GB for Windows).  With
> the current increase in the amount of RAM, this can be an issue.

The correct solution to this issue isn't to artificially split your
program into multiple processes, but to move to 64-bit architectures.
You would be hard pressed to buy a desktop PC today that isn't 64-bit
capable.  Linux x86-64 works beautifully.
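
If in doubt, a running OCaml will tell you what it was built for; the
following two-line check uses nothing beyond Sys.word_size and
Sys.max_array_length from the standard library:

  let () =
    (* Prints 64 on a 64-bit build, 32 on a 32-bit one. *)
    Printf.printf "word size: %d bits\n" Sys.word_size;
    (* A few million elements on 32-bit builds, vastly more on 64-bit. *)
    Printf.printf "max array length: %d\n" Sys.max_array_length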

- Xavier Leroy