SMP multithreading
Date: 2010-11-16 (06:46)
From: Edgar Friendly <thelema314@g...>
Subject: Re: [Caml-list] SMP multithreading

On 11/15/2010 09:27 AM, Wolfgang Draxinger wrote:
> Hi,
>
> I've just read
> http://caml.inria.fr/pub/ml-archives/caml-list/2002/11/64c14acb90cb14bedb2cacb73338fb15.en.html
> in particular this paragraph:
> | What about hyperthreading? Well, I believe it's the last convulsive
> | movement of SMP's corpse :-) We'll see how it goes market-wise. At
> | any rate, the speedups announced for hyperthreading in the Pentium 4
> | are below a factor of 1.5; probably not enough to offset the overhead
> | of making the OCaml runtime system thread-safe.
>
> This reads just like the "640k ought be enough for everyone". Multicore
> systems are the standard today. Even the cheapest consumer machines
> come with at least two cores. One can easily get 6 core machines today.
>
> Still thinking SMP was a niche and was dying?
>
> So, what're the developments regarding SMP multithreading OCaml?
>
>
> Cheers
>
> Wolfgang
>

At the risk of feeding a (possibly unintentional) troll, I'd like to share some possibly new thoughts on this ever-living topic.

It looks like high-performance computing of the near future will be built out of many machines (message passing), each with many cores (SMP). One could use message passing for all communication in such a system, but a hybrid approach might be best for this architecture, with use of shared memory within each box and message passing between. Of course the best choice depends strongly on the particular task.

In the long run, it'll likely be a combination of a few large, powerful cores (Intel-CPU style w/ the capability to run a single thread as fast as possible) with many many smaller compute engines (GPGPUs or the like, optimized for power and area, closely coupled with memory) that provides the highest performance density.

The question of how to program such an architecture seems as if it's being answered without the functional community's input. What can we contribute?

E.