Re: Why OCaml sucks
Date: 2008-05-09 (19:55)
From: Jon Harrop <jon@f...>
Subject: Re: [Caml-list] Re: Why OCaml rocks
On Friday 09 May 2008 12:12:00 Gerd Stolpmann wrote:
> I think the parallelism capabilities are already excellent. We have been
> able to implement the application backend of Wink's people search in
> O'Caml, and it is of course a highly parallel system of programs. This
> is not the same class raytracers or desktop parallelism fall into - this
> is highly professional supercomputing. I'm talking about a cluster of
> ~20 computers with something like 60 CPUs.
> Of course, we did not use multithreading very much. We are relying on
> multi-processing (both "fork"ed style and separately started programs),
> and multiplexing (i.e. application-driven micro-threading). I especially
> like the latter: Doing multiplexing in O'Caml is fun, and a substitute
> for most applications of multithreading. For example, you want to query
> multiple remote servers in parallel: Very easy with multiplexing,
> whereas the multithreaded counterpart would quickly run into scalability
> problems (threads are heavy-weight, and need a lot of resources).
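
For readers unfamiliar with the multiplexing style described above, here is a minimal sketch of the idea: one thread waits on several descriptors with Unix.select and reads from whichever is ready. The "remote servers" are simulated with local pipes so that the example is self-contained; fake_server and gather are hypothetical names, not taken from Wink's code or from any library, and a real system would use sockets and handle errors such as EINTR.

```ocaml
(* Sketch only: single-threaded multiplexing with Unix.select.
   Needs the unix library. Servers are simulated by local pipes. *)

(* Hypothetical helper: a pipe whose read end yields [reply], then EOF. *)
let fake_server reply =
  let rd, wr = Unix.pipe () in
  let _ = Unix.write_substring wr reply 0 (String.length reply) in
  Unix.close wr;
  rd

(* Read from whichever descriptors are ready until all reach end of file.
   Returns the collected replies in the same order as [fds]. *)
let gather fds =
  let buf = Bytes.create 4096 in
  let acc = Hashtbl.create 8 in
  List.iter (fun fd -> Hashtbl.replace acc fd (Buffer.create 64)) fds;
  let rec loop pending =
    if pending <> [] then begin
      (* Block until at least one descriptor is readable (no timeout). *)
      let ready, _, _ = Unix.select pending [] [] (-1.0) in
      let finished =
        List.filter
          (fun fd ->
            let n = Unix.read fd buf 0 (Bytes.length buf) in
            if n = 0 then (Unix.close fd; true)
            else (Buffer.add_subbytes (Hashtbl.find acc fd) buf 0 n; false))
          ready
      in
      loop (List.filter (fun fd -> not (List.mem fd finished)) pending)
    end
  in
  loop fds;
  List.map (fun fd -> Buffer.contents (Hashtbl.find acc fd)) fds

let () =
  let fds = List.map fake_server ["alpha"; "beta"; "gamma"] in
  List.iter print_endline (gather fds)
```

No threads or locks are involved: the loop simply services whichever connection has data, which is why this style suits I/O-bound work but does nothing for CPU-bound work on several cores.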

If OCaml is good for concurrency on distributed systems, that is great, but it
is completely different to CPU-bound parallelism on multicores.

> > There are two problems with that:
> >
> > . You go back to manual memory management between parallel
> > threads/processes.
> I guess you refer to explicit references between processes. This is a
> kind of problem, and best handled by avoiding it.

Then you take a massive performance hit.

> > . Parallelism is for performance and performance requires mutable data
> > structures.
> In our case, the mutable data structures that count are on disk.
> Everything else is only temporary state.

Exactly. That is a completely different kettle of fish to writing high 
performance numerical codes for scientific computing.

> I admit that it is a challenge to structure programs in a way such that
> parallel programs not sharing memory profit from mutable state. Note
> that it is also a challenge to debug locks in a multithreaded program so
> that they run 24/7. Parallelism is not easy after all.

Parallelism is easy in F#.

> > Then you almost always end up copying data unnecessarily because you
> > cannot collect it otherwise, which increases memory consumption and
> > massively degrades performance that, in turn, completely undermines the
> > original point of parallelism.
> Ok, I understand. We are complete fools. :-)
> I think that the cost of copying data is totally overrated. We are doing
> this often, and even over the network, and hey, we are breaking every
> speed limit.

You cannot afford to pay that price for parallel implementations of most 
numerical algorithms.
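
To make the copying under discussion concrete, here is a rough sketch of fork-based parallelism in OCaml, assuming a Unix system and the unix library. Each worker's result is serialized with Marshal, sent through a pipe, and rebuilt in the parent; spawn, collect and parallel_map are illustrative names only, and the sketch makes no claim about how large the cost is for any particular workload.

```ocaml
(* Sketch only: one forked worker per input, results returned via pipes.
   Needs the unix library. All names here are hypothetical. *)

(* Run [f x] in a child process; return its pid and a channel for the result. *)
let spawn f x =
  let rd, wr = Unix.pipe () in
  match Unix.fork () with
  | 0 ->
      Unix.close rd;
      let oc = Unix.out_channel_of_descr wr in
      Marshal.to_channel oc (f x) [];   (* serialize the result in the child *)
      close_out oc;
      exit 0
  | pid ->
      Unix.close wr;
      (pid, Unix.in_channel_of_descr rd)

(* Deserialize a worker's result in the parent, then reap the child. *)
let collect (pid, ic) =
  let v = Marshal.from_channel ic in    (* rebuild a fresh copy in the parent *)
  close_in ic;
  ignore (Unix.waitpid [] pid);
  v

(* Start all workers first, then collect their results in order. *)
let parallel_map (f : 'a -> 'b) (xs : 'a list) : 'b list =
  List.map collect (List.map (spawn f) xs)

let () =
  let sum_to n =
    let s = ref 0 in
    for i = 1 to n do s := !s + i done;
    !s
  in
  (* %! flushes stdout so later forks do not inherit buffered output. *)
  List.iter (Printf.printf "%d\n%!") (parallel_map sum_to [1_000; 2_000; 3_000])
```

The inputs reach the workers for free through fork, but every result crosses the process boundary by value. For a small integer that is negligible; for a large array or matrix produced at each step of a numerical algorithm, it is the overhead being argued about here.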

> > The cost of interthread communication is then so high in
> > OCaml that you will rarely be able to obtain any performance improvement
> > for the number of cores desktop machines are going to see over the next
> > ten years, by which time OCaml will be 10-100x slower than the
> > competition.
> This is a quite theoretical statement. We will rather see that most
> application programmers will not learn parallelism at all, and that
> consumers will start to question the sense of multicores, and the chip
> industry will search for alternatives.

On the contrary, that is not a theoretical statement at all: it already 
happened. F# already makes it much easier to write high performance parallel 
algorithms and its concurrent GC is the crux of that capability.

Dr Jon D Harrop, Flying Frog Consultancy Ltd.