Re: Why OCaml sucks
| Date: | -- (:) |
| From: | Jon Harrop <jon@f...> |
| Subject: | Re: [Caml-list] Re: Why OCaml rocks |
On Friday 09 May 2008 12:12:00 Gerd Stolpmann wrote:
> I think the parallelism capabilities are already excellent. We have been
> able to implement the application backend of Wink's people search in
> O'Caml, and it is of course a highly parallel system of programs. This
> is not the same class raytracers or desktop parallelism fall into - this
> is highly professional supercomputing. I'm talking about a cluster of
> ~20 computers with something like 60 CPUs.
>
> Of course, we did not use multithreading very much. We are relying on
> multi-processing (both "fork"ed style and separately started programs),
> and multiplexing (i.e. application-driven micro-threading). I especially
> like the latter: Doing multiplexing in O'Caml is fun, and a substitute
> for most applications of multithreading. For example, you want to query
> multiple remote servers in parallel: Very easy with multiplexing,
> whereas the multithreaded counterpart would quickly run into scalability
> problems (threads are heavy-weight, and need a lot of resources).

If OCaml is good for concurrency on distributed systems that is great but it
is completely different to CPU-bound parallelism on multicores.

> > There are two problems with that:
> >
> > . You go back to manual memory management between parallel
> > threads/processes.
>
> I guess you refer to explicit references between processes. This is a
> kind of problem, and best handled by avoiding it.

Then you take a massive performance hit.

> > . Parallelism is for performance and performance requires mutable data
> > structures.
>
> In our case, the mutable data structures that count are on disk.
> Everything else is only temporary state.

Exactly. That is a completely different kettle of fish to writing high
performance numerical codes for scientific computing.

> I admit that it is a challenge to structure programs in a way such that
> parallel programs not sharing memory profit from mutable state. Note
> that it is also a challenge to debug locks in a multithreaded program so
> that they run 24/7. Parallelism is not easy after all.

Parallelism is easy in F#.

> > Then you almost always end up copying data unnecessarily because you
> > cannot collect it otherwise, which increases memory consumption and
> > massively degrades performance that, in turn, completely undermines the
> > original point of parallelism.
>
> Ok, I understand. We are complete fools. :-)
>
> I think that the cost of copying data is totally overrated. We are doing
> this often, and even over the network, and hey, we are breaking every
> speed limit.

You cannot afford to pay that price for parallel implementations of most
numerical algorithms.

> > The cost of interthread communication is then so high in
> > OCaml that you will rarely be able to obtain any performance improvement
> > for the number of cores desktop machines are going to see over the next
> > ten years, by which time OCaml will be 10-100x slower than the
> > competition.
>
> This is a quite theoretical statement. We will rather see that most
> application programmers will not learn parallelism at all, and that
> consumers will start question the sense of multicores, and the chip
> industry will search for alternatives.

On the contrary, that is not a theoretical statement at all: it already
happened. F# already makes it much easier to write high performance parallel
algorithms and its concurrent GC is the crux of that capability.

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e
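
For readers unfamiliar with the "fork"ed style of multi-processing discussed in this message, the following is a minimal sketch of what it might look like in OCaml, assuming a Unix-like system and the standard Unix library. It is not code from either correspondent: the names `spawn`, `sum_squares` and `parallel_sum_squares` are hypothetical. Each child sees the input array through the memory image inherited at fork time, but its result has to be serialised with `Marshal` and copied back through a pipe, which is the copying cost the two sides disagree about.

```ocaml
(* Sketch only: fork-based parallelism with results copied back via Marshal.
   Assumes a Unix-like OS and linking against the Unix library.
   All function names here are hypothetical illustrations. *)

(* Sequential kernel: sum of squares over the slice [lo, hi). *)
let sum_squares (a : float array) lo hi =
  let s = ref 0.0 in
  for i = lo to hi - 1 do
    s := !s +. a.(i) *. a.(i)
  done;
  !s

(* Run [f] in a forked child process. The child's result is marshalled
   (copied) to the parent through a pipe. Returns a function that blocks
   until the result is available. *)
let spawn (f : unit -> 'a) : unit -> 'a =
  flush_all ();                       (* avoid duplicated buffered output *)
  let rd, wr = Unix.pipe () in
  match Unix.fork () with
  | 0 ->
      (* Child: compute, serialise the result, and terminate. *)
      Unix.close rd;
      let oc = Unix.out_channel_of_descr wr in
      Marshal.to_channel oc (f ()) [];
      close_out oc;
      exit 0
  | pid ->
      (* Parent: keep the read end and return a waiter. *)
      Unix.close wr;
      let ic = Unix.in_channel_of_descr rd in
      fun () ->
        let v : 'a = Marshal.from_channel ic in
        close_in ic;
        ignore (Unix.waitpid [] pid);
        v

(* Split the array into [workers] chunks, one child process per chunk,
   then combine the partial sums in the parent. *)
let parallel_sum_squares ?(workers = 2) (a : float array) =
  let n = Array.length a in
  let chunk = (n + workers - 1) / workers in
  let pending =
    List.init workers (fun w ->
        let lo = min n (w * chunk) and hi = min n ((w + 1) * chunk) in
        spawn (fun () -> sum_squares a lo hi))
  in
  List.fold_left (fun acc wait -> acc +. wait ()) 0.0 pending
```

Here each child returns a single float, so the copy is negligible. If the children instead produced large arrays or other heap structures, every one of them would be marshalled and copied in full, and any mutation a child makes to the shared input stays invisible to the parent.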