Version française
Home     About     Download     Resources     Contact us    

This site is updated infrequently. For up-to-date information, please visit the new OCaml website at

Browse thread
Re: OCaml is broken
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: 2009-12-24 (15:52)
From: Jon Harrop <jon@f...>
Subject: Re: [Caml-list] Re: multicore wish
On Thursday 24 December 2009 13:19:52 Goswin von Brederlow wrote:
> Jon Harrop <> writes:
> > No, in OCaml I fork every child. That is the only transparent way to give
> > the child a coherent view of the heap but it is extremely slow (~1ms):
> So if you add a (sleep 60) to the ocaml code then ocaml gets even
> more worse than F#? You are comparing an implementation that is
> specifically optimized to use things F# is good at and ocaml is
> bad. To give a fair comparison you need to optimize the implementation 
> to the language.

Post a better OCaml solution if you can.

> >> Each process then has a work queue which is works through. So they will
> >> always use the local data. Only when a queue runs dry they steal from
> >> another process and ruin locality.
> >
> > You are correctly describing the efficient solution based upon
> > work-stealing queues that F# uses but OCaml cannot express it.
> You mean that you didn't implement that way.

No, I mean OCaml cannot express it.

> I can easily express that with closures and pre-forked worker threads.

OCaml's threads do not run in parallel so that will not work.

> >> So I don't see where your argument fits. You are not creating childs
> >> on the fly. Only at the start and they run till all the work is done.
> >> At least in this example.
> >
> > No, every recursive invocation of the parallel quicksort spawns another
> > child on-the-fly. That's precisely why it parallelizes so efficiently
> > when you have wait-free work-stealing task deques and a shared heap. In
> > general, you rewrite algorithms into this recursive divide and conquer
> > form and parallelize when possible. You can parallelize a *lot* of
> > problems efficiently that way.
> That seems awfully ineficient. Then you have a million children
> running on maybe 4 cores and each job queue holds no job.
> I think we mean different things by children. By children I mean what
> the kernel sees as children. Newly forked threads. I think you mean 
> jobs that get put into the queues. There is no reason to fork a new
> system thread every time the parallel quicksort splits an array in
> two.

The children are lightweight tasks, not threads or processes.

> >> Why would you fork in invoke?
> >
> > Fork is currently the only transparent way to implement "invoke" but it
> > is extremely slow and unreliable.
> No, it is the insane way. Use closures.

Please demonstrate.

> > parallel programs that leverage multicores and run a *lot* faster than
> > anything that can be written in OCaml. You can even make this accessible
> > to OCaml programmers as a DSL with automatic interop to make high
> > performance parallel programming as easy as possible from OCaml.
> Or just implement it properly in ocaml. The problem part is the
> GC. The rest is easy.

No kidding.

Dr Jon Harrop, Flying Frog Consultancy Ltd.