Re: Threading and SharedMem (Re: [Caml-list] Re: Is OCaml fast?)
Date: 2010-11-30 (16:07)
From: Gerd Stolpmann <info@g...>
Subject: Re: Threading and SharedMem (Re: [Caml-list] Re: Is OCaml fast?)
On Tuesday, 30.11.2010, at 16:30 +0100, Stephan Houben wrote:
> On 11/30/2010 02:22 PM, Gerd Stolpmann wrote:
> > I don't think this is the reason. Many people can ignore Windows,
> > actually.
> >
> > The problem is more that your whole program needs then to be
> > restructured - multi-processing implies a process model (which is the
> > master, which are the workers). With multi-threading you can start
> > threads at all times without having to worry about that (i.e. supports
> > "programming without design" if you want to take that as a negative
> > point).
> >
> > This is what I want to fix with my Netmulticore library - it defines a
> > framework allowing you to start new processes at any time without having
> > to worry about the process hierarchy.
> I have in fact read with much interest your blog at
> .
> Your approach there is to really have separate programs for
> server and client. However, one nice thing about fork is that you don't
> have to restructure your program; you can just call fork down somewhere
> in some subroutine where you decide it is convenient, start doing some
> multicore computation, finish and return, and the caller needs never know
> that you did that. So you can indeed program without design using fork.

Well, I would not recommend that in all cases: fork logically duplicates
the whole address space, and if the heap is large, you can end up
consuming a lot of RAM. Even worse, the GC of the forked subprocess has
to manage the entire heap, including the part that is not required for
doing the computation in the subprocess. Because the GC writes to the
heap while marking, the pages it touches get copied, so the
copy-on-write optimization of the OS gets you nothing here.

Also, there can be subtle interactions between the parent and the child,
e.g. file descriptors are inherited, so the peer of a pipe or socket
does not see it as closed until every process holding a copy has closed
it.

So, use with care, and not without design. Forking in the middle of a
bigger program can be quite disastrous.
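
To make the descriptor point concrete, here is a minimal sketch of the
"fork somewhere in a subroutine" pattern with the care it needs. The
name in_child is hypothetical (it is not a library function), and the
sketch assumes the result of f is plain data that Marshal can handle.

```ocaml
(* Needs the unix library; Unix._exit needs OCaml >= 4.12. *)

(* Run [f x] in a forked child and hand the result back to the caller.
   [in_child] is a hypothetical helper name, not part of any library. *)
let in_child (f : 'a -> 'b) (x : 'a) : 'b =
  let rd, wr = Unix.pipe () in
  (* Flush first, or buffered output would be written by both processes. *)
  flush stdout; flush stderr;
  match Unix.fork () with
  | 0 ->
    (* Child: it only writes, so drop the inherited read end. *)
    Unix.close rd;
    let status =
      try
        let oc = Unix.out_channel_of_descr wr in
        Marshal.to_channel oc (f x) [];
        close_out oc;
        0
      with _ -> 1   (* never let an exception escape into the caller's code *)
    in
    (* _exit skips the at_exit handlers inherited from the parent. *)
    Unix._exit status
  | pid ->
    (* Parent: close our copy of the write end. If it stayed open, the
       read end would never report end-of-file when the child dies. *)
    Unix.close wr;
    let ic = Unix.in_channel_of_descr rd in
    let result =
      try Ok (Marshal.from_channel ic)
      with End_of_file | Failure _ -> Error ()
    in
    close_in ic;
    ignore (Unix.waitpid [] pid);
    match result with
    | Ok v -> v
    | Error () -> failwith "in_child: worker failed"
```

For example, in_child (fun n -> n * n) 7 evaluates to 49 in the parent.
Note that the child still starts with a copy of the parent's entire
heap, so this sketch only addresses the descriptor issue, not the
memory issue.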

> Of course, the advantage of your approach is that you can now distribute
> the work over multiple machines. So I guess there is an appropriate
> place for all of these techniques.

I was recently working on an improved fork machinery, with some
indirection between the request for creating a new worker process, and
the actual fork. That's the Netmulticore library I mentioned above. The
new process is not a child of the process requesting the creation, but
always a child of a common master process. This avoids all the problems
(memory issues, file descriptor issues, and a few more), at the cost of
having to transmit state to the new process.
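
The indirection can be illustrated with a small sketch. This is not
the Netmulticore API, only the bare idea under simplifying
assumptions: all names here (request, jobs, start_master, and so on)
are hypothetical, jobs come from a fixed table known before the master
is forked, and results are small enough that one pipe write is atomic.

```ocaml
(* Needs the unix library; Unix._exit needs OCaml >= 4.12. *)

(* A request names a job from a fixed table, because a closure built
   after the master was forked does not exist in the master's memory. *)
type request = Spawn of string * int | Quit

(* Hypothetical job table, known to every process before any fork. *)
let jobs = [ "square", (fun n -> n * n); "double", (fun n -> 2 * n) ]

(* Worker body: runs in a child of the master and reports
   (pid of its parent, result) on the shared result pipe. *)
let run_worker res_wr name arg =
  let result = try Some ((List.assoc name jobs) arg) with _ -> None in
  let oc = Unix.out_channel_of_descr res_wr in
  Marshal.to_channel oc (Unix.getppid (), result) [];
  close_out oc;
  Unix._exit 0

(* Collect terminated workers until none are left. *)
let rec reap () =
  match Unix.waitpid [] (-1) with
  | _ -> reap ()
  | exception Unix.Unix_error _ -> ()

(* Master loop: the only place where workers are forked. *)
let master_loop req_rd res_wr =
  let ic = Unix.in_channel_of_descr req_rd in
  let rec loop () =
    match (Marshal.from_channel ic : request) with
    | Quit -> ()
    | Spawn (name, arg) ->
      if Unix.fork () = 0 then run_worker res_wr name arg;
      loop ()
  in
  (try loop () with _ -> ());   (* e.g. End_of_file: requesters are gone *)
  reap ();
  Unix._exit 0

(* Fork the master. Call this early, while the heap is still small. *)
let start_master () =
  let req_rd, req_wr = Unix.pipe () in
  let res_rd, res_wr = Unix.pipe () in
  flush stdout; flush stderr;
  match Unix.fork () with
  | 0 ->
    Unix.close req_wr; Unix.close res_rd;
    master_loop req_rd res_wr
  | master_pid ->
    Unix.close req_rd; Unix.close res_wr;
    (master_pid,
     Unix.out_channel_of_descr req_wr,
     Unix.in_channel_of_descr res_rd)

(* Ask the master for a worker and wait for its answer. *)
let request (_, req, res) name arg : int * int option =
  Marshal.to_channel req (Spawn (name, arg)) [];
  flush req;
  Marshal.from_channel res

(* Shut the master down and wait for it. *)
let stop (master_pid, req, res) =
  Marshal.to_channel req Quit [];
  close_out req;
  close_in res;
  ignore (Unix.waitpid [] master_pid)
```

With let m = start_master (), a call like request m "square" 6 returns
the master's pid together with Some 36: the worker is a child of the
master, not of the requesting process, and only the job name and the
argument had to be transmitted.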

> > Also, many practical problems are only O(n log n), at most. The cost for
> > serialization of data through a pipe cannot be neglected here. This
> > makes shared memory attractive, even if it is only available in a
> > restricted form (like write once memory).
> Well, the original context was one of a benchmark which had an
> arbitrary rule that you can only use functions from the bundled libraries.
> And my proposal was to use the pipe for synchronisation and the shared memory
> for bulk communication.
> If we drop the arbitrary rule there are faster options than pipes.
> (e.g. POSIX semaphores in a shared memory segment).

Yes, it's really arbitrary. All remaining solutions are very restricted
then, and don't have much to do with what you would choose for solving a
real-world problem. This makes this benchmark irrelevant.
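
For reference, the scheme proposed above (pipe for synchronisation,
shared memory for bulk data) could look roughly like the following
sketch. It assumes OCaml >= 4.06 for Unix.map_file, uses a temporary
file as backing store for the mapping, and the function name
sum_via_shared_memory is made up for the illustration.

```ocaml
(* Needs the unix library (and the bigarray library before OCaml 4.07);
   Unix._exit needs OCaml >= 4.12. *)

(* The child fills a shared Bigarray with 0.0, 1.0, ..., n-1; the parent
   waits for one byte on a pipe and then sums the array. Assumes n > 0. *)
let sum_via_shared_memory n =
  (* Back the mapping with a temporary file (an assumption of this sketch). *)
  let path = Filename.temp_file "shm" ".bin" in
  let fd = Unix.openfile path [Unix.O_RDWR] 0o600 in
  (* shared = true: writes by one process are visible to the other,
     and the file is grown to the size of the array. *)
  let buf =
    Bigarray.array1_of_genarray
      (Unix.map_file fd Bigarray.float64 Bigarray.c_layout true [| n |])
  in
  (* The mapping stays valid after closing and unlinking the file. *)
  Unix.close fd;
  Sys.remove path;
  let rd, wr = Unix.pipe () in
  flush stdout; flush stderr;
  match Unix.fork () with
  | 0 ->
    Unix.close rd;
    let status =
      try
        (* Bulk data goes through the shared mapping, uncopied. *)
        for i = 0 to n - 1 do buf.{i} <- float_of_int i done;
        (* One byte on the pipe tells the parent the data is ready. *)
        ignore (Unix.write wr (Bytes.of_string "x") 0 1);
        0
      with _ -> 1
    in
    Unix._exit status
  | pid ->
    Unix.close wr;
    let b = Bytes.create 1 in
    (* Blocks until the child has written, or returns 0 if it died. *)
    let got = Unix.read rd b 0 1 in
    Unix.close rd;
    ignore (Unix.waitpid [] pid);
    if got <> 1 then failwith "sum_via_shared_memory: child failed";
    let s = ref 0.0 in
    for i = 0 to n - 1 do s := !s +. buf.{i} done;
    !s
```

Only a single byte crosses the pipe, so the serialization cost
mentioned above does not grow with the amount of data. The Bigarray
lives outside the OCaml heap, which is why it can be shared at all.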

Gerd Stolpmann, Bad Nauheimer Str.3, 64289 Darmstadt,Germany
Phone: +49-6151-153855                  Fax: +49-6151-997714