Version française
Home     About     Download     Resources     Contact us    
Browse thread
Shared memory parallel application: kernel threads
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: -- (:)
From: Hugo Ferreira <hmf@i...>
Subject: Re: [Caml-list] Shared memory parallel application: kernel threads
Hi,

Gerd Stolpmann wrote:
> On Fr, 2010-03-12 at 11:55 +0000, Hugo Ferreira wrote:
>> Hello,
>>
>> I need to implement (meta) heuristic algorithms that
>> uses parallelism in order to (attempt to) solve a (hard)
>> machine learning problem that is inherently exponential.
>> The aim is to take maximum advantage of the multi-core
>> processors I have access to.
>>
snip
>> My first concern is to take advantage of the multi-cores so:
>>
>> 1. The thread library is not the answer
>>     Chapter 24 - "The threads library is implemented by time-sharing on 
>> a
>>     single processor. It will not take advantage of multi-processor
>>     machines." [1]
>>
>> 2. LinuxThreads seems to be what I need
>>     "The main strength of this approach is that it can take full
>>      advantage of multiprocessors." [2]
> 
> I think you mix here several things up. LinuxThreads has nothing to do
> with ocaml. It is an implementation of kernel threads for Linux on the C
> level. It is considered as outdated as of today, and is usually replaced
> by a better implementation (NPTL) that conforms more strictly to the
> POSIX standard.
> 

Oops. Silly me.

> Ocaml uses for its multi-threading implementation the multi-threading
> API the OS provides. This might be LinuxThreads or NPTL or something
> else. So, on the lower half of the implementation the threads are kernel
> threads, and multi-core-enabled. 

Ok.Should have read more carefully. As stated in the manual "Two
implementations of the threads library are available, depending on the
capabilities of the operating system:" So I have a recent glibc and
therefore "multi-core-enabled" threads.

> However, Ocaml prevents that more than
> one of the kernel threads can run inside its runtime at any time. So
> Ocaml code will always run only on one core (but you can call C code,
> and this can then take full advantage of multi-cores).
> 

Ok. I was under the (wrong) impression that the native OS threads did
run simultaneously (multi-core) but were intermittently stopped due to
the GC. So threads won't help.

> This is the primary reason I am going with multi-processing in my
> projects, and why Ocamlnet focuses on it.
> 

Understood.

> The Netcamlbox module of Ocamlnet 3 might be interesting for you. Here
> is an example program that mass-multiplies matrices on several cores:
> 
> https://godirepo.camlcity.org/svn/lib-ocamlnet2/trunk/code/examples/camlbox/manymult.ml
> 
> Netcamlbox can move complex values to shared memory, so you are not
> restricted to bigarrays. The matrix example uses float array array as
> representation. Recursive variants should also be fine.
> 
> For providing shared data to all workers, you can simply load it into
> the master process before the children processes are forked off. Another
> option is (especially when it is a lot of data, and you cannot afford to
> have n copies) to create another camlbox in the master process before
> forking, and to copy the shared data into it before forking. This avoids
> that the data is copied at fork time.
> 

The main data set is large, so I will opt for the latter.

> One drawback of Netcamlbox is that it is unsafe, and violating the
> programming rules is punished with crashes. (But this also applies, to
> some extent, to multi-threading, only that the rules are different.)
> 

Not an issue for me.
Going to read-up on and install ocamlnet3.

Thanks,
Hugo F.


> Gerd
> 
>> Issue 1
>>
>> In the manual [3] I see only references to function for the creation
>> and  use of processes. I see no calls that allow me to simply generate
>> and assign a function (job) to a thread (such as val create : ('a -> 'b)
>>   -> 'a -> t in the Thread module). The unix library where LinuxThreads
>> is now integrated shows the same API. Am I missing something or
>> is their no way to launch "threaded functions" from the Unix module?
>> Naturally I assume that threads and processes are not the same thing.
>>
>> Issue 2
>>
>> If I cannot launch kernel-threads to allow for easy memory sharing, what
>> other options do I have besides netshm? The data I must share is defined
>> by a recursive variant and is not simple numerical data.
>>
>> I would appreciate any comments.
>>
>> TIA,
>> Hugo F.
>>
>>
>> [1] http://caml.inria.fr/pub/docs/manual-ocaml/manual038.html
>> [2] http://pauillac.inria.fr/~xleroy/linuxthreads/
>> [3] http://caml.inria.fr/pub/docs/manual-ocaml/libref/ThreadUnix.html
>> [4] http://caml.inria.fr/pub/docs/manual-ocaml/manual035.html
>>
>>
>>
>> _______________________________________________
>> Caml-list mailing list. Subscription management:
>> http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
>> Archives: http://caml.inria.fr
>> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
>> Bug reports: http://caml.inria.fr/bin/caml-bugs
>>
> 
>