Browse thread
Shared memory parallel application: kernel threads
[
Home
]
[ Index:
by date
|
by threads
]
[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
| Date: | -- (:) |
| From: | Hugo Ferreira <hmf@i...> |
| Subject: | Re: [Caml-list] Shared memory parallel application: kernel threads |
Hi,
Gerd Stolpmann wrote:
> On Fr, 2010-03-12 at 11:55 +0000, Hugo Ferreira wrote:
>> Hello,
>>
>> I need to implement (meta) heuristic algorithms that
>> uses parallelism in order to (attempt to) solve a (hard)
>> machine learning problem that is inherently exponential.
>> The aim is to take maximum advantage of the multi-core
>> processors I have access to.
>>
snip
>> My first concern is to take advantage of the multi-cores so:
>>
>> 1. The thread library is not the answer
>> Chapter 24 - "The threads library is implemented by time-sharing on
>> a
>> single processor. It will not take advantage of multi-processor
>> machines." [1]
>>
>> 2. LinuxThreads seems to be what I need
>> "The main strength of this approach is that it can take full
>> advantage of multiprocessors." [2]
>
> I think you mix here several things up. LinuxThreads has nothing to do
> with ocaml. It is an implementation of kernel threads for Linux on the C
> level. It is considered as outdated as of today, and is usually replaced
> by a better implementation (NPTL) that conforms more strictly to the
> POSIX standard.
>
Oops. Silly me.
> Ocaml uses for its multi-threading implementation the multi-threading
> API the OS provides. This might be LinuxThreads or NPTL or something
> else. So, on the lower half of the implementation the threads are kernel
> threads, and multi-core-enabled.
Ok.Should have read more carefully. As stated in the manual "Two
implementations of the threads library are available, depending on the
capabilities of the operating system:" So I have a recent glibc and
therefore "multi-core-enabled" threads.
> However, Ocaml prevents that more than
> one of the kernel threads can run inside its runtime at any time. So
> Ocaml code will always run only on one core (but you can call C code,
> and this can then take full advantage of multi-cores).
>
Ok. I was under the (wrong) impression that the native OS threads did
run simultaneously (multi-core) but were intermittently stopped due to
the GC. So threads won't help.
> This is the primary reason I am going with multi-processing in my
> projects, and why Ocamlnet focuses on it.
>
Understood.
> The Netcamlbox module of Ocamlnet 3 might be interesting for you. Here
> is an example program that mass-multiplies matrices on several cores:
>
> https://godirepo.camlcity.org/svn/lib-ocamlnet2/trunk/code/examples/camlbox/manymult.ml
>
> Netcamlbox can move complex values to shared memory, so you are not
> restricted to bigarrays. The matrix example uses float array array as
> representation. Recursive variants should also be fine.
>
> For providing shared data to all workers, you can simply load it into
> the master process before the children processes are forked off. Another
> option is (especially when it is a lot of data, and you cannot afford to
> have n copies) to create another camlbox in the master process before
> forking, and to copy the shared data into it before forking. This avoids
> that the data is copied at fork time.
>
The main data set is large, so I will opt for the latter.
> One drawback of Netcamlbox is that it is unsafe, and violating the
> programming rules is punished with crashes. (But this also applies, to
> some extent, to multi-threading, only that the rules are different.)
>
Not an issue for me.
Going to read-up on and install ocamlnet3.
Thanks,
Hugo F.
> Gerd
>
>> Issue 1
>>
>> In the manual [3] I see only references to function for the creation
>> and use of processes. I see no calls that allow me to simply generate
>> and assign a function (job) to a thread (such as val create : ('a -> 'b)
>> -> 'a -> t in the Thread module). The unix library where LinuxThreads
>> is now integrated shows the same API. Am I missing something or
>> is their no way to launch "threaded functions" from the Unix module?
>> Naturally I assume that threads and processes are not the same thing.
>>
>> Issue 2
>>
>> If I cannot launch kernel-threads to allow for easy memory sharing, what
>> other options do I have besides netshm? The data I must share is defined
>> by a recursive variant and is not simple numerical data.
>>
>> I would appreciate any comments.
>>
>> TIA,
>> Hugo F.
>>
>>
>> [1] http://caml.inria.fr/pub/docs/manual-ocaml/manual038.html
>> [2] http://pauillac.inria.fr/~xleroy/linuxthreads/
>> [3] http://caml.inria.fr/pub/docs/manual-ocaml/libref/ThreadUnix.html
>> [4] http://caml.inria.fr/pub/docs/manual-ocaml/manual035.html
>>
>>
>>
>> _______________________________________________
>> Caml-list mailing list. Subscription management:
>> http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
>> Archives: http://caml.inria.fr
>> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
>> Bug reports: http://caml.inria.fr/bin/caml-bugs
>>
>
>