Registering threads created from C #6764

vicuna · 2015-01-29T17:35:45Z

Original bug ID: 6764
Reporter: braibant
Assigned to: @mshinwell
Status: closed (set by @mshinwell on 2015-05-06T15:31:20Z)
Resolution: not a bug
Priority: normal
Severity: text
Version: 4.02.1
Target version: 4.02.2+dev / +rc1
Category: documentation
Related to: #4702 #6776
Monitored by: @gasche @yallop

Bug description

Looking at section 19.10.1 "Registering threads created from C" in the manual, I am slightly puzzled by the documentation. I am not quite sure if that means what it means in the context of an OCaml library that's called from a third party C program using dynamic linking.

Obviously, there is no way to register all the threads that this third party application is going to create. Does that mean that it is not possible to use this kind of mix of OCaml/C? This does not seem to be the case reading the next paragraph in the manual, so I am a bit lost here.

vicuna · 2015-01-29T17:48:47Z

Comment author: @mshinwell

I think it is the case that you only need to use [caml_c_thread_register] from a particular thread if:

(a) the OCaml code is compiled to use threads; and
(b) the C code causes OCaml code to be called, or the OCaml runtime to be called (without actually running any OCaml code), from that thread.

If the OCaml code isn't compiled to use threads then you obviously also need to ensure that your C program does not attempt to execute OCaml code in a re-entrant or parallel manner.

I presume you have access to the third-party application's code, since you can't directly call an arbitrary OCaml function from a C program, owing to the difference in calling conventions. As such, I presume there is either a modification to the third-party program or some intervening veneer; in either case, you could call the thread registration function before the OCaml callback invocation, I suspect. In case it is useful: you can call [caml_c_thread_register] more than once from the same thread without trouble, and I think there's little performance penalty to doing so.

vicuna · 2015-01-29T19:54:03Z

Comment author: braibant

Just to clarify: I do not have access to the third party application. I am building a shared library that implements a given interface. The app loads the dynamic library and perform lookup for known symbols. There is indeed a shim around the OCaml code, implemented using inverted stubs from Ctypes.

I will hack the ctypes generated code to add the correct calls to the thread registration function before the OCaml callback invocation. (Btwm I indeed noticed that the [caml_c_thread_register] function can be called more than once, but was a bit afraid of the runtime cost. Since the need for that function was not obvious the way I understood the documentation, I argued against adding these calls everywhere.)

vicuna · 2015-01-29T21:17:37Z

Comment author: @mshinwell

Yeah, that solution sounds correct. I wouldn't worry about the runtime cost of [caml_c_thread_register], at least in the first instance.

vicuna · 2015-02-02T17:53:32Z

Comment author: braibant

Ok, I think I need to increase the severity of this issue:

I have patched ctypes to wrap calls to [register] and [unregister] around the calls to OCaml callbacks. The code that is generated is similar to the following gist.

https://gist.github.com/braibant/827b3b2158f94f461847

Running the third party application with this code yields a segfault. The backtrace is the following one

(gdb) bt
#0  0x00007fffeda42f62 in caml_thread_remove_info (th=th@entry=0xbc7650) at st_stubs.c:350
#1  0x00007fffeda43c60 in caml_c_thread_unregister () at st_stubs.c:595
#2  0x00007fffeda497cc in C_GetFunctionList (x102=0x7fffffffcfe0) at pkcs11-rev/pkcs11_rev_stubs.c:195

The content of th is as follows

(gdb) frame 0
#0  0x00007fffeda42f62 in caml_thread_remove_info (th=th@entry=0xbc7650) at st_stubs.c:350
350	  th->prev->next = th->next;

(gdb) print th
$3 = (caml_thread_t) 0xbc7650
(gdb) print th->next
$4 = (struct caml_thread_struct *) 0xbc7a50
(gdb) print th->prev
$5 = (struct caml_thread_struct *) 0x0

I think this issue is somewhat similar to the one described in #4702

Namely:

A C process creates a separate thread, loads Ocaml DLL and initializes Ocaml runtime.
The C process performs a call to C_GetFunctionList which is a C function that wraps a call to an OCaml callback with [register] and [unregister].

vicuna · 2015-02-03T07:55:38Z

Comment author: @mshinwell

Ack

vicuna · 2015-02-03T09:45:10Z

Comment author: @mshinwell

I think the problem is that you need to call [caml_thread_initialize] before [caml_c_thread_register]. It may be worth trying to fix the runtime so this requirement can be removed.

For posterity, I managed to reproduce the bug using the following nasty little test case. (Revised version here now that actually works...)

==> build.sh <==
#!/bin/bash

set -x -u -e

OCAML=/path/to/ocaml/installation
INCLUDE=$OCAML/lib/ocaml

gcc -fPIC -g -c -I$INCLUDE shared_lib_c.c
$OCAML/bin/ocamlopt.opt -thread unix.cmxa threads/threads.cmxa -g -output-obj -o shared_lib_proto.so shared_lib.ml shared_lib_c.o
gcc -g -shared -o shared_lib.so shared_lib_proto.so -L$OCAML/lib/ocaml -L$OCAML/lib/ocaml/threads -L$OCAML/lib/ocaml/unix -lthreads -lunix -lasmrun
gcc -g -o main -I$INCLUDE -ldl -lpthread main.c -lm

==> main.c <==
#include <dlfcn.h>
#include <caml/mlvalues.h>
#include <pthread.h>
#include <stdlib.h>
#include <sys/types.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

static volatile int num_threads = 0;

static void* thread_body(void* lib)
{
pid_t tid;
useconds_t sleep_before, sleep_after;
void* fn;
tid = syscall(SYS_gettid);
sleep_before = rand() % 1000000; /* 1s /
sleep_after = rand() % 1000000; / 1s /
printf("thread_body tid %d, sleep before %ld us, sleep after %ld us, ",
tid, sleep_before, sleep_after);
printf("num active threads %d\n", ++num_threads);
fflush(stdout);
usleep(sleep_before);
fn = dlsym(lib, "call_ocaml_in_shared_lib");
if (!fn) abort();
if (((value ()(void)) fn)() != Val_unit) abort();
usleep(sleep_after);
printf("tid %d exiting\n", tid);
fflush(stdout);
--num_threads;
return NULL;
}

int main(int argc, char* argv[])
{
void* lib;
void* fn;

lib = dlopen("shared_lib.so", RTLD_NOW);
if (!lib) {
printf("%s\n", dlerror());
fflush(stdout);
abort();
}

fn = dlsym(lib, "call_ocaml_init");
if (!fn) abort();
((void (*)(char**)) fn)(argv);

printf("starting thread creation loop\n");
fflush(stdout);
while (1) {
pthread_t handle;
pthread_create(&handle, NULL, &thread_body, lib);
usleep(100000); /* 100ms */
}

return 0;
}

==> shared_lib_c.c <==
#include <caml/callback.h>
#include <caml/mlvalues.h>
#include <caml/threads.h>
#include <stdio.h>

void call_ocaml_init(char** argv)
{
caml_startup(argv); (* this will run the module initializers )
caml_release_runtime_system(); ( drop lock acquired by [caml_startup] *)
}

value call_ocaml_in_shared_lib(void)
{
value r;
value* cb;
caml_c_thread_register();
caml_acquire_runtime_system();
r = caml_callback(*caml_named_value("ocaml_in_shared_lib"), Val_int(42));
caml_release_runtime_system();
caml_c_thread_unregister();
return r;
}

==> shared_lib.ml <==
(* The next line makes sure that caml_thread_initialize is called.
(Otherwise, using caml_c_thread_register etc will cause a segfault,
as in PR 6764.)
)
let (_ : Thread.t) = Thread.self () ( ensure Thread is linked *)

let ocaml_in_shared_lib x =
Printf.printf "ocaml in shared lib %d%!" x

let () =
Callback.register "ocaml_in_shared_lib" ocaml_in_shared_lib

Build and run (on Linux, after editing the path to your OCaml installation in build.sh) with:
./build.sh
LD_LIBRARY_PATH=$(pwd) ./main

vicuna · 2015-02-03T10:29:11Z

Comment author: braibant

I can reprodue a segfault with your example but I cannot reproduce a working behavior: when building and running your example, num active threads keeps increasing, and I never see a call to the ocaml_in_shared_lib function.

Are you seeing a different behavior?

vicuna · 2015-02-03T10:30:01Z

Comment author: braibant

I copied your code here: https://github.com/braibant/ocaml-pr6764

vicuna · 2015-02-03T11:07:16Z

Comment author: @mshinwell

Sorry, I've fixed the test case properly now.
Note the "Thread.self" line in the .ml file!

vicuna · 2015-02-03T11:11:08Z

Comment author: @mshinwell

As a follow-up: maybe it should be ensured by the compiler that the [Thread] module's initializer is always run whenever -thread is specified. Or something.

Also: you can't just call [caml_thread_initialize] from C directly; it needs to be called from an OCaml context, I think. It's safest just to make sure the [Thread] module is initialized.

vicuna · 2015-02-03T14:59:01Z

Comment author: braibant

Another follow up: if I remove the call to Callback.register, I get another segfault. It is not clear from the manual that this call to Callback.register is mandatory. Unfortunately, the way Ctypes uses callback, they are not always registered, and it's hard to register them all...

vicuna · 2015-02-03T15:10:22Z

Comment author: @mshinwell

I think you must always use [Callback.register] if you are going to invoke the function from C, otherwise [caml_named_value] will return a null pointer, dereferencing of which will cause a segfault.

vicuna · 2015-02-03T15:48:07Z

Comment author: braibant

Ok, I mixed up things and my comment was not founded.

Just to be crystal clear: one need to use [Callback.register] if and only if the callback is called from C using the [caml_named_value], right?

vicuna · 2015-02-03T15:55:05Z

Comment author: @mshinwell

I believe that is correct, yes.

vicuna · 2015-02-04T08:47:18Z

Comment author: @mshinwell

Does your program now work?

vicuna · 2015-02-04T08:50:17Z

Comment author: braibant

I am still looking for the cause of a segfault, but it looks like it's coming from ctypes. I will keep you informed (and probably write a summary of this whole experiment)

vicuna · 2015-02-04T08:54:50Z

Comment author: @mshinwell

In case it helps: in the case of a delayed segfault due to heap corruption, you can sometimes narrow it down by calling [Gc.compact] in an attempt to trigger it earlier. (For example, you could do this before releasing the runtime lock in each callback.)

vicuna · 2015-02-09T11:11:04Z

Comment author: braibant

I am still working on a segfault in our dll. I have nailed down an issue in the code we call when the shared object is unloaded: the OCaml documentation is a bit scarce about what is required to cleanup the runtime. Any idea about how to do that in the multithreaded context of your above example?

(edit to link a related github pull request #71 )

vicuna · 2015-02-09T11:20:49Z

Comment author: @mshinwell

Not completely sure, but if you're unloading the runtime anyway as part of the DLL, I'm not sure why it matters. Can you give any more information as to the location of the segfault?

vicuna · 2015-02-09T17:19:03Z

Comment author: braibant

I have updated the code here https://github.com/braibant/ocaml-pr6764 to illustrate the issue.

vicuna · 2015-05-06T15:30:55Z

Comment author: @mshinwell

I'm going to close this issue, since it appears #6776 (killing the ticker thread) is the only fix required.

vicuna closed this as completed May 6, 2015

vicuna added the documentation label Mar 14, 2019

vicuna added this to the 4.02.2 milestone Mar 14, 2019

vicuna assigned mshinwell Mar 14, 2019

This was referenced Mar 14, 2019

The runtime does not expose a way to reliably kill the "tick" thread, leading to segfault after exiting the runtime #6776

Closed

C-threads and callbacks #4702

Closed

vicuna added the bug label Mar 20, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Registering threads created from C #6764

Registering threads created from C #6764

vicuna commented Jan 29, 2015

vicuna commented Jan 29, 2015

vicuna commented Jan 29, 2015

vicuna commented Jan 29, 2015

vicuna commented Feb 2, 2015

vicuna commented Feb 3, 2015

vicuna commented Feb 3, 2015

vicuna commented Feb 3, 2015

vicuna commented Feb 3, 2015

vicuna commented Feb 3, 2015

vicuna commented Feb 3, 2015

vicuna commented Feb 3, 2015

vicuna commented Feb 3, 2015

vicuna commented Feb 3, 2015

vicuna commented Feb 3, 2015

vicuna commented Feb 4, 2015

vicuna commented Feb 4, 2015

vicuna commented Feb 4, 2015

vicuna commented Feb 9, 2015

vicuna commented Feb 9, 2015

vicuna commented Feb 9, 2015

vicuna commented May 6, 2015

Registering threads created from C #6764

Registering threads created from C #6764

Comments

vicuna commented Jan 29, 2015

Bug description

vicuna commented Jan 29, 2015

vicuna commented Jan 29, 2015

vicuna commented Jan 29, 2015

vicuna commented Feb 2, 2015

vicuna commented Feb 3, 2015

vicuna commented Feb 3, 2015

vicuna commented Feb 3, 2015

vicuna commented Feb 3, 2015

vicuna commented Feb 3, 2015

vicuna commented Feb 3, 2015

vicuna commented Feb 3, 2015

vicuna commented Feb 3, 2015

vicuna commented Feb 3, 2015

vicuna commented Feb 3, 2015

vicuna commented Feb 4, 2015

vicuna commented Feb 4, 2015

vicuna commented Feb 4, 2015

vicuna commented Feb 9, 2015

vicuna commented Feb 9, 2015

vicuna commented Feb 9, 2015

vicuna commented May 6, 2015