New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
The runtime does not expose a way to reliably kill the "tick" thread, leading to segfault after exiting the runtime #6776
Comments
Comment author: @whitequark I do not think st_thread_kill is a proper solution. For instance, some libc's (e.g. Bionic) do not implement any way to kill a thread. Either way, using pthread_kill is a known antipattern. Instead, the caml_thread_cleanup should set a flag to tell the tick thread to terminate and then join it. |
Comment author: braibant @whitequark: Do we agree that there are two different issue? 1) whether The current implementation does not use Do we agree that something like the following would work?
|
Comment author: @whitequark pthread_cancel has the exact same issues as pthread_kill. (I could not recall which exactly did st_thread_kill use). From the Bionic docs:
So, no, that would not work even for tick thread. |
Comment author: @whitequark Actually, why does the tick thread even exist? Its sole purpose is to send SIGVTALRM. Can't you use setitimer for that? |
Comment author: @damiendoligez
Probably because Windows. I agree that the proper way is to set a flag and then join. The only downside is that you might have to wait for the duration of a whole tick. |
Comment author: braibant I think that I am okay with waiting a whole tick, if the alternative is a segfault :). |
Comment author: @damiendoligez @braibant I'll do a fix soon, but I'd like to know if you would push for it to be included in 4.02.2 or if you're OK with waiting for 4.03. It's not a completely trivial fix and it'll have to be tested on all architectures, so there is a risk here. |
Comment author: braibant I will not push for it to be included in 4.02.2 per se, but would be very happy to test it as soon as possible. |
Comment author: @damiendoligez Here is a patch. I applied it in trunk (rev 15975). I can't test your program just now because it doesn't work on 64-bit Linux. |
Comment author: @xavierleroy In the absence of new information, let's consider this problem resolved by commit 15975. |
Original bug ID: 6776
Reporter: braibant
Status: closed (set by @xavierleroy on 2017-02-16T14:14:44Z)
Resolution: fixed
Priority: high
Severity: crash
Target version: 4.03.0+dev / +beta1
Fixed in version: 4.03.0+dev / +beta1
Category: runtime system and C interface
Related to: #6764
Monitored by: @whitequark
Bug description
In the wake of #6764 I am looking at embedding the OCaml runtime in a shared library. This shared library might be loaded and unloaded several time by an application that makes heavy use of threads.
The code hosted here https://github.com/braibant/ocaml-pr6764 (which was originally written by Mark Shinwell during the discussion of the aforementioned related issue) illustrates what happens when the shared object is loaded and unloaded several time in a row.
Debugging the issue with this example, it appear that the "tick" thread might still be alive after the dlclose function returns. The tick thread should have been cleaned up by the
caml_thread_cleanup
function, which is registered in theat_exit
calls. However, the implementation ofst_thread_kill
inotherlibs/systhreads/st_posix.c
isFrom http://man7.org/linux/man-pages/man3/pthread_cancel.3.html the thread cancellation happens asynchronously, and "the return status of pthread_cancel() merely informs the caller whether the cancellation request was successfully queued."
A suitable fix would be to use
pthread_join
after thepthread_cancel
. This would still not have the same behavior as the win32 implementation (which usesTerminateThread
and thus cannot execute any user-mode code).Another fix would be to replace the
st_thread_kill
incaml_thread_cleanup
with something a bit more agressive about termination (a newst_thread_real_kill
?)Steps to reproduce
git clone git@github.com:braibant/ocaml-pr6764.git
cd ocaml-pr6764
make
make run
File attachments
The text was updated successfully, but these errors were encountered: