Original bug ID: 6204
Reporter: @edwintorok
Status: acknowledged (set by @damiendoligez on 2014-06-04T20:11:46Z)
Resolution: open
Priority: normal
Severity: feature
Platform: x86_64
OS: Linux
Version: 4.01.0
Category: otherlibs
Tags: patch
Monitored by: @gasche @ygrek
Bug description
It would be useful to use the facilities provided by pthreads to detect mutex misuse (EDEADLK/EPERM).
There already is a "-runtime-variant d" linker flag that enables more checks in the (GC) runtime. I propose using the same flag to enable more checks in st_stubs.c.
The attached patch implements this:
* -with-debug-runtime causes a $(LIBDIR)/threadsd/libthreadnat.a to be installed (st_stubs built with -g -DDEBUG)
* it uses PTHREAD_MUTEX_ERRORCHECK mutex types (exceptions are raised when Mutex.lock/unlock is misused)
* it checks the master lock and gives a fatal error message when a double release (EPERM) or double acquire (EDEADLK) is detected
* it checks the return codes of more pthread functions
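To illustrate the pthread facility the list above relies on, here is a minimal sketch in plain C, independent of the patch. The helper names (errorcheck_mutex_init, relock_result, stray_unlock_result) are made up for this example and do not come from st_stubs.c. It only shows the error codes an error-checking mutex reports for the two kinds of misuse mentioned above.

```c
/* Ask for PTHREAD_MUTEX_ERRORCHECK even under strict -std modes. */
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif

#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical helper: initialise *m as an error-checking mutex.
   Returns 0 on success or an errno value. */
static int errorcheck_mutex_init(pthread_mutex_t *m)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Lock the same mutex twice from one thread; return the second result. */
static int relock_result(void)
{
    pthread_mutex_t m;
    int rc = errorcheck_mutex_init(&m);
    assert(rc == 0);
    rc = pthread_mutex_lock(&m);
    assert(rc == 0);
    rc = pthread_mutex_lock(&m);   /* a default mutex may simply hang here */
    pthread_mutex_unlock(&m);
    pthread_mutex_destroy(&m);
    return rc;
}

/* Unlock a mutex the calling thread does not hold; return the result. */
static int stray_unlock_result(void)
{
    pthread_mutex_t m;
    int rc = errorcheck_mutex_init(&m);
    assert(rc == 0);
    rc = pthread_mutex_unlock(&m);
    pthread_mutex_destroy(&m);
    return rc;
}
```

Compiled with -pthread, relock_result() should report EDEADLK and stray_unlock_result() should report EPERM. A debug build can turn these codes into exceptions or fatal errors instead of ignoring them.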
Caveats:
* only implemented for POSIX threads (st_posix.h)
* only implemented for native code (ocamlopt); bytecode keeps using the non-debug runtime
* if a C stub calls only caml_enter_blocking_section() and returns without raising an exception, the missing caml_leave_blocking_section() is detected only at the next caml_enter_blocking_section()/caml_raise* call.
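The last caveat can be modelled with a small owner-tracking lock. This is a rough sketch under stated assumptions, not the code from the patch: the type checked_masterlock and all function names are hypothetical, and the real master lock in st_posix.h may be structured differently. Releasing the lock stands in for caml_enter_blocking_section(), and acquiring it stands in for caml_leave_blocking_section().

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical model of a master lock that remembers its owner. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t is_free;
    int busy;          /* 1 while some thread holds the master lock */
    pthread_t owner;   /* meaningful only while busy == 1 */
} checked_masterlock;

static int checked_masterlock_init(checked_masterlock *m)
{
    int rc = pthread_mutex_init(&m->lock, NULL);
    if (rc != 0)
        return rc;
    rc = pthread_cond_init(&m->is_free, NULL);
    if (rc != 0) {
        pthread_mutex_destroy(&m->lock);
        return rc;
    }
    m->busy = 0;
    return 0;
}

/* Models caml_leave_blocking_section(): EDEADLK on double acquire. */
static int checked_masterlock_acquire(checked_masterlock *m)
{
    int rc = 0;
    pthread_mutex_lock(&m->lock);
    if (m->busy && pthread_equal(m->owner, pthread_self())) {
        rc = EDEADLK;
    } else {
        while (m->busy)
            pthread_cond_wait(&m->is_free, &m->lock);
        m->busy = 1;
        m->owner = pthread_self();
    }
    pthread_mutex_unlock(&m->lock);
    return rc;
}

/* Models caml_enter_blocking_section(): EPERM if the caller is not the owner. */
static int checked_masterlock_release(checked_masterlock *m)
{
    int rc = 0;
    pthread_mutex_lock(&m->lock);
    if (!m->busy || !pthread_equal(m->owner, pthread_self())) {
        rc = EPERM;
    } else {
        m->busy = 0;
        pthread_cond_signal(&m->is_free);
    }
    pthread_mutex_unlock(&m->lock);
    return rc;
}

/* Models a buggy stub: it releases the lock and never re-acquires it. */
static int buggy_stub(checked_masterlock *m)
{
    return checked_masterlock_release(m);
}

/* Thread body: try to release a lock that another thread holds. */
static void *release_from_other_thread(void *arg)
{
    return (void *)(intptr_t)checked_masterlock_release(arg);
}

/* Acquire in this thread, release from a second one; return that result. */
static int foreign_release_result(void)
{
    checked_masterlock m;
    pthread_t t;
    void *result = NULL;
    int rc = checked_masterlock_init(&m);
    assert(rc == 0);
    rc = checked_masterlock_acquire(&m);
    assert(rc == 0);
    rc = pthread_create(&t, NULL, release_from_other_thread, &m);
    assert(rc == 0);
    pthread_join(t, &result);
    checked_masterlock_release(&m);
    pthread_cond_destroy(&m.is_free);
    pthread_mutex_destroy(&m.lock);
    return (int)(intptr_t)result;
}
```

In this model the first buggy_stub() call on a held lock succeeds, so the bug goes unnoticed at the point where it happens. Only the second call returns EPERM, which mirrors the delayed detection described in the caveat. A release attempted from a thread that does not own the lock also yields EPERM, the situation the patched runtime reports as a fatal error.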
The patch is just a draft; more checks could be implemented later.
AFAIK some more changes are planned for systhreads, and I don't know whether this patch would conflict with them: #5373, https://github.com/lucasaiu/ocaml
Suggestions welcome.
Steps to reproduce
Apply the patch with either 'patch -p1 <combined.patch' or 'git am combined.patch' on top of the latest trunk code:
git-svn-id: http://caml.inria.fr/svn/ocaml/trunk@14214 f963ae5c-01c2-4b8c-9fe0-0dff7051ff02
Additional information
Example usage to debug the issue in OCamlnet described here:
https://sympa.inria.fr/sympa/arc/caml-list/2013-09/msg00342.html
With my patch the bug is pinpointed to "equeue_ssl_single_shutdown", and a somewhat helpful fatal error is printed by OCaml:
Fatal error: cannot release OCaml master lock
The OCaml master lock is owned by another thread!
(Did you forget a caml_leave_blocking_section() call?)
$ ocamlbuild -use-ocamlfind ./http_mt.native ./http_mt.byte -tag debug -lflags "-runtime-variant d"
$ gdb -batch -ex "b exit" -ex "r" -ex "bt" -ex "quit" ./http_mt.native
Breakpoint 1 at 0x406a40
warning: Could not load shared library symbols for linux-vdso.so.1.
Do you need "set solib-search-path" or "set sysroot"?
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
OCaml runtime: debug mode
Initial minor heap size: 2048k bytes
Initial major heap size: 992k bytes
Initial space overhead: 80%
Initial max overhead: 500%
Initial heap increment: 992k bytes
Initial allocation policy: 0
OCaml systhreads: debug mode
<>Starting new major GC cycle
OCaml runtime: heap check
![New Thread 0x7ffff7bfd700 (LWP 7312)]
[New Thread 0x7ffff73fc700 (LWP 7313)]
[New Thread 0x7ffff6bfb700 (LWP 7314)]
[New Thread 0x7ffff63fa700 (LWP 7315)]
[New Thread 0x7ffff5bf9700 (LWP 7316)]
[New Thread 0x7ffff53f8700 (LWP 7317)]
[New Thread 0x7ffff4bf7700 (LWP 7318)]
[New Thread 0x7ffff43f6700 (LWP 7319)]
[New Thread 0x7ffff3bf5700 (LWP 7320)]
[New Thread 0x7ffff33f4700 (LWP 7321)]
[New Thread 0x7ffff2bf3700 (LWP 7322)]
<>$Growing heap to 1984k bytes
Growing page table to 2048 entries
<>Starting new major GC cycle
OCaml runtime: heap check
!Growing heap to 2976k bytes
Growing heap to 3968k bytes
Fatal error: cannot release OCaml master lock
The OCaml master lock is owned by another thread!
(Did you forget a caml_leave_blocking_section() call?)
[Switching to Thread 0x7ffff3bf5700 (LWP 7320)]
Breakpoint 1, __GI_exit (status=2) at exit.c:99
99 exit.c: No such file or directory.
#0 __GI_exit (status=2) at exit.c:99
#1 0x000000000054da3d in caml_fatal_error (msg=0x573c50 "Fatal error: cannot release OCaml master lock\nThe OCaml master lock is owned by another thread!\n(Did you forget a caml_leave_blocking_section() call?)\n") at misc.d.c:53
#2 0x0000000000545041 in st_masterlock_release (m=0x8a6a80 <caml_master_lock>) at st_posix.h:196
#3 0x0000000000545a13 in caml_thread_enter_blocking_section () at st_stubs.c:178
#4 0x0000000000545c41 in caml_io_mutex_unlock_exn () at st_stubs.c:262
#5 0x000000000054b5d1 in caml_raise (v=140737352117712) at fail.d.c:57
#6 0x000000000054b7e7 in caml_raise_with_arg (tag=140737351355792, arg=11) at fail.d.c:93
#7 0x0000000000538f2d in equeue_ssl_single_shutdown ()
#8 0x000000000040f2a6 in camlUq_ssl__fun_3959 () at uq_ssl.ml:626
#9 0x000000000040e957 in camlUq_ssl__fun_4167 () at uq_ssl.ml:805
#10 0x0000000000514161 in camlList__map_1040 () at list.ml:55
#11 0x0000000000411c3f in camlUq_ssl__fun_4127 () at uq_ssl.ml:798
#12 0x0000000000444c7e in camlUnixqueue_pollset__forward_event_to_1571 () at unixqueue_pollset.ml:768
#13 0x0000000000441848 in camlEqueue__fun_1257 () at equeue.ml:166
#14 0x000000000051cc59 in camlQueue__iter_1050 () at queue.ml:134
#15 0x00000000004422d2 in camlEqueue__run_1072 () at equeue.ml:159
#16 0x00000000004466f5 in camlUnixqueue_pollset__fun_3318 () at unixqueue_pollset.ml:999
#17 0x000000000040d37f in camlHttp_mt__f_1037 () at http_mt.ml:34
#18 0x0000000000506c99 in camlThread__fun_1081 () at thread.ml:37
#19 0x00000000005699de in caml_start_program ()
#20 0x0000000000909160 in ?? ()
#21 0x0000000000000000 in ?? ()
A debugging session is active.
Inferior 1 [process 7308] will be killed.
Quit anyway? (y or n) [answered Y; input not from terminal]