Mantis Bug Tracker

ID: 0006204
Project: OCaml
Category: OCaml runtime system
View Status: public
Date Submitted: 2013-10-06 13:44
Last Update: 2014-02-19 15:34
Reporter: edwin
Assigned To:
Priority: normal
Severity: feature
Reproducibility: N/A
Status: new
Resolution: open
Platform: x86_64
OS: Linux
OS Version:
Product Version: 4.01.0
Target Version:
Fixed in Version:
Summary: 0006204: debug mode for native otherlibs/systhreads
Description: It would be useful to use the facilities pthreads provides to detect mutex misuse (EDEADLK/EPERM).
There is already a "-runtime-variant d" linker flag that enables more checks in the (GC) runtime. I propose that the same flag also enable more checks in st_stubs.c.

The attached patch implements this:
 * -with-debug-runtime causes a $(LIBDIR)/threadsd/libthreadnat.a to be installed (st_stubs built with -g -DDEBUG)
 * use PTHREAD_MUTEX_ERRORCHECK mutexes (an exception is raised when Mutex.lock/unlock is misused)
 * check the masterlock and give a fatal error message when a double release (EPERM) or double acquire (EDEADLK) is detected
 * check the return code of more pthread functions
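
For readers unfamiliar with error-checking mutexes, the following is a minimal standalone sketch of the POSIX behaviour the patch relies on. It is not taken from combined.patch; the helper names (errorcheck_mutex_init, relock_result, unlock_unowned_result) are hypothetical and exist only for illustration.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* expose PTHREAD_MUTEX_ERRORCHECK in strict modes */
#endif
#include <errno.h>
#include <pthread.h>

/* Hypothetical helper: initialise m as an error-checking mutex.
   Returns 0 on success or an errno-style code, like the pthread calls. */
int errorcheck_mutex_init(pthread_mutex_t *m) {
  pthread_mutexattr_t attr;
  int rc = pthread_mutexattr_init(&attr);
  if (rc != 0) return rc;
  rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
  if (rc == 0) rc = pthread_mutex_init(m, &attr);
  pthread_mutexattr_destroy(&attr);
  return rc;
}

/* Lock the same mutex twice from one thread and return the result of the
   second lock. An error-checking mutex reports EDEADLK instead of hanging. */
int relock_result(void) {
  pthread_mutex_t m;
  if (errorcheck_mutex_init(&m) != 0) return -1;
  pthread_mutex_lock(&m);
  int rc = pthread_mutex_lock(&m);   /* second acquire by the owner */
  pthread_mutex_unlock(&m);          /* release the one lock we do hold */
  pthread_mutex_destroy(&m);
  return rc;
}

/* Unlock a mutex the calling thread does not hold and return the result.
   An error-checking mutex reports EPERM. */
int unlock_unowned_result(void) {
  pthread_mutex_t m;
  if (errorcheck_mutex_init(&m) != 0) return -1;
  int rc = pthread_mutex_unlock(&m); /* never locked by this thread */
  pthread_mutex_destroy(&m);
  return rc;
}
```

With a default mutex type both misuses are undefined behaviour, which is why the return codes are only meaningful once the mutex type is switched. Checking the return code of every call, as the last bullet proposes, is what turns these codes into diagnostics.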

Caveats:
 * only implemented for POSIX threads (st_posix.h)
 * only implemented for native code (ocamlopt), bytecode keeps using the non-debug runtime
 * if a C stub calls only caml_enter_blocking_section() and returns without raising an exception, the missing caml_leave_blocking_section() is detected only at the next caml_enter_blocking_section()/caml_raise* call.
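
The last caveat can be illustrated with a toy model. This is a sketch under strong assumptions: the real master lock in st_posix.h is not a bare mutex, and every toy_* name below is hypothetical and not part of the OCaml runtime or of the patch. The model only shows why a forgotten "leave" goes unnoticed until the next "enter".

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* expose PTHREAD_MUTEX_ERRORCHECK in strict modes */
#endif
#include <errno.h>
#include <pthread.h>

/* Toy stand-in for the master lock; NOT the structure used in st_posix.h. */
typedef struct { pthread_mutex_t lock; } toy_masterlock;

/* Create the lock as an error-checking mutex and acquire it, the way a
   thread running OCaml code would hold it. Returns 0 or an errno code. */
int toy_masterlock_init(toy_masterlock *m) {
  pthread_mutexattr_t attr;
  int rc = pthread_mutexattr_init(&attr);
  if (rc != 0) return rc;
  rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
  if (rc == 0) rc = pthread_mutex_init(&m->lock, &attr);
  pthread_mutexattr_destroy(&attr);
  if (rc != 0) return rc;
  return pthread_mutex_lock(&m->lock);
}

/* Entering a blocking section releases the lock; leaving reacquires it. */
int toy_enter_blocking_section(toy_masterlock *m) {
  return pthread_mutex_unlock(&m->lock);
}
int toy_leave_blocking_section(toy_masterlock *m) {
  return pthread_mutex_lock(&m->lock);
}

/* A stub that forgets the matching leave. Nothing can be detected here. */
int toy_buggy_stub(toy_masterlock *m) {
  return toy_enter_blocking_section(m);
}

/* A stub that pairs enter and leave correctly. */
int toy_correct_stub(toy_masterlock *m) {
  int rc = toy_enter_blocking_section(m);
  if (rc != 0) return rc;
  /* ... blocking work would happen here ... */
  return toy_leave_blocking_section(m);
}

/* Two buggy stubs in a row: the first succeeds, the second is a double
   release and is the first point where the bug becomes visible. */
int run_buggy_sequence(void) {
  toy_masterlock m;
  if (toy_masterlock_init(&m) != 0) return -1;
  int first = toy_buggy_stub(&m);
  int second = toy_buggy_stub(&m);
  pthread_mutex_destroy(&m.lock);    /* lock is not held at this point */
  return first != 0 ? -1 : second;
}

/* Two correct stubs in a row: every call returns 0. */
int run_correct_sequence(void) {
  toy_masterlock m;
  if (toy_masterlock_init(&m) != 0) return -1;
  int rc1 = toy_correct_stub(&m);
  int rc2 = toy_correct_stub(&m);
  pthread_mutex_unlock(&m.lock);     /* release before destroying */
  pthread_mutex_destroy(&m.lock);
  return rc1 != 0 ? rc1 : rc2;
}
```

In this model the first buggy call returns 0, so the error surfaces one call later than the mistake, which mirrors the delayed detection described in the caveat.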

The patch is just a draft; more checking could be implemented later.
AFAIK there are some more changes planned for systhreads, and I don't know how this patch would conflict with those:
http://caml.inria.fr/mantis/view.php?id=5373
https://github.com/lucasaiu/ocaml

Suggestions welcome.
Steps To Reproduce: Apply the patch with either 'patch -p1 <combined.patch' or 'git am combined.patch' on top of the latest trunk code:
git-svn-id: http://caml.inria.fr/svn/ocaml/trunk@14214 f963ae5c-01c2-4b8c-9fe0-0dff7051ff02
Additional Information: Example usage to debug the issue in OCamlnet described here:
https://sympa.inria.fr/sympa/arc/caml-list/2013-09/msg00342.html

With my patch the bug is pinpointed to "equeue_ssl_single_shutdown", and a somewhat helpful fatal error is printed by OCaml:
Fatal error: cannot release OCaml master lock
The OCaml master lock is owned by another thread!
(Did you forget a caml_leave_blocking_section() call?)

$ ocamlbuild -use-ocamlfind ./http_mt.native ./http_mt.byte -tag debug -lflags "-runtime-variant d"
$ gdb -batch -ex "b exit" -ex "r" -ex "bt" -ex "quit" ./http_mt.native
Breakpoint 1 at 0x406a40
warning: Could not load shared library symbols for linux-vdso.so.1.
Do you need "set solib-search-path" or "set sysroot"?
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
### OCaml runtime: debug mode ###
Initial minor heap size: 2048k bytes
Initial major heap size: 992k bytes
Initial space overhead: 80%
Initial max overhead: 500%
Initial heap increment: 992k bytes
Initial allocation policy: 0
### OCaml systhreads: debug mode ###
<>Starting new major GC cycle
### OCaml runtime: heap check ###
![New Thread 0x7ffff7bfd700 (LWP 7312)]
[New Thread 0x7ffff73fc700 (LWP 7313)]
[New Thread 0x7ffff6bfb700 (LWP 7314)]
[New Thread 0x7ffff63fa700 (LWP 7315)]
[New Thread 0x7ffff5bf9700 (LWP 7316)]
[New Thread 0x7ffff53f8700 (LWP 7317)]
[New Thread 0x7ffff4bf7700 (LWP 7318)]
[New Thread 0x7ffff43f6700 (LWP 7319)]
[New Thread 0x7ffff3bf5700 (LWP 7320)]
[New Thread 0x7ffff33f4700 (LWP 7321)]
[New Thread 0x7ffff2bf3700 (LWP 7322)]
<>$Growing heap to 1984k bytes
Growing page table to 2048 entries
<>Starting new major GC cycle
### OCaml runtime: heap check ###
!Growing heap to 2976k bytes
Growing heap to 3968k bytes
Fatal error: cannot release OCaml master lock
The OCaml master lock is owned by another thread!
(Did you forget a caml_leave_blocking_section() call?)
[Switching to Thread 0x7ffff3bf5700 (LWP 7320)]

Breakpoint 1, __GI_exit (status=2) at exit.c:99
99 exit.c: No such file or directory.
#0 __GI_exit (status=2) at exit.c:99
#1 0x000000000054da3d in caml_fatal_error (msg=0x573c50 "Fatal error: cannot release OCaml master lock\nThe OCaml master lock is owned by another thread!\n(Did you forget a caml_leave_blocking_section() call?)\n") at misc.d.c:53
#2 0x0000000000545041 in st_masterlock_release (m=0x8a6a80 <caml_master_lock>) at st_posix.h:196
#3 0x0000000000545a13 in caml_thread_enter_blocking_section () at st_stubs.c:178
#4 0x0000000000545c41 in caml_io_mutex_unlock_exn () at st_stubs.c:262
#5 0x000000000054b5d1 in caml_raise (v=140737352117712) at fail.d.c:57
#6 0x000000000054b7e7 in caml_raise_with_arg (tag=140737351355792, arg=11) at fail.d.c:93
#7 0x0000000000538f2d in equeue_ssl_single_shutdown ()
#8 0x000000000040f2a6 in camlUq_ssl__fun_3959 () at uq_ssl.ml:626
#9 0x000000000040e957 in camlUq_ssl__fun_4167 () at uq_ssl.ml:805
#10 0x0000000000514161 in camlList__map_1040 () at list.ml:55
#11 0x0000000000411c3f in camlUq_ssl__fun_4127 () at uq_ssl.ml:798
#12 0x0000000000444c7e in camlUnixqueue_pollset__forward_event_to_1571 () at unixqueue_pollset.ml:768
#13 0x0000000000441848 in camlEqueue__fun_1257 () at equeue.ml:166
#14 0x000000000051cc59 in camlQueue__iter_1050 () at queue.ml:134
#15 0x00000000004422d2 in camlEqueue__run_1072 () at equeue.ml:159
#16 0x00000000004466f5 in camlUnixqueue_pollset__fun_3318 () at unixqueue_pollset.ml:999
#17 0x000000000040d37f in camlHttp_mt__f_1037 () at http_mt.ml:34
#18 0x0000000000506c99 in camlThread__fun_1081 () at thread.ml:37
#19 0x00000000005699de in caml_start_program ()
#20 0x0000000000909160 in ?? ()
#21 0x0000000000000000 in ?? ()
A debugging session is active.

    Inferior 1 [process 7308] will be killed.

Quit anyway? (y or n) [answered Y; input not from terminal]
Tags: patch
Attached Files: combined.patch (13,043 bytes) 2013-10-06 13:44

Notes
There are no notes attached to this issue.

Issue History
Date Modified     Username  Field / Change
2013-10-06 13:44  edwin     New Issue
2013-10-06 13:44  edwin     File Added: combined.patch
2014-02-19 15:34  doligez   Tag Attached: patch

