Mantis Bug Tracker

View Issue Details Jump to Notes ] Issue History ] Print ]
IDProjectCategoryView StatusDate SubmittedLast Update
0007609OCamltools (ocaml{lex,yacc,dep,debug,...})public2017-08-18 18:222017-09-23 18:12
Reporterpsteckler 
Assigned Toxleroy 
PrioritynormalSeveritymajorReproducibilityalways
StatusresolvedResolutionfixed 
PlatformPCOSLinux Mint x64 (Cinnamon)OS Version18
Product Version4.04.0 
Target Version4.06.0 +dev/beta1/beta2/rc1Fixed in Version4.06.0 +dev/beta1/beta2/rc1 
Summary0007609: Coq (coqtop.byte) hangs when run from ocamldebug
DescriptionA recent change in the Coq sources results in coqtop.byte to hang when run from ocamldebug. The startup banner appears, then ... nothing. For me, this behavior happens 100% of the time. Running coqtop.byte from the command line works fine.

The Coq sources are (of course) at:

  https://github.com/coq/coq [^]

A recent commit that shows the behavior is 16b0b83. A git bisect reveals that the behavior started with commit c3ca30ca41.

I provided a (yet-unmerged) PR to restore the old behavior at https://github.com/coq/coq/pull/977. [^]

All that the PR code does is run a Gc.minor and Gc.allocated_bytes at the toplevel of an OCaml module. Why that makes a difference isn't clear.

Maybe this is a bug in Coq (perhaps some C code is corrupting the heap in a way that a collection masks), but maybe it's something about how ocamldebug runs the byte code.

Perhaps the OCaml team has some insight.


Steps To ReproduceOn Linux x64, clone the Coq repo (probably the bug shows up using the tip, but you could reset to the commit mentioned above).

In the coq directory:

 ./configure -local
 make -j && make -j byte
 ocamldebug ./bin/coqtop.byte -coqlib `pwd`

If the bug is present, you'll see a startup banner, and the prompt will not appear.
TagsNo tags attached.
Attached Files

- Relationships

-  Notes
(0018188)
psteckler (reporter)
2017-08-18 21:36

I tried with OCaml 4.05.0 from OPAM, and I get the same hanging behavior.

Note: In my shell, I have "OCAMLRUNPARAM=b". After filing this bug report, I discovered that if I unset this environment variable, the hanging problem disappears.
(0018298)
xleroy (administrator)
2017-09-20 17:08

I was able to reproduce the hanging behavior with 4.05.0. ocamlrun loops in caml_flush because the channel is ill-formed and has channel->curr < channel->buff, hence caml_flush_partial makes no progress and returns false. Smells like memory corruption. However the issue is very hard to reproduce with ocamlrund running under gdb.
(0018299)
psteckler (reporter)
2017-09-20 18:47

Are there any experiments we can run to learn where the heap becomes corrupted?
(0018300)
xleroy (administrator)
2017-09-20 19:18

I was able to reproduce with ocamlrund under gdb. It's a use-after-free of the "dbg_out" channel, where the free() is caused by finalization after Pervasives.flush_all is called. I'll come up with a fix in a few days.
(0018303)
xleroy (administrator)
2017-09-21 09:30

The full scenario for the memory corruption:

- Debugged program starts, opens an out channel "dbg_out" to send data to ocamldebug via the debugging socket.
- Channel is created by caml_open_descriptor_in with refcount = 0 and is inserted in the doubly-linked list caml_all_opened_channels.
- Later, debugged program calls Pervasives.flush_all, which calls caml_ml_out_channels_list.
- caml_ml_out_channels_list wraps dbg_out in a Caml custom object of type "out_channel". dbg_out->refcount is increased to 1.
- Later, a GC finalizes this Caml custom object. dbg_out->refcount is decreased to 0 and the dbg_out structure is freed via caml_stat_free().
- Subsequent interactions with ocamldebug access the now-freed dbg_out structure. Disaster ensues.
(0018305)
xleroy (administrator)
2017-09-21 09:42

Fix is on its way: https://github.com/ocaml/ocaml/pull/1361 [^]
(0018312)
xleroy (administrator)
2017-09-23 18:12

Fixed in the forthcoming 4.06 release and in trunk.

- Issue History
Date Modified Username Field Change
2017-08-18 18:22 psteckler New Issue
2017-08-18 21:36 psteckler Note Added: 0018188
2017-09-20 17:08 xleroy Note Added: 0018298
2017-09-20 17:08 xleroy Status new => confirmed
2017-09-20 17:08 xleroy Target Version => 4.06.0 +dev/beta1/beta2/rc1
2017-09-20 18:47 psteckler Note Added: 0018299
2017-09-20 19:18 xleroy Note Added: 0018300
2017-09-20 19:18 xleroy Assigned To => xleroy
2017-09-20 19:18 xleroy Status confirmed => assigned
2017-09-21 09:30 xleroy Note Added: 0018303
2017-09-21 09:42 xleroy Note Added: 0018305
2017-09-23 18:12 xleroy Note Added: 0018312
2017-09-23 18:12 xleroy Status assigned => resolved
2017-09-23 18:12 xleroy Resolution open => fixed
2017-09-23 18:12 xleroy Fixed in Version => 4.06.0 +dev/beta1/beta2/rc1


Copyright © 2000 - 2011 MantisBT Group
Powered by Mantis Bugtracker