Remotely debugging multi-threaded bytecode program causes segmentation fault. #4538

Closed
vicuna opened this issue Apr 23, 2008 · 6 comments

Comments

@vicuna

vicuna commented Apr 23, 2008

Original bug ID: 4538
Reporter: jsk
Assigned to: @damiendoligez
Status: assigned (set by @mshinwell on 2017-03-10T10:13:03Z)
Resolution: open
Priority: normal
Severity: crash
Version: 3.10.2
Target version: later
Category: tools (ocaml{lex,yacc,dep,debug,...})
Monitored by: ogasawara jsk

Bug description

=== Operating System ===

Type: Ubuntu Linux
Version: 7.10
Platform: x86

=== OCaml Installation ===

Version: 3.10.2
Build: x86

=== Summary of Fault ===

When the attached program is compiled as a bytecode executable and debugged
remotely, it terminates with a segmentation fault while being stepped through
manually in the debugger.

=== Reproduction Steps ===

  1. Compile the attached program with ocamlc:

    ocamlc -custom -thread -g unix.cma threads.cma test.ml -o test.exe

  2. Start ocamldebug in remote debugging mode (manual loading):

    ocamldebug -s test.exe
    set loadingmode manual
    goto 0

  3. Manually start the program (possibly on another machine):

    CAML_DEBUG_SOCKET=<socket_name> ./test.exe

  4. Repeatedly issue the following command to ocamldebug:

    step

=== Fault Description ===

After a number of steps that varies from run to run, the executable
(test.exe) terminates with a "Segmentation Fault (core dumped)" message
(see the attached core dump files). Analysis of the core dumps with gdb
invariably points to the following location:

Program terminated with signal 11, Segmentation fault.
#0 0x08073570 in caml_interprete (prog=0x80ad1e8, prog_size=48680) at interp.c:284
284 curr_instr = *pc++;
(gdb) backtrace
#0 0x08073570 in caml_interprete (prog=0x80ad1e8, prog_size=48680) at interp.c:284
#1 0x0805bd3f in caml_main (argv=0xbf83daa4) at startup.c:414
#2 0x0805befb in main (argc=1, argv=0xbf83daa4) at main.c:56

File attachments
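
The attached program is not reproduced on this page. As a rough stand-in, here is a minimal sketch of a program of the general shape the thread describes: several threads that print inside a Mutex.lock / Mutex.unlock section. Everything in it is an assumption made for illustration (the names `m`, `printed`, `worker` and `run`, the thread and iteration counts, and the exact text printed); it is not the reporter's test.ml and has not been checked against the crash.

```ocaml
(* Hypothetical stand-in for the missing attachment; NOT the reporter's code.
   It can be built with flags like those in the reproduction steps, e.g.
     ocamlc -thread -g unix.cma threads.cma sketch.ml -o sketch.exe
   (newer compilers may need different flags). *)

let m = Mutex.create ()   (* guards stdout and the counter below *)
let printed = ref 0       (* number of lines printed so far *)

(* Each worker prints one line per iteration while holding the mutex. *)
let worker iterations =
  for i = 1 to iterations do
    Mutex.lock m;
    (* A pair of print_string calls, as mentioned later in the thread. *)
    print_string (Printf.sprintf "Thread %d " (Thread.id (Thread.self ())));
    print_string (Printf.sprintf "Iteration %d\n" i);
    incr printed;
    Mutex.unlock m;
    Thread.yield ()       (* give the other threads a chance to run *)
  done

(* Start the workers and wait for all of them to finish. *)
let run ~threads ~iterations =
  let ts = List.init threads (fun _ -> Thread.create worker iterations) in
  List.iter Thread.join ts

let () =
  run ~threads:3 ~iterations:5;
  Printf.printf "%d lines printed\n" !printed
```

Run normally, this should simply print fifteen lines in some interleaving; the point of the report is what happens when such a program is single-stepped under ocamldebug.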

@vicuna

vicuna commented Apr 23, 2008

Comment author: jsk

Some further info:

By modifying the lines between "Mutex.lock" and "Mutex.unlock", it's possible to coerce ocamlrun into producing various errors (other than segmentation faults).

For example, replace the pair of print_string statements with the following statement:

Printf.printf "Thread %n Iteration %n"
    (Thread.id (Thread.self ())) i;

On my system, with this change, the segmentation fault goes away, only to be replaced by the following error:

Fatal error: bad opcode (65726854)
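
One possible reading of this number, offered as an assumption rather than a confirmed diagnosis: if the runtime prints the offending opcode in hexadecimal and the machine is little-endian (x86, as reported above), then the word 65726854 corresponds to the ASCII bytes "Thre". That would match the start of the format string "Thread %n Iteration %n" and would suggest that the interpreter's program counter had ended up pointing into string data. The sketch below only performs the byte decoding; `decode_le_word` is a hypothetical helper name.

```ocaml
(* Decode a 32-bit word, given as a hexadecimal string, into its four bytes
   in little-endian order (lowest byte first). Printable ASCII bytes are
   shown as characters, anything else as '.'.
   Assumptions: the input is hex, and a 64-bit OCaml is used so the word
   fits in a native int. *)
let decode_le_word (hex : string) : string =
  let w = int_of_string ("0x" ^ hex) in
  String.init 4 (fun i ->
    let byte = (w lsr (8 * i)) land 0xff in
    if byte >= 0x20 && byte < 0x7f then Char.chr byte else '.')

let () = print_endline (decode_le_word "65726854")
(* prints "Thre" *)
```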

Cheers

Jonathan


Jonathan Knowles
Citrix Systems Research & Development

@vicuna

vicuna commented Aug 4, 2008

Comment author: @damiendoligez

Reproduced with 3.11+dev14 on Mac OS X 10.5.4.

@vicuna

vicuna commented Apr 20, 2010

Comment author: @damiendoligez

The bug is still present in 3.11.2 and 3.12.0+dev17 (Mac OS X 10.6.3).

@vicuna

vicuna commented Jul 6, 2012

Comment author: @damiendoligez

Note that the debugger uses the fork() system call to checkpoint the process. The interactions between fork() and threads are tricky and platform-dependent, so it's not really a surprise that the debugger fails on a multi-threaded program. Worse, it's not clear that we can do anything about it.
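
To illustrate one of the fork()/thread interactions in question, here is a minimal sketch, assuming a POSIX system where fork() duplicates only the calling thread. A background thread keeps incrementing a counter; after Unix.fork, the child observes that the counter no longer moves, because the worker thread does not exist there. The names `counter`, `worker` and `counter_frozen_in_child` are hypothetical, and the sketch shows only this general fork() behaviour, not the ocamldebug failure itself.

```ocaml
(* Needs the unix and threads libraries. Illustrative only. *)

let counter = ref 0

(* Background thread: bump the counter roughly every 10 ms, forever. *)
let worker () =
  while true do
    incr counter;
    Thread.delay 0.01
  done

(* Returns true if, in a forked child, the counter stayed frozen, i.e. the
   worker thread was not carried over into the child process. *)
let counter_frozen_in_child () =
  ignore (Thread.create worker ());
  Thread.delay 0.1;                     (* let the worker run for a while *)
  match Unix.fork () with
  | 0 ->
    (* Child: only the thread that called fork exists here. *)
    let before = !counter in
    Thread.delay 0.2;
    exit (if !counter = before then 0 else 1)
  | pid ->
    (* Parent: the worker is still running; collect the child's verdict. *)
    (match Unix.waitpid [] pid with
     | _, Unix.WEXITED 0 -> true
     | _ -> false)

let () =
  if counter_frozen_in_child ()
  then print_endline "child: worker thread is gone after fork"
  else print_endline "child: counter still advancing (unexpected)"
```

Any state the missing threads were in the middle of updating is simply left as it was at the moment of the fork, which is one reason checkpointing a multi-threaded process this way is fragile.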

@vicuna

vicuna commented Mar 10, 2017

Comment author: @mshinwell

If ocamldebug is not supposed to work on threaded programs, can we add that to the manual (or add some warning in the code), and then close this rather old issue?

@github-actions

This issue has been open one year with no activity. Consequently, it is being marked with the "stale" label. What this means is that the issue will be automatically closed in 30 days unless more comments are added or the "stale" label is removed. Comments that provide new information on the issue are especially welcome: is it still reproducible? did it appear in other contexts? how critical is it? etc.
