segfault on windows in dynlink'd module's entry routine #7603

Closed
vicuna opened this issue Aug 8, 2017 · 16 comments

vicuna commented Aug 8, 2017

Original bug ID: 7603
Reporter: dwight.guth
Assigned to: @alainfrisch
Status: resolved (set by @alainfrisch on 2017-10-04T14:15:09Z)
Resolution: fixed
Priority: normal
Severity: crash
Platform: Windows Cygwin
Version: 4.04.2
Target version: 4.06.0 +dev/beta1/beta2/rc1
Category: platform support (windows, cross-compilation, etc)

Bug description

I have an OCaml program that I am trying to port to Windows (using the Cygwin version of OCaml), and when it executes, it segfaults. I haven't been able to completely track down the cause of the error, but below is the stack trace in gdb. I have been able to determine that adjusting the inlining settings can suppress the bug, so I suspect that the issue is somewhere in the middle-end. I tried reducing the program further, but the bug seems to arise only once the program reaches a certain level of complexity, so unfortunately the steps to reproduce the issue are a little involved.

#0 0x62c7e1c8 in camlRealdef__entry () from /home/dwightguth/rv-match/c-semantics/semantics/x86-gcc-limited-libc/cpp14-translation-kompiled/cpp14-translation-kompiled/program/realdef.cmxs
#1 0x0051d8e9 in caml_start_program ()
#2 0x00519617 in caml_callback ()
#3 0x0051d520 in caml_natdynlink_run ()
#4 0x0049ad25 in camlDynlink__fun_3854 () at dynlink.mlopt:176
#5 0x004a3739 in camlList__iter_1258 () at list.ml:77
#6 0x0049b285 in camlDynlink__loadunits_3569 () at dynlink.mlopt:179
#7 0x0049b315 in camlDynlink__load_3582 () at dynlink.mlopt:185
#8 0x00404f19 in camlPlugin__load_2861 ()
#9 0x00404e5b in camlInterpreter__entry ()
#10 0x00401fd0 in caml_startup.code_begin ()
#11 0x0051d8e9 in caml_start_program ()
#12 0x00506ef5 in caml_main ()
#13 0x0051e599 in main ()

Steps to reproduce

  1. Install OCaml on Cygwin with the ocamlfind, zarith, and mlgmp packages.

$ tar xvf program.tar.gz
$ cd program
$ make
$ ./interpreter.exe ./realdef.cma

This should complete successfully, but instead it triggers a segmentation fault.

File attachments

vicuna commented Aug 8, 2017

Comment author: @dra27

Which version of Cygwin are you using? (cygcheck -dc cygwin)

vicuna commented Aug 8, 2017

Comment author: dwight.guth

Cygwin Package Information
Package Version
cygwin 2.8.2-1

vicuna commented Aug 9, 2017

Comment author: @dra27

I'm able to reproduce this. The program doesn't segfault if everything is compiled with -g, and bytecode appears to work (regardless of -g).

vicuna commented Aug 9, 2017

Comment author: dwight.guth

For what it's worth, I think the fact that it doesn't reproduce with -g might be a red herring. I made the file a bit larger by uncommenting some of the code I had commented out (lines 23656-24377 of realdef.ml), and I was able to reproduce it both with and without -g. It seems to have some relation to the size of the generated code...

vicuna commented Sep 30, 2017

Comment author: @xavierleroy

Any chance of understanding this bug in the coming weeks? Should it be tentatively scheduled for 4.06 or not?

vicuna commented Oct 4, 2017

Comment author: @alainfrisch

I'm fighting a bit to install mlgmp. Is http://www-verimag.imag.fr/~monniaux/download/mlgmp_20120224.tar.gz the most up-to-date version?

vicuna commented Oct 4, 2017

Comment author: @alainfrisch

Ok, in case someone else needs to install mlgmp:

  • You need to "make clean" to remove build artefacts shipped in the source distribution.

  • Replace int32 -> int32_t, uint32 -> uint32_t in config.h.

  • To compile with 4.06 or trunk, add "-unsafe-string" to OCAMLFLAGS in Makefile.

  • "make install", then move the target directory into lib/ocaml/site-lib/gmp, and add a META file with these two lines:

archive(byte) = "gmp.cma"
archive(native) = "gmp.cmxa"

vicuna commented Oct 4, 2017

Comment author: @alainfrisch

I was able to reproduce the segfault.

vicuna commented Oct 4, 2017

Comment author: @alainfrisch

Observation: moving the code out of the Def module to the toplevel in realdef.ml apparently eliminates the segfault.

vicuna commented Oct 4, 2017

Comment author: @alainfrisch

To simplify further investigation, I've uploaded program2.tar.gz, which depends only on a plain distribution of OCaml (not findlib, zarith, gmp). Also, in this version of the reproduction case, the main program is trivial and all the code is in the dynlinked code. Interestingly, inlining the content of prelude.ml into realdef.ml eliminates the segfault.

vicuna commented Oct 4, 2017

Comment author: @alainfrisch

Reproduced with the mingw port (32-bit).

vicuna commented Oct 4, 2017

Comment author: @alainfrisch

Ah, with the msvc port (32-bit), I get:

ocamlopt -w -A -shared -o realdef.cmxs constants.ml prelude.ml realdef.ml
dyndll944c5f.obj : fatal error LNK1183: invalid or corrupt file: extended relocation count 21341 less than 65535
** Fatal error: Error during linking

vicuna commented Oct 4, 2017

Comment author: @alainfrisch

I think I got it: this is a problem in flexdll. The rewriting performed by flexdll reduces the number of relocations from above 65535 to below this limit, but the corresponding flag that indicates more than 65535 relocations was not reset.
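
The flag in question can be illustrated with a small model. In the PE/COFF object format, a section header stores its relocation count in a 16-bit field; when the count does not fit, the writer sets IMAGE_SCN_LNK_NRELOC_OVFL, stores 0xFFFF in the field, and puts the real count in the VirtualAddress field of the first relocation entry. The OCaml sketch below is not flexdll's code: the record type and the function name encode_reloc_count are made up for illustration, and it ignores details such as whether the extra first entry is itself counted. It only shows that the flag has to be recomputed from the final count rather than copied from the input section.

```ocaml
(* Hypothetical model of the relocation-count fields of a COFF section.
   None of these names come from flexdll. *)

(* IMAGE_SCN_LNK_NRELOC_OVFL, as defined in the PE/COFF specification. *)
let lnk_nreloc_ovfl = 0x01000000l

type reloc_fields = {
  characteristics : int32;       (* section flags *)
  number_of_relocations : int;   (* value of the 16-bit header field *)
  extended_count : int option;   (* real count, kept in the first relocation
                                    entry when the 16-bit field overflows *)
}

(* Derive all three fields from the actual number of relocations.
   The overflow flag is set or cleared explicitly, so a stale flag
   inherited from the input section cannot survive. *)
let encode_reloc_count ~characteristics count =
  if count < 0 then invalid_arg "encode_reloc_count: negative count";
  if count < 0xFFFF then
    { characteristics =
        Int32.logand characteristics (Int32.lognot lnk_nreloc_ovfl);
      number_of_relocations = count;
      extended_count = None }
  else
    { characteristics = Int32.logor characteristics lnk_nreloc_ovfl;
      number_of_relocations = 0xFFFF;
      extended_count = Some count }
```

With a section whose input flags still carry the overflow bit but whose count has dropped to 21341 (the figure from the link.exe message), the sketch clears the bit and stores the count directly in the header field.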

vicuna commented Oct 4, 2017

Comment author: @alainfrisch

Fixed on the flexdll side by:

ocaml/flexdll@6f7fc35

No change required on the ocaml side.

For once, link.exe was more helpful than ld, which happily accepted the invalid COFF file generated by flexdll.
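
The kind of consistency check that link.exe appears to perform can be sketched as follows. According to the PE/COFF specification, it is an error for IMAGE_SCN_LNK_NRELOC_OVFL to be set on a section with fewer than 0xFFFF relocations. The function below is a hypothetical validator, not the logic of either linker: its name, its parameters, and the first error message are assumptions, and only the second message is modelled on the LNK1183 output quoted earlier in this thread.

```ocaml
(* IMAGE_SCN_LNK_NRELOC_OVFL, as defined in the PE/COFF specification. *)
let lnk_nreloc_ovfl = 0x01000000l

(* Hypothetical check of a section's relocation count.
   [first_reloc_vaddr] is the VirtualAddress field of the first relocation
   entry, which holds the real count when the overflow flag is set.
   Returns [Ok n] with the effective count, or [Error msg]. *)
let check_reloc_count ~characteristics ~number_of_relocations
    ~first_reloc_vaddr =
  let overflow = Int32.logand characteristics lnk_nreloc_ovfl <> 0l in
  if not overflow then Ok number_of_relocations
  else if number_of_relocations <> 0xFFFF then
    (* Assumption: treated as an error here; real linkers may differ. *)
    Error "overflow flag set but header count is not 0xFFFF"
  else if first_reloc_vaddr < 0xFFFF then
    Error
      (Printf.sprintf "extended relocation count %d less than 65535"
         first_reloc_vaddr)
  else Ok first_reloc_vaddr
```

A linker that skips this check, as ld apparently did here, goes on to consume a relocation table that does not match the header, which is consistent with a crash appearing only at load time.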

@vicuna vicuna closed this as completed Oct 4, 2017

vicuna commented Oct 4, 2017

Comment author: @dra27

Is it worth adding the testcase to prevent regressions?

vicuna commented Oct 4, 2017

Comment author: @alainfrisch

It's really a large piece of code, and flexdll doesn't currently have any such non-regression test (not that it would be useless to have some). As for OCaml's testsuite, I don't think that would be the place to put it (esp. since using any currently released version of flexdll would trigger the bug).

@vicuna vicuna added this to the 4.06.0 milestone Mar 14, 2019
@vicuna vicuna added the bug label Mar 20, 2019