make world.opt seems to crash on tip of trunk on up-to-date OS X #6239
Comments
Comment author: @alainfrisch Can you check that the version you tried is after revision 14294 from the upstream SVN? (i.e. check that asmrun/fail.c includes "callback.h") |
Comment author: @mshinwell Also, which version of Mac OS X is this? |
Comment author: @johnwhitington SVN 14302 builds fine on OS X 10.9 with the latest XCode. |
Comment author: yminsky I'm building with the latest xcode on 10.9. And the same box can build older versions, e.g., I built 4.00.1 on the same box after the build of trunk failed. I'm not sure what extra debug info would be helpful for tracking this down. It's clearly not an issue with all os x builds. |
Comment author: yminsky And I can confirm for Alain that it was exactly 14294 that I built. |
Comment author: @mshinwell I'm going to look at this on yminsky's machine. |
Comment author: @alainfrisch This should be fixed by commit 14307. |
Comment author: yminsky Trying the latest version (14307), I now get it to fail in a different place: ocaml-trunk $ lldb -- /Users/yminsky/Documents/code/ocaml-trunk/ocamlc.opt -nostdlib -I ../../stdlib -c -w +33..39 -warn-error A -g -nolabels unix.mli
|
Comment author: @mshinwell I'm trying to reproduce this now... |
Comment author: @mshinwell This is a horrid one. I couldn't reproduce it but then realized what's wrong: it's faulting because %rbp isn't 16-byte aligned on that 128-bit move in [large_malloc]. So it looks like this is very similar to mantis 5700. C functions have to be entered with %rsp mod 16 = 8. I have to go now, and I haven't yet identified exactly where this rule is being broken, but it should be enough for you (Alain!) to go on. My suspicion is that the assembly code of [caml_raise_exn] (and perhaps [caml_reraise_exn] in some cases) is being called with the wrong stack alignment. |
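The alignment rule above can be checked against the register dump in the bug description. The following is only a sketch: the helper name `mod16` is made up for illustration, and the two addresses are the `rbp` and `rsp` values that lldb printed for frame #4 (caml_stat_alloc), not for the faulting frame. If C functions are entered with %rsp mod 16 = 8 (the return address having just been pushed), then %rsp at a call site should be 0 mod 16, so a residue of 8 there would be consistent with the misalignment described in this comment.

```ocaml
(* Residue of an address modulo 16; 0 means the address is 16-byte aligned.
   [mod16] is a hypothetical helper, not part of the OCaml runtime. *)
let mod16 (addr : int64) : int = Int64.to_int (Int64.logand addr 15L)

(* Register values copied from the lldb dump for frame #4 (caml_stat_alloc). *)
let rbp = 0x00007fff5fbff5a8L
let rsp = 0x00007fff5fbff598L

let () =
  Printf.printf "rbp mod 16 = %d\n" (mod16 rbp);
  Printf.printf "rsp mod 16 = %d\n" (mod16 rsp)
```

Both residues being non-zero is what makes an aligned 128-bit store such as `movaps` relative to a frame pointer derived from this stack fault further down the call chain.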
Comment author: @mshinwell I haven't managed to reproduce this yet. If anyone can reproduce it, please let me know. I expect to be able to get access to yminsky's machine in a couple of weeks. |
Comment author: @avsm I've successfully built trunk (r14390, remove camlp4) on OS X 10.9 and passed all tests with this gcc: $ gcc -v I've also tried a build with various Malloc options enabled to see if that would make a difference, but it hasn't. Yaron, how much memory do you have in your laptop (mine is 8GB, so I should be in high memory too)? $ env MallocScribble=1 MallocPreScribble=1 MallocGuardEdges=1 make world.opt Not sure what else to try to reproduce this one. |
Comment author: @damiendoligez I can't reproduce this problem on any of my Macs. I'm on r14310. |
Comment author: @damiendoligez XL found the bug. Xavier will explain the bug and post a patch soon. |
Comment author: @xavierleroy Consider: let f x = raise x Compile this with ocamlopt -g, and you'll see that the stack (initially = 8 mod 16) is not realigned to 0 mod 16 before calling caml_raise_exn. Why? because ocamlopt treats this function as a leaf function (!Proc.contains_calls = false) which does not need allocation of a proper stack frame. The criteria for a leaf function are pretty strict: it should
Can you spot the missing case? Yes, there is one: if the function contains a "raise" and is compiled with -g, a call to a C function (caml_stash_backtrace) can occur, so it must not be a leaf function. This issue has been with us for a long time, but I believe it shows up only now because of Alain's recent optimization of constant exceptions. Before, raising such an exception would always allocate, causing the enclosing function to lose its leaf status. Now, we have more cases of useful functions that raise exceptions but don't allocate. The fix is pretty simple: set Proc.contains_calls to true if the function contains a "raise" (not of the "notrace" kind) and is compiled with -g. This fix is committed on SVN trunk, r14316, and a patch is attached. Please let us know if this fixes the crash; then, I'll port it to the 4.01 branch. |
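To make the rule in this comment concrete, here is a toy model of the leaf-function decision. It is a sketch under stated assumptions, not the compiler's code: the types `instr` and `raise_kind` and the functions `needs_frame_before_fix` and `needs_frame` are hypothetical names invented for illustration, and the list of disqualifying instructions is simplified to the cases mentioned in the thread (calls and allocation). The only point it illustrates is the added case: a traced "raise" compiled with -g must count as a call.

```ocaml
(* Hypothetical, simplified instruction set; not the ocamlopt IR. *)
type raise_kind = Raise_regular | Raise_notrace

type instr =
  | Arith                 (* anything that never calls out *)
  | Call                  (* call to an OCaml or C function *)
  | Alloc                 (* heap allocation, may call the GC *)
  | Raise of raise_kind

(* Behaviour as described before the fix: a raise never costs leaf status. *)
let needs_frame_before_fix instrs =
  List.exists (function Call | Alloc -> true | Arith | Raise _ -> false) instrs

(* Behaviour as described after the fix: with -g ([debug] = true), a traced
   raise may call caml_stash_backtrace, so the function is not a leaf. *)
let needs_frame ~debug instrs =
  List.exists
    (function
      | Call | Alloc -> true
      | Raise Raise_regular -> debug
      | Raise Raise_notrace | Arith -> false)
    instrs

(* Model of the body of [let f x = raise x]. *)
let f_body = [ Raise Raise_regular ]

let () =
  Printf.printf "before fix: %b\n" (needs_frame_before_fix f_body);
  Printf.printf "after fix, -g: %b\n" (needs_frame ~debug:true f_body);
  Printf.printf "after fix, no -g: %b\n" (needs_frame ~debug:false f_body)
```

In this model the function keeps its leaf status when compiled without -g or when the raise is of the "notrace" kind, which matches the condition stated for setting Proc.contains_calls.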
Comment author: @avsm Confirmed the crash and the fix on OSX 10.9 and 4.02.0dev+trunk. |
Comment author: @johnwhitington Patch as applied in 14316 fixes the crash here. |
Comment author: @damiendoligez Confirmed the fix on OSX 10.7 with Xcode 4.6.3. |
Comment author: @xavierleroy Thanks for the confirmations. Fix also applied to 4.01 bugfix branch, r14320. Marking this PR as resolved. |
Original bug ID: 6239
Reporter: yminsky
Status: closed (set by @xavierleroy on 2015-12-11T18:25:21Z)
Resolution: fixed
Priority: normal
Severity: major
Fixed in version: 4.01.1+dev
Category: ~DO NOT USE (was: OCaml general)
Bug description
I don't know if others can reproduce this, but on my Mac, trunk segfaults when you try to build world.opt. Here's the GitHub commit id of the version I tried.
df7e6c1
Here's the error message I got:
../boot/ocamlrun ../ocamlopt -nostdlib -I ../stdlib -I ../utils -I ../parsing -I ../typing -I ../bytecomp -I ../asmcomp -I ../driver -I ../toplevel -o read_cmt.opt ../utils/misc.cmx ../utils/warnings.cmx ../utils/tbl.cmx ../utils/consistbl.cmx ../utils/config.cmx ../utils/clflags.cmx ../parsing/location.cmx ../parsing/longident.cmx ../parsing/lexer.cmx ../parsing/pprintast.cmx ../parsing/ast_helper.cmx ../parsing/ast_mapper.cmx ../typing/ident.cmx ../typing/path.cmx ../typing/types.cmx ../typing/typedtree.cmx ../typing/btype.cmx ../typing/subst.cmx ../typing/predef.cmx ../typing/datarepr.cmx ../typing/cmi_format.cmx ../typing/env.cmx ../typing/ctype.cmx ../typing/oprint.cmx ../typing/primitive.cmx ../typing/printtyp.cmx ../typing/mtype.cmx ../typing/envaux.cmx ../typing/typedtreeMap.cmx ../typing/typedtreeIter.cmx ../typing/cmt_format.cmx ../typing/stypes.cmx untypeast.cmx tast_iter.cmx cmt2annot.cmx read_cmt.cmx
cd ocamldoc && /Applications/Xcode.app/Contents/Developer/usr/bin/make opt.opt
/Users/yminsky/Documents/code/ocaml-trunk/ocamlopt.opt -nostdlib -I ../stdlib -pp ./remove_DEBUG -I ../parsing -I ../utils -I ../typing -I ../driver -I ../bytecomp -I ../tools -I ../toplevel/ -I ../stdlib -I ../otherlibs/str -I ../otherlibs/dynlink -I ../otherlibs/unix -I ../otherlibs/num -I ../otherlibs/graph -warn-error A -c odoc_config.ml
/bin/sh: line 1: 73228 Segmentation fault: 11 ${CAMLOPT_BIN} -nostdlib -I ../stdlib -pp './remove_DEBUG' -I ../parsing -I ../utils -I ../typing -I ../driver -I ../bytecomp -I ../tools -I ../toplevel/ -I ../stdlib -I ../otherlibs/str -I ../otherlibs/dynlink -I ../otherlibs/unix -I ../otherlibs/num -I ../otherlibs/graph -warn-error A -c odoc_config.ml
make[3]: *** [odoc_config.cmx] Error 139
make[2]: *** [ocamldoc.opt] Error 2
make[1]: *** [opt.opt] Error 2
make: *** [world.opt] Error 2
Steps to reproduce
I've attached the log of the build, as well as some stack traces from re-running the failing command under lldb.
Additional information
ocaml-trunk $ lldb -- /Users/yminsky/Documents/code/ocaml-trunk/ocamlopt.opt -nostdlib -I ../stdlib -pp ./remove_DEBUG -I ../parsing -I ../utils -I ../typing -I ../driver -I ../bytecomp -I ../tools -I ../toplevel/ -I ../stdlib -I ../otherlibs/str -I ../otherlibs/dynlink -I ../otherlibs/unix -I ../otherlibs/num -I ../otherlibs/graph -warn-error A -c odoc_config.ml
Current executable set to '/Users/yminsky/Documents/code/ocaml-trunk/ocamlopt.opt' (x86_64).
(lldb) run
Process 73259 launched: '/Users/yminsky/Documents/code/ocaml-trunk/ocamlopt.opt' (x86_64)
Process 73259 stopped
large_malloc + 50, queue = 'com.apple.main-thread, stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT)
frame #0: 0x00007fff89c30d49 libsystem_malloc.dylib`large_malloc + 50
libsystem_malloc.dylib`large_malloc + 50:
-> 0x7fff89c30d49: movaps %xmm0, -64(%rbp)
0x7fff89c30d4d: cmoveq %r13, %r14
0x7fff89c30d51: shlq %cl, %r14
0x7fff89c30d54: cmpq $134217727, %r14
(lldb) bt
large_malloc + 50, queue = 'com.apple.main-thread, stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT)
frame #0: 0x00007fff89c30d49 libsystem_malloc.dylib`large_malloc + 50
frame #1: 0x00007fff89c363b6 libsystem_malloc.dylib`szone_malloc_should_clear + 287
frame #2: 0x00007fff89c3887c libsystem_malloc.dylib`malloc_zone_malloc + 71
frame #3: 0x00007fff89c39290 libsystem_malloc.dylib`malloc + 42
frame #4: 0x00000001001c7f6e ocamlopt.opt`caml_stat_alloc + 14
frame #5: 0x00000001001c3f99 ocamlopt.opt`caml_init_frame_descriptors + 185
frame #6: 0x00000001001d9e03 ocamlopt.opt`caml_next_frame_descriptor + 35
frame #7: 0x00000001001d9efd ocamlopt.opt`caml_stash_backtrace + 93
frame #8: 0x00000001001da95e ocamlopt.opt`caml_raise_exn + 54
frame #9: 0x000000010019d251 ocamlopt.opt`.L200 + 13
(lldb) frame select 4
frame #4: 0x00000001001c7f6e ocamlopt.opt`caml_stat_alloc + 14
ocamlopt.opt`caml_stat_alloc + 14:
-> 0x1001c7f6e: testq %rax, %rax
0x1001c7f71: je 0x1001c7f7a ; caml_stat_alloc + 26
0x1001c7f73: addq $8, %rsp
0x1001c7f77: popq %rbx
(lldb) register read
General Purpose Registers:
rbx = 0x0000000000080000
rbp = 0x00007fff5fbff5a8
rsp = 0x00007fff5fbff598
r12 = 0x00007fff5fbff568
r13 = 0x00000001001da921 ocamlopt.opt`caml_start_program + 165
r14 = 0x0000000000010000
r15 = 0x00000001010009a0
rip = 0x00000001001c7f6e ocamlopt.opt`caml_stat_alloc + 14
13 registers were unavailable.
File attachments