New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Illegal instruction with ocamlopt on Mac Intel #4348
Comments
Comment author: jabial TO REPRODUCE:
|
Comment author: jabial It is stack misalignment. |
Comment author: jabial WORKAROUND: set environment DYLD_BIND_AT_LAUNCH prior to running |
Comment author: jabial I FOUND THE CULPRIT! (WRONG PATCH REMOVED) |
Comment author: jabial Oops. Very obviously this has the risk of corrupting the stack, since it is stored in reverse order. Correct patch is given below (OLDER VERSION REMOVED). |
Comment author: jabial THIS WAS A TEMPORARY VERSION OF THE PATCH AND IS NOW SUPERSEDED BY THE .PATCH ATTACHMENT NOT! Final version of the patch, which changes nothing to non-macintels (WARNING: this patch is for the .11+dev version, I haven't tested it with .10): --- otherlibs/systhreads/posix.c 29 Jan 2007 12:11:17 -0000 1.55 static void caml_io_mutex_unlock_exn(void)
|
Comment author: jabial UPDATE : patch now tested with release version .10 (which has the same posix.c file as the dev version) ; it works. I think the issue fixed is important enough that a minor revision should be created for the sake of the Mac crowd. |
Comment author: jabial OK, this is just symptomatic treatment. The real problem is in assembly code generation ; stack should be realigned before or after jumping to caml_ml_array_bound_error, I'm looking into it. EDIT: no, it should not, because since those functions never return, it would never be restored. I came back to doing it in caml_io_mutex_unlock_exn, but since I couldn't put the finger on the source of the misalignment, I made a patch (uploaded a few days ago) that works whatever the misalignment. |
Comment author: jabial Attached the real patch, that will work whatever the misalignment. Optimized for speed not space : no test here, so it will consume at least 8 bytes of stack space, and even 16 if the stack is not misaligned. WORKS WITH 3.10, 3.10+dev, 3.11+dev |
Comment author: jabial For those who want to see the patch without downloading: Index: otherlibs/systhreads/posix.cRCS file: /caml/ocaml/otherlibs/systhreads/posix.c,v static void caml_io_mutex_unlock_exn(void)
+#endif
+#endif /* The "tick" thread fakes a SIGVTALRM signal at regular intervals. */ |
Comment author: @xavierleroy The patch is too fragile, it depends a lot on gcc's code generation conventions. Do you confirm that the problem was tracked down to the stack being unaligned when caml_array_bound_error is called? If so, I'd rather put the realignment code in caml_ml_array_bound_error. |
Comment author: jabial The new patch does not in my opinion depend on gcc conventions, because it calculates the stack misalignment itself. The only "fragile" feature is the fact that the space for the "struct channel *" is allocated BEFORE (and not after) the inline assembly is inserted. This could easily be solved by putting "struct channel *chan;" before the inline assembly and "chan = ..." after it. The fact that gcc preserves stack alignements, when it is properly aligned, guarantess that from this point on there will not be any problem. Since caml_array_bound_error calls caml_io_mutex_unlock_exn through caml_raise that will not return, how would you restore the stack afterwards? Still, I agree that it would be better to correct stack misalignment at exception raising, but that would require restoring the unaligned stack at exiting the exception handler, be it through a user-supplied continuation (with in a try/with block) or failing to the default one that terminates the program properly, or else the program with undoubtfully segfault when trying to get something from the stack. |
Comment author: @xavierleroy Fixed in 3.10 development branch. (Re-align the stack in caml_ml_array_bound_error before calling caml_array_bound_error, which never returns normally anyway.) |
Original bug ID: 4348
Reporter: jabial
Assigned to: @xavierleroy
Status: closed (set by @xavierleroy on 2007-10-09T13:33:31Z)
Resolution: fixed
Priority: normal
Severity: block
Version: 3.10.0
Category: ~DO NOT USE (was: OCaml general)
Monitored by: jabial n8gray
Bug description
I think it may be stack misalignment, but I'm not sure
.
The source code is attached (compile with threads with ocamlopt under mac intel then run to reproduce).
Here is gdb output:
$ gdb ./taskpool.native
GNU gdb 6.3.50-20050815 (Apple version gdb-573) (Fri Oct 20 15:50:43 GMT 2006)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-apple-darwin"...Reading symbols for shared libraries .. done
(gdb) r
Starting program: /Users/jabial/Documents/LIP6/svn/Papers/PPOPP2008/benchmark/misc/ocaml-threads-event/taskpool/taskpool.native
Reading symbols for shared libraries . done
Program received signal EXC_BAD_INSTRUCTION, Illegal instruction/operand.
0x8fe12f94 in __dyld_stub_binding_helper_interface ()
(gdb) bt
#0 0x8fe12f94 in __dyld_stub_binding_helper_interface ()
#1 0x00000000 in ?? ()
(gdb) disassemble
Dump of assembler code for function __dyld_stub_binding_helper_interface:
0x8fe12f82 <__dyld_stub_binding_helper_interface+0>: push %ebp
0x8fe12f83 <__dyld_stub_binding_helper_interface+1>: mov %esp,%ebp
0x8fe12f85 <__dyld_stub_binding_helper_interface+3>: sub $0x60,%esp
0x8fe12f88 <__dyld_stub_binding_helper_interface+6>: mov %eax,8(%esp)
0x8fe12f8c <__dyld_stub_binding_helper_interface+10>: mov %ecx,12(%esp)
0x8fe12f90 <__dyld_stub_binding_helper_interface+14>: mov %edx,16(%esp)
0x8fe12f94 <__dyld_stub_binding_helper_interface+18>: movdqa %xmm0,32(%esp)
0x8fe12f9a <__dyld_stub_binding_helper_interface+24>: movdqa %xmm1,48(%esp)
0x8fe12fa0 <__dyld_stub_binding_helper_interface+30>: movdqa %xmm2,64(%esp)
0x8fe12fa6 <__dyld_stub_binding_helper_interface+36>: movdqa %xmm3,80(%esp)
0x8fe12fac <__dyld_stub_binding_helper_interface+42>: mov 4(%ebp),%eax
0x8fe12faf <__dyld_stub_binding_helper_interface+45>: mov %eax,0(%esp)
0x8fe12fb3 <__dyld_stub_binding_helper_interface+49>: mov 8(%ebp),%eax
0x8fe12fb6 <__dyld_stub_binding_helper_interface+52>: mov %eax,4(%esp)
0x8fe12fba <__dyld_stub_binding_helper_interface+56>: call 0x8fe05276 <__dyld__ZN4dyld14bindLazySymbolEPK11mach_headerPm>
0x8fe12fbf <__dyld_stub_binding_helper_interface+61>: mov %eax,8(%ebp)
0x8fe12fc2 <__dyld_stub_binding_helper_interface+64>: movdqa 32(%esp),%xmm0
0x8fe12fc8 <__dyld_stub_binding_helper_interface+70>: movdqa 48(%esp),%xmm1
0x8fe12fce <__dyld_stub_binding_helper_interface+76>: movdqa 64(%esp),%xmm2
0x8fe12fd4 <__dyld_stub_binding_helper_interface+82>: movdqa 80(%esp),%xmm3
0x8fe12fda <__dyld_stub_binding_helper_interface+88>: mov 8(%esp),%eax
0x8fe12fde <__dyld_stub_binding_helper_interface+92>: mov 12(%esp),%ecx
0x8fe12fe2 <__dyld_stub_binding_helper_interface+96>: mov 16(%esp),%edx
0x8fe12fe6 <__dyld_stub_binding_helper_interface+100>: add $0x60,%esp
0x8fe12fe9 <__dyld_stub_binding_helper_interface+103>: pop %ebp
0x8fe12fea <__dyld_stub_binding_helper_interface+104>: add $0x4,%esp
0x8fe12fed <__dyld_stub_binding_helper_interface+107>: ret
0x8fe12fee <__dyld_stub_binding_helper_interface+108>: add %al,(%eax)
End of assembler dump.
File attachments
The text was updated successfully, but these errors were encountered: