Mantis Bug Tracker

View Issue Details Jump to Notes ] Issue History ] Print ]
IDProjectCategoryView StatusDate SubmittedLast Update
0004348OCamlOCaml generalpublic2007-07-19 00:082007-10-09 15:33
Reporterjabial 
Assigned Toxleroy 
PrioritynormalSeverityblockReproducibilityalways
StatusclosedResolutionfixed 
PlatformOSOS Version
Product Version3.10.0 
Target VersionFixed in Version 
Summary0004348: Illegal instruction with ocamlopt on Mac Intel
DescriptionI think it may be stack misalignment, but I'm not sure
.
The source code is attached (compile with threads with ocamlopt under mac intel then run to reproduce).

Here is gdb output:
$ gdb ./taskpool.native
GNU gdb 6.3.50-20050815 (Apple version gdb-573) (Fri Oct 20 15:50:43 GMT 2006)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-apple-darwin"...Reading symbols for shared libraries .. done

(gdb) r
Starting program: /Users/jabial/Documents/LIP6/svn/Papers/PPOPP2008/benchmark/misc/ocaml-threads-event/taskpool/taskpool.native
Reading symbols for shared libraries . done

Program received signal EXC_BAD_INSTRUCTION, Illegal instruction/operand.
0x8fe12f94 in __dyld_stub_binding_helper_interface ()
(gdb) bt
#0 0x8fe12f94 in __dyld_stub_binding_helper_interface ()
#1 0x00000000 in ?? ()
(gdb) disassemble
Dump of assembler code for function __dyld_stub_binding_helper_interface:
0x8fe12f82 <__dyld_stub_binding_helper_interface+0>: push %ebp
0x8fe12f83 <__dyld_stub_binding_helper_interface+1>: mov %esp,%ebp
0x8fe12f85 <__dyld_stub_binding_helper_interface+3>: sub $0x60,%esp
0x8fe12f88 <__dyld_stub_binding_helper_interface+6>: mov %eax,8(%esp)
0x8fe12f8c <__dyld_stub_binding_helper_interface+10>: mov %ecx,12(%esp)
0x8fe12f90 <__dyld_stub_binding_helper_interface+14>: mov %edx,16(%esp)
0x8fe12f94 <__dyld_stub_binding_helper_interface+18>: movdqa %xmm0,32(%esp)
0x8fe12f9a <__dyld_stub_binding_helper_interface+24>: movdqa %xmm1,48(%esp)
0x8fe12fa0 <__dyld_stub_binding_helper_interface+30>: movdqa %xmm2,64(%esp)
0x8fe12fa6 <__dyld_stub_binding_helper_interface+36>: movdqa %xmm3,80(%esp)
0x8fe12fac <__dyld_stub_binding_helper_interface+42>: mov 4(%ebp),%eax
0x8fe12faf <__dyld_stub_binding_helper_interface+45>: mov %eax,0(%esp)
0x8fe12fb3 <__dyld_stub_binding_helper_interface+49>: mov 8(%ebp),%eax
0x8fe12fb6 <__dyld_stub_binding_helper_interface+52>: mov %eax,4(%esp)
0x8fe12fba <__dyld_stub_binding_helper_interface+56>: call 0x8fe05276 <__dyld__ZN4dyld14bindLazySymbolEPK11mach_headerPm>
0x8fe12fbf <__dyld_stub_binding_helper_interface+61>: mov %eax,8(%ebp)
0x8fe12fc2 <__dyld_stub_binding_helper_interface+64>: movdqa 32(%esp),%xmm0
0x8fe12fc8 <__dyld_stub_binding_helper_interface+70>: movdqa 48(%esp),%xmm1
0x8fe12fce <__dyld_stub_binding_helper_interface+76>: movdqa 64(%esp),%xmm2
0x8fe12fd4 <__dyld_stub_binding_helper_interface+82>: movdqa 80(%esp),%xmm3
0x8fe12fda <__dyld_stub_binding_helper_interface+88>: mov 8(%esp),%eax
0x8fe12fde <__dyld_stub_binding_helper_interface+92>: mov 12(%esp),%ecx
0x8fe12fe2 <__dyld_stub_binding_helper_interface+96>: mov 16(%esp),%edx
0x8fe12fe6 <__dyld_stub_binding_helper_interface+100>: add $0x60,%esp
0x8fe12fe9 <__dyld_stub_binding_helper_interface+103>: pop %ebp
0x8fe12fea <__dyld_stub_binding_helper_interface+104>: add $0x4,%esp
0x8fe12fed <__dyld_stub_binding_helper_interface+107>: ret
0x8fe12fee <__dyld_stub_binding_helper_interface+108>: add %al,(%eax)
End of assembler dump.
TagsNo tags attached.
Attached Files? file icon onetwothree.ml [^] (1,069 bytes) 2007-07-19 00:08 [Show Content]
? file icon taskpool.ml [^] (2,011 bytes) 2007-07-19 08:54 [Show Content]
patch file icon ocaml.patch [^] (1,264 bytes) 2007-07-24 11:09 [Show Content]

- Relationships

-  Notes
(0004107)
jabial (reporter)
2007-07-19 08:53
edited on: 2007-07-24 11:29

TO REPRODUCE:
- USE taskpool.ml, NOT onetwothree.ml
- COMPILE IT NATIVELY WITH OCAMLOPT on MAC INTEL
- RUN THE NATIVE CODE

(0004108)
jabial (reporter)
2007-07-19 09:20

It *is* stack misalignment.
(0004109)
jabial (reporter)
2007-07-19 09:21

WORKAROUND: set environment DYLD_BIND_AT_LAUNCH prior to running
(0004110)
jabial (reporter)
2007-07-19 11:44
edited on: 2007-07-19 15:47

I FOUND THE CULPRIT!
The problems occurs in caml_io_mutex_unlock_exn().
A quick fix (I trust the maintainers to add ifdefs to only do this for the mac intel platform) is: (unified diff follows)

(WRONG PATCH REMOVED)

(0004111)
jabial (reporter)
2007-07-19 14:34
edited on: 2007-07-19 16:42

Oops. Very obviously this has the risk of corrupting the stack, since it is stored in reverse order.

Correct patch is given below (OLDER VERSION REMOVED).

(0004112)
jabial (reporter)
2007-07-19 16:40
edited on: 2007-07-24 11:10

THIS WAS A TEMPORARY VERSION OF THE PATCH AND IS NOW SUPERSEDED BY THE .PATCH ATTACHMENT

NOT! Final version of the patch, which changes nothing to non-macintels (WARNING: this patch is for the .11+dev version, I haven't tested it with .10):

--- otherlibs/systhreads/posix.c 29 Jan 2007 12:11:17 -0000 1.55
+++ otherlibs/systhreads/posix.c 19 Jul 2007 14:38:43 -0000
@@ -277,7 +277,13 @@
 
 static void caml_io_mutex_unlock_exn(void)
 {
+#if defined(__APPLE__) && defined(__i386__)
+ __asm__("sub $0x08,%esp"); /* realign stack */
+#endif
   struct channel * chan = pthread_getspecific(last_channel_locked_key);
+#if defined(__APPLE__) && defined(__i386__)
+ __asm__("add $0x08,%esp"); /* restore stack */
+#endif
   if (chan != NULL) caml_io_mutex_unlock(chan);
 }

(0004113)
jabial (reporter)
2007-07-20 00:53

UPDATE : patch now tested with release version .10 (which has the same posix.c file as the dev version) ; it works.

I think the issue fixed is important enough that a minor revision should be created for the sake of the Mac crowd.
(0004115)
jabial (reporter)
2007-07-23 11:25
edited on: 2007-07-29 10:24

OK, this is just symptomatic treatment. The real problem is in assembly code generation ; stack should be realigned before or after jumping to caml_ml_array_bound_error, I'm looking into it.

EDIT: no, it should not, because since those functions never return, it would never be restored. I came back to doing it in caml_io_mutex_unlock_exn, but since I couldn't put the finger on *the* source of the misalignment, I made a patch (uploaded a few days ago) that works whatever the misalignment.

(0004117)
jabial (reporter)
2007-07-24 11:12
edited on: 2007-07-24 11:13

Attached the real patch, that will work whatever the misalignment. Optimized for speed not space : no test here, so it will consume at least 8 bytes of stack space, and even 16 if the stack is not misaligned.

WORKS WITH 3.10, 3.10+dev, 3.11+dev

(0004124)
jabial (reporter)
2007-07-29 10:27

For those who want to see the patch without downloading:

Index: otherlibs/systhreads/posix.c
===================================================================
RCS file: /caml/ocaml/otherlibs/systhreads/posix.c,v
retrieving revision 1.55
diff -u -r1.55 posix.c
--- otherlibs/systhreads/posix.c 29 Jan 2007 12:11:17 -0000 1.55
+++ otherlibs/systhreads/posix.c 24 Jul 2007 09:06:15 -0000
@@ -277,8 +277,22 @@
 
 static void caml_io_mutex_unlock_exn(void)
 {
+#if defined(__APPLE__) && defined(__i386__)
+ __asm__("subl $0x04,%esp\n\t" /* space to save the shift value */
+ "push %ebp\n\t" /* save ebp */
+ "movl %esp,%ebp\n\t" /* work with ebp */
+ "andl $0x0F,%ebp\n\t" /* modulo 16 */
+ "subl %ebp,%esp\n\t" /* realign the stack */
+ "movl %ebp,4(%esp)\n\t" /* save the shift value */
+ "addl %esp,%ebp\n\t" /* get back the unaligned stack address */
+ "movl (%ebp),%ebp"); /* restore ebp */
+#endif
   struct channel * chan = pthread_getspecific(last_channel_locked_key);
   if (chan != NULL) caml_io_mutex_unlock(chan);
+#if defined(__APPLE__) && defined(__i386__)
+ __asm__("addl 4(%esp),%esp\n\t" /* un-realign esp */
+ "addl $0x08,%esp"); /* give back the working stack space used */
+#endif
 }
 
 /* The "tick" thread fakes a SIGVTALRM signal at regular intervals. */
(0004128)
xleroy (administrator)
2007-08-01 16:10

The patch is too fragile, it depends a lot on gcc's code generation conventions.

Do you confirm that the problem was tracked down to the stack being unaligned when caml_array_bound_error is called? If so, I'd rather put the realignment code in caml_ml_array_bound_error.
(0004132)
jabial (reporter)
2007-08-01 19:19

The new patch does not in my opinion depend on gcc conventions, because it calculates the stack misalignment itself. The only "fragile" feature is the fact that the space for the "struct channel *" is allocated BEFORE (and not after) the inline assembly is inserted. This could easily be solved by putting "struct channel *chan;" before the inline assembly and "chan = ..." after it. The fact that gcc preserves stack alignements, when it is properly aligned, guarantess that from this point on there will not be any problem.

Since caml_array_bound_error calls caml_io_mutex_unlock_exn through caml_raise that will not return, how would you restore the stack afterwards?

Still, I agree that it would be better to correct stack misalignment at exception raising, but that would require restoring the unaligned stack at exiting the exception handler, be it through a user-supplied continuation (with in a try/with block) or failing to the default one that terminates the program properly, or else the program with undoubtfully segfault when trying to get something from the stack.
(0004172)
xleroy (administrator)
2007-10-09 15:33

Fixed in 3.10 development branch. (Re-align the stack in caml_ml_array_bound_error before calling caml_array_bound_error, which never returns normally anyway.)

- Issue History
Date Modified Username Field Change
2007-07-19 00:08 jabial New Issue
2007-07-19 00:08 jabial File Added: onetwothree.ml
2007-07-19 08:53 jabial Note Added: 0004107
2007-07-19 08:54 jabial File Added: taskpool.ml
2007-07-19 09:11 jabial Note Edited: 0004107
2007-07-19 09:20 jabial Note Added: 0004108
2007-07-19 09:20 jabial Note Edited: 0004107
2007-07-19 09:21 jabial Note Added: 0004109
2007-07-19 11:44 jabial Note Added: 0004110
2007-07-19 14:34 jabial Note Added: 0004111
2007-07-19 15:47 jabial Note Edited: 0004110
2007-07-19 16:40 jabial Note Added: 0004112
2007-07-19 16:42 jabial Note Edited: 0004111
2007-07-19 17:44 jabial Note Edited: 0004112
2007-07-20 00:53 jabial Note Added: 0004113
2007-07-23 11:25 jabial Note Added: 0004115
2007-07-24 11:09 jabial File Added: ocaml.patch
2007-07-24 11:10 jabial Note Edited: 0004112
2007-07-24 11:12 jabial Note Added: 0004117
2007-07-24 11:13 jabial Note Edited: 0004117
2007-07-24 11:29 jabial Note Edited: 0004107
2007-07-29 10:24 jabial Note Edited: 0004115
2007-07-29 10:27 jabial Note Added: 0004124
2007-08-01 16:10 xleroy Note Added: 0004128
2007-08-01 16:10 xleroy Assigned To => xleroy
2007-08-01 16:10 xleroy Status new => feedback
2007-08-01 19:19 jabial Note Added: 0004132
2007-10-09 15:33 xleroy Note Added: 0004172
2007-10-09 15:33 xleroy Status feedback => closed
2007-10-09 15:33 xleroy Resolution open => fixed


Copyright © 2000 - 2011 MantisBT Group
Powered by Mantis Bugtracker