Mantis Bug Tracker

View Issue Details Jump to Notes ] Issue History ] Print ]
IDProjectCategoryView StatusDate SubmittedLast Update
0006038OCamlOCaml runtime systempublic2013-06-11 14:222014-06-02 19:14
Reporterkik 
Assigned Toxleroy 
PriorityhighSeverityminorReproducibilityalways
StatusresolvedResolutionfixed 
Platformi386OSLinux MintOS Version15
Product Version3.12.1 
Target Version4.02.0+devFixed in Version4.02.0+dev 
Summary0006038: segmentation fault in native function built by clang (misaligned stack)
DescriptionI'm using ocaml binding of llvm on i386 Linux.
But it crashes with segmentation fault on the same position every time.

I noticed:

* it crashes if llvm library is built by clang++ with optimization.
* there is "movaps xmm0, ofs(%esp)" opcode.
* stack pointer is misaligned.

It seems clang++ expects aligned stack but ocaml runtime does not align stack.


I do not know i386 Linux ABI requests it, but I think passing aligned stack from
 ocaml runtime removes surprising result.
Steps To Reproduce/* simple example generates "movaps" and request stack alignment */
/* compile: clang -O2 -c foo.c */
#include <stdio.h>
#include <caml/mlvalues.h>
#include <caml/memory.h>
#include <caml/alloc.h>
#include <caml/custom.h>

struct X
{
  int a, b, c, d;
};

__attribute__((noinline))
int foo(const struct X *x)
{
  printf("in foo\n");
  return x->a;
}

value bar(value unit)
{
  struct X x = { 0, 0, 0, 0 };
  CAMLparam1(unit);
  CAMLreturn(Val_int(foo(&x)));
}
Tagspatch
Attached Filespatch file icon 0001-PR-6038.patch [^] (22,573 bytes) 2013-06-19 17:13 [Show Content]
patch file icon 0001-Align-the-whole-stack-on-16-byte-boundaries-if-we-ca.patch [^] (38,616 bytes) 2013-07-09 23:57 [Show Content]

- Relationships

-  Notes
(0009458)
jacques-henri.jourdan (manager)
2013-06-12 00:16

This is a common problem in x86 C compilers : it is more or less a consensus that in the "modern" x86 ABI, stack should be 16-byte aligned :

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40838 [^]

I think OCaml should try to make sure stack is aligned when calling C code.

I can imagine of at least two ways to do that :

(1) As in macosx, ocamlopt keep the stack aligned, so that C functions are calld with an aligned stack
(2) We add the force_align_arg_pointer attribute to all C primitives (by using the CAMLprim macro), so that the C compiler knows that it has to generate code to realign the stack. There is few other C functions that also need to have this attribute.

(1) uses more stack space, but I do not know whether the attribute proposed in (2) is available on all the C compilers we need.

However, all this is my own opinion, and it does not count much, as I am not an ocamlopt expert.

JH
(0009460)
xleroy (administrator)
2013-06-12 14:30

Linux/x86-32 bits is supposed to comply to the SVR4/ELF x86-32 ABI, which states that the stack is always 4-aligned, but does not require 16-byte alignment.

AFAIK, the only x86-32 ABI that requires 16-alignment for the stack is MacOS X.

ocamlopt generates code that conforms to the corresponding ABIs. All the support for 16-alignment of the stack is there, but it is activated only for MacOS X.

Finally, keep in mind that bigger stack alignment has a cost, in terms of wasted stack space. For deeply recursive functions, this wasted space can cause a stack overflow. So, I'm not in favor of increasing alignment if there is no good reason.

Bottom line: I don't think a x86-32 compiler for Linux is ever allowed to assume 16-alignment on the stack pointer. If "clang -O2" does assume this, I'd say it is a clang/llvm problem.
(0009461)
jacques-henri.jourdan (manager)
2013-06-12 16:02

> Linux/x86-32 bits is supposed to comply to the SVR4/ELF x86-32 ABI, which states that the stack is always 4-aligned, but does not require 16-byte alignment.

Indeed. However, it seems like there is a consensus among C compiler writers on the fact that the stack should be 16-aligned. This has been a debate on the GCC bugtracker (see the link above), and the statu quo is that both GCC and LLVM rely on it. The glibc has been updated for this purpose.

> Finally, keep in mind that bigger stack alignment has a cost, in terms of wasted stack space. For deeply recursive functions, this wasted space can cause a stack overflow. So, I'm not in favor of increasing alignment if there is no good reason.

I agree. Moreover, 16-aligning the OCaml stack might not be worth it, because Ocaml mainly does 4-bytes memory accesses. However, we do not need to 16-align the OCaml stack: we could add the force_align_arg_pointer attribute in the CAMLprim macro. It forces the C compiler not to assume that the stack is 16-aligned. It is available in not-too-old versions of LLVM and GCC.
(0009462)
frisch (developer)
2013-06-12 16:27

> we could add the force_align_arg_pointer attribute in the CAMLprim macro

This would force all external stub libraries to use CAMLprim. It seems that the manual indeed says this is required, but since CAMLprim does nothing, I'm not sure all stub libraries actually follow this convention (admittedly, they should be fixed).

> For deeply recursive functions, this wasted space can cause a stack overflow.

It is conceivable (and cheap) to fix the stack alignment only around calls to C functions? I don't think the case of deeply recursive functions going through C matters in practice.
(0009468)
jacques-henri.jourdan (manager)
2013-06-12 18:18

> This would force all external stub libraries to use CAMLprim. It seems that the manual indeed says this is required, but since CAMLprim does nothing, I'm not sure all stub libraries actually follow this convention (admittedly, they should be fixed).

If such a stub does not contain CAMLprim, it is bogus (what's the point of CAMLprim if it is not used ?). Note that anyway, those stubs will be as likely to crash as now, because nothing is done currently.

> It is conceivable (and cheap) to fix the stack alignment only around calls to C functions? I don't think the case of deeply recursive functions going through C matters in practice.

While possible, I am not sure it is that easy: you will have to realign the stack before pushing arguments, and restore the stack pointer after the call. However, the compiled code for evaluating the arguments can depend on the current offset between the current frame and the stack pointer, which is unknown statically...
(0009470)
frisch (developer)
2013-06-12 18:23

> (what's the point of CAMLprim if it is not used ?)

Currently, the only effect of CAMLprim is within the runtime system itself (it serves as a marker to collect the list of built-in primitives with sed).
(0009477)
kik (reporter)
2013-06-13 11:28

Thanks,

> Bottom line: I don't think a x86-32 compiler for Linux is ever allowed to assume 16-alignment on the stack pointer. If "clang -O2" does assume this, I'd say it is a clang/llvm problem.

I checked my clang binary and source code. The binary is from Ubuntu package. So, I think it is properly configured.
The stack setting is written in

https://github.com/llvm-mirror/clang/blob/master/lib/Basic/Targets.cpp#L2860 [^]

It seems x86_32 default is "-S128" (128bit stack) and only WindowsX86_32TargetInfo uses "-S32".
I think this is intentional.
(0009478)
frisch (developer)
2013-06-13 11:31

> If such a stub does not contain CAMLprim, it is bogus (what's the point of CAMLprim if it is not used ?). Note that anyway, those stubs will be as likely to crash as now, because nothing is done currently.

This is true, but since Linux x86-32 is an exotic architecture these days, it is unlikely that authors of stubs will notice the problem. If this solution is picked (instead of having ocamlopt enforce the alignment), it will be important to make it widely known that CAMLprim is now required.
(0009484)
xleroy (administrator)
2013-06-14 10:21
edited on: 2013-06-14 10:22

> It is conceivable (and cheap) to fix the stack alignment only around calls to C functions?

No, as far as I can see. This would require extensive changes in the i386 code generator, plus an extra register must be reserved (in the code that pushes the arguments to the call) to act as an alternate stack pointer.

The possible workarounds seem to be:
1- Compile the stub code for the C libraries in question with GCC flag -mstackrealign
2- Put attribute force_align_arg_pointer in CAMLprim as suggested by Jacques-Henri and encourage authors of stub code to use CAMLprim
3- Impose 16-alignment of the stack throughout all OCaml-generated code, like we currently do for i386-macosx.

#2 looks like the least intrusive solution, but I wonder how much of a speed penalty this would be for the OCaml runtime itself.

(0009485)
xleroy (administrator)
2013-06-14 13:30

Replying to myself:

> I wonder how much of a speed penalty [solution #2] would be for the OCaml runtime itself.

After examination of the code generated by GCC, the answer is: probably negligible. (Just one "and" instruction. GCC already uses a frame pointer, so that cost is already paid.)
(0009546)
doligez (administrator)
2013-06-18 13:26

Looks like the best solution is Xavier's option (2): add force_align_arg_pointer to CAMLprim, with all the necessary incantations in the configure script to make sure it's only used with C compilers that know it.
Who wants to provide a patch?

Out of curiosity, there's still one unanswered question: why does this affect clang and not gcc if both are supposed to assume 16-byte alignment of the stack?
(0009556)
kik (reporter)
2013-06-19 11:35

> Out of curiosity, there's still one unanswered question: why does this affect clang and not gcc if both are supposed to assume 16-byte alignment of the stack?

Because use of SSE instructions is enabled default on clang but optional on gcc.
(0009567)
jacques-henri.jourdan (manager)
2013-06-19 17:22

I submitted a patch that implements this solution.

I used a patched version of GCC that produces executables that crashes when entering a function with an unaligned stack. It appeared that some C callbacks were not annotated with CAMLprim (even in the Ocaml runtime). I have corrected many of them, but it may be the case there are still some of them.

I do not have a reliable way to test in the configure script whether the force_align_arg_pointer attribute is actually taken into account for the system compiler: I just tests it does not produce an error. With gcc, unknown attribute just produce warnings.

I think it could be a good idea to use this method for the macosx architecture: indeed, it would remove the performance penalty of keeping the Ocaml stack aligned. What do you think ?
(0009571)
xleroy (administrator)
2013-06-19 19:38

> I do not have a reliable way to test in the configure script whether the force_align_arg_pointer attribute is actually taken into account for the system compiler

Googling suggests

#if defined(__i386__) && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ > 1))

which seems to work for GCC and for Clang on my Mac OS X 10.8 ("Apple LLVM version 4.2"). Maybe this version test is as reliable or more than your configure-time test.
(0009617)
jacques-henri.jourdan (manager)
2013-06-26 04:46

The problem with such a test is that it is indirect : some versions of clang will support the attribute, but won't be accepted by this test.

I would prefer using a conservative approach where we put the attribute when we know it does not generates an error (while still being able to generate a warning if the compiler is not recent enough).

As an other approach, we could test the attribute support using something like:

__attribute__((force_align_arg_pointer)) int test_realign() {
  int x __attribute__((aligned(16)));
  return (int)&x;
}

int main() {
  int i;
  for(i = 0; i < 4; i++) {
    int res;
    __asm__(
      "sub %1, %%esp\n\t"
      "call *%2\n\t"
      "add %1, %%esp\n\t"
      : "=&a" (res)
      : "r" (i*4), "r" (test_realign)
      : "ecx", "edx");
      if(res & 15)
        return 1;
  }
  return 0;
}

If this test passes, we use the attribute.

Concerning the use of this attribute on macosx, I suggest using the previous test in order to deactivate ocaml stack alignment when possible. However, I do not have an Apple computer to do tests...
(0009739)
jacques-henri.jourdan (manager)
2013-07-10 00:11

I submitted a new version of the patch using this method to check alignment. If the C compiler does not support the attribute, it maintains the whole stack 16-aligned (using the existing macosx code).

I also added some more missing CAMLprims in libraries.

Given that we expect CAMLprim to be missing in some other code, and that macosx is very sensitive to misalignment, the stack is still kept aligned in this OS.

There is some specific code for mingw32, that should work, but I havn't tested it.
(0010782)
Richard Jones (reporter)
2014-01-06 20:34

This does also affect gcc. See my comments here:
http://caml.inria.fr/mantis/view.php?id=5700#c10779 [^]
(0011579)
shinwell (developer)
2014-05-28 14:41

Am I right in thinking that we still need to fix this problem in OCaml?

It does appear that gcc has been presuming 16-byte alignment on x86-32 for some time.
(0011580)
Richard Jones (reporter)
2014-05-28 14:47

Well it definitely affects Fedora. For a long time I've used this hack as a workaround. It would be good to have a proper fix in the compiler itself.

http://pkgs.fedoraproject.org/cgit/ocaml.git/tree/ocaml.spec#n274 [^]
(0011630)
xleroy (administrator)
2014-06-02 19:14

Fixed in version/4.02 (commit 14943) and on trunk (commit 14944) by forcing 16-alignment of stack for all x86-32 platforms except Win32/MSVC. It's not like the GCC and Clang developers left us much choice. Enjoy the new stack consumption. For the record, I still morally object to compiler writers violating ABIs, instead of evolving ABI standard documents.

- Issue History
Date Modified Username Field Change
2013-06-11 14:22 kik New Issue
2013-06-12 00:16 jacques-henri.jourdan Note Added: 0009458
2013-06-12 14:30 xleroy Note Added: 0009460
2013-06-12 14:30 xleroy Status new => feedback
2013-06-12 16:02 jacques-henri.jourdan Note Added: 0009461
2013-06-12 16:27 frisch Note Added: 0009462
2013-06-12 18:18 jacques-henri.jourdan Note Added: 0009468
2013-06-12 18:23 frisch Note Added: 0009470
2013-06-13 11:28 kik Note Added: 0009477
2013-06-13 11:28 kik Status feedback => new
2013-06-13 11:31 frisch Note Added: 0009478
2013-06-14 10:21 xleroy Note Added: 0009484
2013-06-14 10:21 xleroy Status new => acknowledged
2013-06-14 10:22 xleroy Note Edited: 0009484 View Revisions
2013-06-14 13:30 xleroy Note Added: 0009485
2013-06-18 13:26 doligez Note Added: 0009546
2013-06-18 13:26 doligez Target Version => 4.01.0+dev
2013-06-19 11:35 kik Note Added: 0009556
2013-06-19 17:13 jacques-henri.jourdan File Added: 0001-PR-6038.patch
2013-06-19 17:22 jacques-henri.jourdan Note Added: 0009567
2013-06-19 19:38 xleroy Note Added: 0009571
2013-06-26 04:46 jacques-henri.jourdan Note Added: 0009617
2013-07-09 23:57 jacques-henri.jourdan File Added: 0001-Align-the-whole-stack-on-16-byte-boundaries-if-we-ca.patch
2013-07-10 00:11 jacques-henri.jourdan Note Added: 0009739
2013-08-19 16:32 doligez Target Version 4.01.0+dev => 4.01.1+dev
2013-12-16 16:41 doligez Tag Attached: patch
2014-01-06 20:34 Richard Jones Note Added: 0010782
2014-05-25 20:20 doligez Target Version 4.01.1+dev => 4.02.0+dev
2014-05-28 14:41 shinwell Note Added: 0011579
2014-05-28 14:41 shinwell Assigned To => xleroy
2014-05-28 14:41 shinwell Status acknowledged => assigned
2014-05-28 14:47 Richard Jones Note Added: 0011580
2014-06-02 09:39 shinwell Priority normal => high
2014-06-02 19:14 xleroy Note Added: 0011630
2014-06-02 19:14 xleroy Status assigned => resolved
2014-06-02 19:14 xleroy Resolution open => fixed
2014-06-02 19:14 xleroy Fixed in Version => 4.02.0+dev


Copyright © 2000 - 2011 MantisBT Group
Powered by Mantis Bugtracker