New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
segmentation fault in native function built by clang (misaligned stack) #6038
Comments
Comment author: @jhjourdan This is a common problem in x86 C compilers : it is more or less a consensus that in the "modern" x86 ABI, stack should be 16-byte aligned : http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40838 I think OCaml should try to make sure stack is aligned when calling C code. I can imagine of at least two ways to do that : (1) As in macosx, ocamlopt keep the stack aligned, so that C functions are calld with an aligned stack (1) uses more stack space, but I do not know whether the attribute proposed in (2) is available on all the C compilers we need. However, all this is my own opinion, and it does not count much, as I am not an ocamlopt expert. JH |
Comment author: @xavierleroy Linux/x86-32 bits is supposed to comply to the SVR4/ELF x86-32 ABI, which states that the stack is always 4-aligned, but does not require 16-byte alignment. AFAIK, the only x86-32 ABI that requires 16-alignment for the stack is MacOS X. ocamlopt generates code that conforms to the corresponding ABIs. All the support for 16-alignment of the stack is there, but it is activated only for MacOS X. Finally, keep in mind that bigger stack alignment has a cost, in terms of wasted stack space. For deeply recursive functions, this wasted space can cause a stack overflow. So, I'm not in favor of increasing alignment if there is no good reason. Bottom line: I don't think a x86-32 compiler for Linux is ever allowed to assume 16-alignment on the stack pointer. If "clang -O2" does assume this, I'd say it is a clang/llvm problem. |
Comment author: @jhjourdan
Indeed. However, it seems like there is a consensus among C compiler writers on the fact that the stack should be 16-aligned. This has been a debate on the GCC bugtracker (see the link above), and the statu quo is that both GCC and LLVM rely on it. The glibc has been updated for this purpose.
I agree. Moreover, 16-aligning the OCaml stack might not be worth it, because Ocaml mainly does 4-bytes memory accesses. However, we do not need to 16-align the OCaml stack: we could add the force_align_arg_pointer attribute in the CAMLprim macro. It forces the C compiler not to assume that the stack is 16-aligned. It is available in not-too-old versions of LLVM and GCC. |
Comment author: @alainfrisch
This would force all external stub libraries to use CAMLprim. It seems that the manual indeed says this is required, but since CAMLprim does nothing, I'm not sure all stub libraries actually follow this convention (admittedly, they should be fixed).
It is conceivable (and cheap) to fix the stack alignment only around calls to C functions? I don't think the case of deeply recursive functions going through C matters in practice. |
Comment author: @jhjourdan
If such a stub does not contain CAMLprim, it is bogus (what's the point of CAMLprim if it is not used ?). Note that anyway, those stubs will be as likely to crash as now, because nothing is done currently.
While possible, I am not sure it is that easy: you will have to realign the stack before pushing arguments, and restore the stack pointer after the call. However, the compiled code for evaluating the arguments can depend on the current offset between the current frame and the stack pointer, which is unknown statically... |
Comment author: @alainfrisch
Currently, the only effect of CAMLprim is within the runtime system itself (it serves as a marker to collect the list of built-in primitives with sed). |
Comment author: kik Thanks,
I checked my clang binary and source code. The binary is from Ubuntu package. So, I think it is properly configured. https://github.com/llvm-mirror/clang/blob/master/lib/Basic/Targets.cpp#L2860 It seems x86_32 default is "-S128" (128bit stack) and only WindowsX86_32TargetInfo uses "-S32". |
Comment author: @alainfrisch
This is true, but since Linux x86-32 is an exotic architecture these days, it is unlikely that authors of stubs will notice the problem. If this solution is picked (instead of having ocamlopt enforce the alignment), it will be important to make it widely known that CAMLprim is now required. |
Comment author: @xavierleroy
No, as far as I can see. This would require extensive changes in the i386 code generator, plus an extra register must be reserved (in the code that pushes the arguments to the call) to act as an alternate stack pointer. The possible workarounds seem to be: #2 looks like the least intrusive solution, but I wonder how much of a speed penalty this would be for the OCaml runtime itself. |
Comment author: @xavierleroy Replying to myself:
After examination of the code generated by GCC, the answer is: probably negligible. (Just one "and" instruction. GCC already uses a frame pointer, so that cost is already paid.) |
Comment author: @damiendoligez Looks like the best solution is Xavier's option (2): add force_align_arg_pointer to CAMLprim, with all the necessary incantations in the configure script to make sure it's only used with C compilers that know it. Out of curiosity, there's still one unanswered question: why does this affect clang and not gcc if both are supposed to assume 16-byte alignment of the stack? |
Comment author: kik
Because use of SSE instructions is enabled default on clang but optional on gcc. |
Comment author: @jhjourdan I submitted a patch that implements this solution. I used a patched version of GCC that produces executables that crashes when entering a function with an unaligned stack. It appeared that some C callbacks were not annotated with CAMLprim (even in the Ocaml runtime). I have corrected many of them, but it may be the case there are still some of them. I do not have a reliable way to test in the configure script whether the force_align_arg_pointer attribute is actually taken into account for the system compiler: I just tests it does not produce an error. With gcc, unknown attribute just produce warnings. I think it could be a good idea to use this method for the macosx architecture: indeed, it would remove the performance penalty of keeping the Ocaml stack aligned. What do you think ? |
Comment author: @xavierleroy
Googling suggests #if defined(i386) && (GNUC > 4 || (GNUC == 4 && GNUC_MINOR > 1)) which seems to work for GCC and for Clang on my Mac OS X 10.8 ("Apple LLVM version 4.2"). Maybe this version test is as reliable or more than your configure-time test. |
Comment author: @jhjourdan The problem with such a test is that it is indirect : some versions of clang will support the attribute, but won't be accepted by this test. I would prefer using a conservative approach where we put the attribute when we know it does not generates an error (while still being able to generate a warning if the compiler is not recent enough). As an other approach, we could test the attribute support using something like: attribute((force_align_arg_pointer)) int test_realign() { int main() { If this test passes, we use the attribute. Concerning the use of this attribute on macosx, I suggest using the previous test in order to deactivate ocaml stack alignment when possible. However, I do not have an Apple computer to do tests... |
Comment author: @jhjourdan I submitted a new version of the patch using this method to check alignment. If the C compiler does not support the attribute, it maintains the whole stack 16-aligned (using the existing macosx code). I also added some more missing CAMLprims in libraries. Given that we expect CAMLprim to be missing in some other code, and that macosx is very sensitive to misalignment, the stack is still kept aligned in this OS. There is some specific code for mingw32, that should work, but I havn't tested it. |
Comment author: Richard Jones This does also affect gcc. See my comments here: |
Comment author: @mshinwell Am I right in thinking that we still need to fix this problem in OCaml? It does appear that gcc has been presuming 16-byte alignment on x86-32 for some time. |
Comment author: Richard Jones Well it definitely affects Fedora. For a long time I've used this hack as a workaround. It would be good to have a proper fix in the compiler itself. http://pkgs.fedoraproject.org/cgit/ocaml.git/tree/ocaml.spec#n274 |
Comment author: @xavierleroy Fixed in version/4.02 (commit 14943) and on trunk (commit 14944) by forcing 16-alignment of stack for all x86-32 platforms except Win32/MSVC. It's not like the GCC and Clang developers left us much choice. Enjoy the new stack consumption. For the record, I still morally object to compiler writers violating ABIs, instead of evolving ABI standard documents. |
Original bug ID: 6038
Reporter: kik
Assigned to: @xavierleroy
Status: closed (set by @xavierleroy on 2015-12-11T18:27:50Z)
Resolution: fixed
Priority: high
Severity: minor
Platform: i386
OS: Linux Mint
OS Version: 15
Version: 3.12.1
Target version: 4.02.0+dev
Fixed in version: 4.02.0+dev
Category: runtime system and C interface
Tags: patch
Monitored by: @ygrek "Richard Jones"
Bug description
I'm using ocaml binding of llvm on i386 Linux.
But it crashes with segmentation fault on the same position every time.
I noticed:
It seems clang++ expects aligned stack but ocaml runtime does not align stack.
I do not know i386 Linux ABI requests it, but I think passing aligned stack from
ocaml runtime removes surprising result.
Steps to reproduce
/* simple example generates "movaps" and request stack alignment /
/ compile: clang -O2 -c foo.c */
#include <stdio.h>
#include <caml/mlvalues.h>
#include <caml/memory.h>
#include <caml/alloc.h>
#include <caml/custom.h>
struct X
{
int a, b, c, d;
};
attribute((noinline))
int foo(const struct X *x)
{
printf("in foo\n");
return x->a;
}
value bar(value unit)
{
struct X x = { 0, 0, 0, 0 };
CAMLparam1(unit);
CAMLreturn(Val_int(foo(&x)));
}
File attachments
The text was updated successfully, but these errors were encountered: