Ocamlc got segfault in Alpine ppc64le #7562
Comments
Comment author: @gasche The OCaml version seems to be 4.04.1 ( https://github.com/alpinelinux/aports/blob/master/community/ocaml/APKBUILD ) with some downstream patches ( https://github.com/alpinelinux/aports/tree/master/community/ocaml ), most of them build-system related. The only ones affecting code generation are a patch that marks the stack as non-executable ( https://github.com/alpinelinux/aports/blob/master/community/ocaml/010_all_execstacks.patch ) and a ppc64 fix to the CONTEXT_* macros in signal_osdeps.h. Since 4.04.0, "ocamlc" points to the native-compiled ocamlc.opt instead of the bytecode-compiled ocamlc.byte. Out of curiosity, does running ocamlc.byte (also installed in PATH) work correctly?
Comment author: rgdoliveira I just tried ocamlc.byte and I was able to compile a simple .ml file and run the generated file (no segfault).
Comment author: @xavierleroy Thanks for trying ocamlc.byte. This confirms my suspicion that the problem is with dynamic loading in OCaml programs compiled to native code, which is the case for ocamlc in this Alpine setup. The bad news is that we have extremely limited access to ppc64le hardware: just one virtual machine provided by RedHat in Brno, running Fedora (I think). So, I'm at a loss on how to debug this issue.
Comment author: rgdoliveira xleroy, I have a VM running Alpine ppc64le and I can give you access to it, if that helps you debug. Can you talk with me on freenode? My username is 'rdutra'.
Comment author: @mshinwell I can also try to look at this; I think I can get access to a suitable machine now. @xLeroy please let me know if you have time / want to do it.
Comment author: rgdoliveira I applied a downstream patch (workaround) to the Alpine build of OCaml and it fixed the segfault. Basically, I compiled the OCaml native executables using the -no-pie flag (https://github.com/alpinelinux/aports/blob/1feea49eaec12328e73541436bd1612228cd7e9a/community/ocaml/fix-segfault-in-ppc64le.patch)
Comment author: @xavierleroy @shinwell: I lost access to a ppc64le machine, so you are most welcome to try and understand this issue while I try to build a qemu-based VM. (virt-builder should make this easy, except that the version that comes with Ubuntu 16.04 LTS doesn't work.)
Comment author: @xavierleroy That '-no-pie' helps suggests a misunderstanding between ocamlopt and the dynamic loader about register usage or whatnot. I'm afraid that even with -no-pie, later attempts to do dynamic loading would fail.
Comment author: @dbuenzli FWIW this is not specific to ppc64le. The same occurs on Alpine 'armv6' and is easy to reproduce in docker:
docker run -it arm32v6/alpine sh
All the OCaml '.opt' executables segfault, as does any executable produced by ocamlopt.byte, unless '-cclib -no-pie' is provided on the command line in the final link step. The configure script makes it a bit difficult to target precisely the phase where you want to add flags (and using -cc 'gcc -no-pie' seems to break jbuilder, which nowadays seems to be needed to bootstrap opam), so I went with a dirty:
sed -i 's/common_cflags="-O2/common_cflags="-no-pie -O2/g' configure
This adds
Info about gcc and ld:
gcc -v
Using built-in specs.
ld -v
GNU ld (GNU Binutils) 2.28
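A minimal sketch of that substitution, run against a throwaway file rather than the real configure script. The sample line is an assumption about what the relevant assignment looks like (only the `common_cflags="-O2` prefix comes from the command in the comment). The sed expression is single-quoted so that the shell passes the double quotes through to sed unchanged.

```shell
# Hypothetical stand-in for the relevant line of ./configure.
tmp=$(mktemp)
printf '%s\n' 'common_cflags="-O2 -fno-strict-aliasing -fwrapv"' > "$tmp"

# Same substitution as in the comment, single-quoted for the shell.
# Written without -i so it behaves the same with GNU, BusyBox and BSD sed.
patched=$(sed 's/common_cflags="-O2/common_cflags="-no-pie -O2/g' "$tmp")

echo "$patched"
rm -f "$tmp"
```

Inspecting the patched line this way before touching configure makes it easy to confirm that the pattern actually matched.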
Comment author: @dbuenzli If that may help, exactly the same problem I mention (with identical resolution via
docker run -it i386/alpine sh
This one should be more pleasant to diagnose, compilation-time wise. Also note that this doesn't occur with the
Could this point to some kind of 32-bit issue (though the initial issue mentions ppc64le)?
Shouldn't this one be closed if it is not reproducible with the padding fix?
@XVilka can you confirm that the issue is gone with the padding fix? (The fix should be in 4.09.1 and 4.10.0, but not 4.09.0) |
@gasche you are right, the issue is still reproducible with 4.09.1 and 4.10.0:
docker run -it i386/alpine sh
apk add --update bash tar make m4 curl git gcc musl-dev binutils
ln -s /usr/bin/as /usr/bin/i586-alpine-linux-musl
curl -OL http://caml.inria.fr/pub/distrib/ocaml-4.10/ocaml-4.10.0.tar.gz
tar -xf ocaml-4.10.0.tar.gz && cd ocaml-4.10.0
./configure -host i586-alpine-linux-musl
make world.opt
I had a second look at these crashes with Alpine Linux. They are hard to debug because the crash occurs very early in the execution of ocamlopt-generated binaries, well before any OCaml code is entered, even before The root cause seems to be non-PIE object files being linked in PIE mode, which seems to be the default in Alpine. The problem can be reproduced with just C files, no OCaml involved:
I wish the linker would detect the mismatch and emit a diagnostic, rather than silently producing executables that crash. The fix is indeed to use
in the build script for the OCaml package. This worked fine for me on i586 and on ppc64le. I believe that Eventually the
4.09.1 builds fine because the whole build is done by bytecode executables, but it still produces crashing native-code executables. You can see the crashes by running the test suite. The build of 4.10 uses some native-code executables, which is why you see the crash during the build.
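When checking which executables are affected, it can help to see whether a given binary was linked as a PIE. The sketch below is not from the thread. It reads only the `e_type` field of the ELF header, where `ET_EXEC` (2) marks a fixed-address executable and `ET_DYN` (3) marks a PIE (or a shared object, which this check cannot tell apart). The function name `elf_kind` is hypothetical, and the file is assumed to be ELF: the magic number is not verified.

```shell
# Print "PIE", "non-PIE", or "other" for the file given as $1.
# Assumes the file is ELF; only e_type (bytes 16-17) is inspected.
elf_kind() {
    # Byte 5 (EI_DATA) gives the byte order: 1 = little-endian, 2 = big-endian.
    set -- "$1" $(od -An -tu1 -j5 -N1 "$1" 2>/dev/null) \
                $(od -An -tu1 -j16 -N2 "$1" 2>/dev/null)
    # Expect: file name, EI_DATA, and the two bytes of e_type.
    [ $# -eq 4 ] || { echo "other"; return; }
    if [ "$2" = 2 ]; then
        etype=$(( $3 * 256 + $4 ))   # big-endian
    else
        etype=$(( $4 * 256 + $3 ))   # little-endian
    fi
    case $etype in
        2) echo "non-PIE" ;;   # ET_EXEC
        3) echo "PIE" ;;       # ET_DYN (also used for shared objects)
        *) echo "other" ;;
    esac
}
```

For example, `elf_kind "$(command -v ocamlc.opt)"` would report how the installed native compiler was linked, assuming `ocamlc.opt` is on the PATH.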
I was about to ask which native-code executables were suddenly used in 4.10, but I guess that I'm the one to blame-or-thank here: this would be the BEST_FOO logic (#8840) using the .opt version of each tool when available.
…ault Some Linux and BSD platforms now generate position-independent executables (PIE) by default. However, generating a PIE from object files that are not PIC (position-independent code) causes either link-time errors or the production of executable files that crash when run. This commit turns PIE off (-no-pie C compiler option) on platforms where ocamlopt does not generate PIC by default: currently all platforms except amd64 (x86-64) and s390x (Z systems). Closes: ocaml#7562
… code Add link-time option -no-pie when - PIE is the default on the target system - ocamlopt does not generate PIC by default (i.e. not amd64, not s390x) - link-time errors or run-time errors occur when linking non-PIC objects in PIE mode. (Observed on Alpine Linux.) Closes: ocaml#7562
Alpine Linux produces position-independent executables (PIEs) by default. If non-PIC object files are given to the linker, it silently produces a wrong executable that crashes when run. This is the case for ocamlopt-generated code, which by default is not PIC except on amd64 (x86_64) and s390x (Z systems). Closes: ocaml#7562
…390x Alpine Linux and perhaps other musl-based Linux distributions produce position-independent executables (PIEs) by default. If non-PIC object files are given to the linker, it silently produces a wrong executable that crashes when run. This is the case for ocamlopt-generated code, which by default is not PIC except on amd64 (x86_64) and s390x (Z systems). Closes: ocaml#7562
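The rule these commit messages describe can be sketched as a small shell function. This is an illustration only, not the code from the OCaml configure script: the function name `link_flags_for`, its arguments, and the architecture spellings are assumptions, and the caller is expected to have already determined whether the toolchain produces PIEs by default.

```shell
# Print the extra link-time flag for native code, or nothing.
# $1 = architecture name, $2 = "yes" if the toolchain defaults to PIE.
link_flags_for() {
    case "$1" in
        amd64|x86_64|s390x)
            # ocamlopt generates PIC by default here, so PIE linking is safe.
            echo "" ;;
        *)
            # Non-PIC objects must not be linked in PIE mode.
            if [ "$2" = yes ]; then echo "-no-pie"; else echo ""; fi ;;
    esac
}
```

With this sketch, `link_flags_for ppc64le yes` prints `-no-pie`, while `link_flags_for amd64 yes` prints an empty line.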
Original bug ID: 7562
Reporter: rgdoliveira
Status: acknowledged (set by @xavierleroy on 2017-06-22T18:03:14Z)
Resolution: open
Priority: normal
Severity: crash
Platform: ppc64le
OS: Alpine Linux
OS Version: 3.6.2
Version: 4.04.1
Category: back end (clambda to assembly)
Related to: #7697
Monitored by: @nojb @gasche @dbuenzli
Bug description
I'm building ocaml in Alpine Linux ppc64le and it builds fine. But when I try to use ocamlc, I'm getting a segfault.
Gdb backtrace shows:
#0 0x00003fffb7fad710 in do_relocs (dso=0x3fffb7ff26a0 , rel=0x200ab4b8, rel_size=2495088,
stride=3) at ldso/dynlink.c:379
#1 0x00003fffb7fae1ec in reloc_all (p=0x3fffb7ff26a0 ) at ldso/dynlink.c:1195
#2 0x00003fffb7fafc94 in __dls3 (sp=) at ldso/dynlink.c:1638
#3 0x00003fffb7faf3d4 in __dls2 (base=, sp=0x3ffffffffba0) at ldso/dynlink.c:1424
#4 0x00003fffb7facd2c in _dlstart_c (sp=, dynv=)
at ldso/dlstart.c:147
#5 0x00003fffb7fb1104 in _dlstart () from /lib/ld-musl-powerpc64le.so.1
I know that OCaml was recently ported to the ppc64le architecture and works fine with glibc, but there seems to be an issue with musl.
Steps to reproduce
The steps below need to be done inside an Alpine ppc64le system:
Clone aports repository
$ git clone https://github.com/alpinelinux/aports.git
Build ocaml package
$ cd aports/community/ocaml
$ abuild -r
Install the built package:
$ sudo apk add <ocaml_apk>
Run ocamlc to get the segmentation fault.