Mantis Bug Tracker

ID: 0005334
Project: OCaml
Category: OCaml general
View Status: public
Date Submitted: 2011-08-12 12:37
Last Update: 2013-08-31 12:43
Reporter: glondu
Assigned To:
Priority: normal
Severity: minor
Reproducibility: always
Status: closed
Resolution: won't fix
Platform:
OS:
OS Version:
Product Version: 3.12.0
Target Version:
Fixed in Version:
Summary: 0005334: ocamlopt generates stack invalid for backtrace()

Description:
On a PowerPC machine:

$ uname -a
Linux pescetti 2.6.32-5-powerpc #1 Tue Jun 14 09:33:14 UTC 2011 ppc GNU/Linux

$ ocamlopt -config
version: 3.12.0
standard_library_default: /usr/lib/ocaml
standard_library: /usr/lib/ocaml
standard_runtime: /usr/bin/ocamlrun
ccomp_type: cc
bytecomp_c_compiler: gcc -fno-defer-pop -Wall -D_FILE_OFFSET_BITS=64 -D_REENTRANT -fPIC
bytecomp_c_libraries: -lm -ldl -lcurses -lpthread
native_c_compiler: gcc -Wall -D_FILE_OFFSET_BITS=64 -D_REENTRANT
native_c_libraries: -lm -ldl
native_pack_linker: ld -r -o
ranlib: ranlib
cc_profile: -pg
architecture: power
model: ppc
system: elf
asm: as -u -m ppc
ext_obj: .o
ext_asm: .s
ext_lib: .a
ext_dll: .so
os_type: Unix
default_executable_name: a.out
systhread_supported: true

$ cat backtrace.c
#include <caml/mlvalues.h>
#include <execinfo.h>

value caml_backtrace(value unit) {
  void *buffer[100];
  return(Val_int(backtrace(buffer, 100)));
}

$ cat main.ml
external backtrace : unit -> int = "caml_backtrace";;
let f () = backtrace () + 1;;

exit (f ());;

$ ocamlopt backtrace.c main.ml && ./a.out; echo $?
zsh: segmentation fault ./a.out
139

$ ocamlc -custom backtrace.c main.ml && ./a.out; echo $?
7

This does not happen on amd64, i386, armel, or sparc. It also doesn't happen when calling backtrace() in a pure C program, when all CAML{local,return} macros are present, or when backtrace() is called directly at toplevel in OCaml code. I suspect ocamlopt is faulty here.
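
For comparison, the pure C case mentioned above can be sketched roughly as follows. This is only an illustration, not the reporter's actual test program: the helper names `count_frames` and `frames_at_depth` are made up here, and the code assumes glibc's `<execinfo.h>`. Every frame on the stack is generated by the C compiler, so backtrace() only ever sees frames laid out the way it expects.

```c
#include <execinfo.h>   /* backtrace(): a glibc extension, not ISO C or POSIX */

#define MAX_FRAMES 100

/* Record the current call stack and return how many frames were captured. */
static int count_frames(void) {
  void *buffer[MAX_FRAMES];
  return backtrace(buffer, MAX_FRAMES);
}

/* Hypothetical helper: recurse `depth` times before unwinding, so that the
   stack holds several compiler-generated frames. An optimizing compiler may
   turn the recursion into a loop, so the exact count is not guaranteed. */
int frames_at_depth(int depth) {
  if (depth <= 0)
    return count_frames();
  return frames_at_depth(depth - 1);
}
```

Calling `frames_at_depth(3)` from `main` should return a small positive number without crashing. The segfault reported here only appears once an ocamlopt-generated frame sits between `main` and the call to backtrace().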
Tags: No tags attached.
Attached Files:

- Relationships
related to 0005314 (closed, shinwell): add CFI directives for reliable stack unwinding

-  Notes
(0006141)
xleroy (administrator)
2011-10-03 14:35

The man page for backtrace(3) notes:

"These functions make some assumptions about how a function's return
address is stored on the stack. Note the following:
* Omission of the frame pointers (as implied by any of gcc(1)'s non-
   zero optimization levels) may cause these assumptions to be violated."

ocamlopt-generated stack frames are, in general, quite different from gcc-generated frames. In particular, no frame pointer is ever used by ocamlopt. So, I'm not surprised that backtrace() fails to unravel a Caml stack.

Maybe this could be improved by generating .cfi directives in ocamlopt's output, as suggested in PR#5314, but I have no idea whether backtrace() can exploit this information.

When you say "this does not happen on amd64, i386, ...", do you mean that on those platforms,
(1) backtrace() doesn't segfault
(2) backtrace() produces a useful backtrace up to the first Caml call frame
(3) backtrace() produces a full backtrace including Caml frames?

If the answer is (1) and (2) but not (3), maybe this just points to a lack of robustness in the PowerPC implementation of backtrace(), its implementation on other platforms being more robust against nonstandard frames.




  That might (or not) be improved if

(0006142)
ygrek (reporter)
2011-10-03 16:01

I am observing (2) and I think it's libc's problem.
AFAICS, backtrace() (at least in debian stable) doesn't benefit from static unwind info (gdb does). Here is an example with cfi-enabled ocaml on amd64 :

$ cat qq.ml

module U = ExtUnix.Specific

let rec f = function
| 0 -> (*print_endline (Obj.magic 0);*) Array.iter print_endline (U.backtrace ()); 0
| n -> let r = f (n - 1) in r + 1

let g a b =
  let n = a + b in
  let r = f n in
  r + 1

let () = let r = g 1 2 in exit r

$ ocamlfind ocamlopt -g -linkpkg -package extunix qq.ml -o qq
$ ./qq
./qq(caml_extunix_backtrace+0xca) [0x427f66]
./qq() [0x43c908]

Uncommenting Obj.magic and running in gdb :

$ ocamlfind ocamlopt -g -linkpkg -package extunix qq.ml -o qq
$ gdb ./qq
GNU gdb (GDB) 7.0.1-debian
[...]
Program received signal SIGSEGV, Segmentation fault.
0x0000000000414564 in camlPervasives__output_string_1191 ()
(gdb) bt
#0 0x0000000000414564 in camlPervasives__output_string_1191 ()
#1 0x0000000000414a1a in camlPervasives__print_endline_1274 ()
#2 0x000000000040cf58 in camlQq__f_1140 ()
#3 0x000000000040cf43 in camlQq__f_1140 ()
#4 0x000000000040cf43 in camlQq__f_1140 ()
#5 0x000000000040cf43 in camlQq__f_1140 ()
#6 0x000000000040cff8 in camlQq__entry ()
#7 0x000000000040c2b9 in caml_program ()
#8 0x000000000043c95e in caml_start_program ()
#9 0x000000000042c9e5 in caml_main ()
#10 0x000000000042ca24 in main ()

(0006344)
xleroy (administrator)
2011-12-17 09:16

Thanks to "ygrek" for the analysis. It seems that, with the current glibc-PowerPC implementation of backtrace(), the only way to address this issue would be to add frame pointers to the OCaml PowerPC code generator, but this is not going to happen. (It is a design decision of ocamlopt that frame pointers are not maintained: this has a non-negligible cost for small functions, and in a functional language there are many of them.) So, I'll close this PR shortly.

There are however two problems with glibc that should probably be reported to them:
1- on PowerPC, backtrace() should be robust against stack frames that it doesn't understand. It's not rocket science and the x86 and x86_64 implementations seem robust already,
2- glibc might also consider taking advantage of CFI information to produce better backtraces.
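
To illustrate what "robust against stack frames that it doesn't understand" could mean in point 1, here is a minimal sketch of a defensive frame-chain walker. It is not glibc's code: `struct frame` is a simplified model of a saved-link/return-address pair (the real PowerPC layout differs), and `walk_frames` with its `lo`/`hi` stack bounds is a hypothetical interface chosen for the example. The point is only that each link is validated before it is dereferenced.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified model of what a frame-chain walker expects to find:
   a link to the caller's frame, followed by a return address. */
struct frame {
  struct frame *next;   /* saved back chain / frame pointer */
  void *return_addr;    /* where this frame returns to */
};

/* Walk the chain starting at fp, storing up to max return addresses in out.
   The walk stops, instead of crashing, as soon as a link is null, misaligned,
   outside the stack bounds [lo, hi), or does not move toward the callers
   (higher addresses on a downward-growing stack). */
int walk_frames(const struct frame *fp, const void *lo, const void *hi,
                void **out, int max) {
  uintptr_t low = (uintptr_t)lo, high = (uintptr_t)hi;
  int n = 0;
  while (n < max) {
    uintptr_t p = (uintptr_t)fp;
    if (p < low || p >= high || high - p < sizeof(struct frame))
      break;                          /* frame record not inside the stack */
    if (p % sizeof(void *) != 0)
      break;                          /* misaligned link */
    out[n++] = fp->return_addr;
    if ((uintptr_t)fp->next <= p)
      break;                          /* null, self-referencing or backward link */
    fp = fp->next;
  }
  return n;
}
```

With checks like these, a frame produced by a compiler that does not maintain the chain (such as ocamlopt) would at worst truncate the backtrace, which matches behaviour (2) described in the earlier note.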

- Issue History
Date Modified Username Field Change
2011-08-12 12:37 glondu New Issue
2011-10-03 14:35 xleroy Note Added: 0006141
2011-10-03 14:35 xleroy Status new => feedback
2011-10-03 14:35 xleroy Relationship added related to 0005314
2011-10-03 16:01 ygrek Note Added: 0006142
2011-12-17 09:16 xleroy Note Added: 0006344
2011-12-17 09:16 xleroy Status feedback => resolved
2011-12-17 09:16 xleroy Resolution open => won't fix
2013-08-31 12:43 xleroy Status resolved => closed

