Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ocamlopt generated code: segmentation fault instead of stack overflow #5064

Closed
vicuna opened this issue May 30, 2010 · 6 comments
Closed

ocamlopt generated code: segmentation fault instead of stack overflow #5064

vicuna opened this issue May 30, 2010 · 6 comments

Comments

@vicuna
Copy link

vicuna commented May 30, 2010

Original bug ID: 5064
Reporter: @edwintorok
Status: closed (set by @xavierleroy on 2013-08-31T10:46:33Z)
Resolution: fixed
Priority: normal
Severity: minor
Version: 3.11.2
Fixed in version: 3.13.0+dev
Category: ~DO NOT USE (was: OCaml general)
Duplicate of: #4297
Related to: #5485
Monitored by: mfp mehdi @edwintorok @ygrek @glondu "Julien Signoles" @mjambon gerd

Bug description

This is related to #4297 , however I didn't find a way to reopen that bug. Please consider fixing this, see a possible solution in the 'Additional Information' section of this bug.

I encountered a segmentation fault when running menhir (see http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=583300), and I reduced the problem down to the attached testcase.

Problem is that there is not enough stack space available to invoke the C function caml_equal, and ocaml's sigsegv handler only raises Stack_overflow for OCaml code. This is generally a good idea, since the C code could itself be responsible for the overflow, and there is no way to know that raising an OCaml exception (instead of segfaulting) won't result in inconsistent state for the C part.

However the equality comparison is part of the OCaml language (and any native C code invoked by standard libraries would be too), and I think it is important to have a robust protection against stack overflows. Functional programs are recursive by their nature, and getting a segfault due to a simple runaway recursion is not acceptable.
One might need to run some cleanup code in all cases (like releasing a file lock), having the Stack_overflow exception ensures that, a segfault doesn't.

Additional information

Here are some suggestions:

  1. check that there is always enough stack space left before invoking a C function. "enough" is tricky to define here, but I think we could have something minimal, like 8k or 16k ensured at all times.

  2. implement Continuation Passing Style transform. that way the OCaml code could run in constant system stack space, and the C part would have plenty of stack left always. Sure the continuation will eventually grow too big and allocation will fail. Recovering from a failed allocation is safer than recovering from a segfault

Implementing 1. is very simple, and doesn't require much additional code:
Just add code like this before every call to non-OCaml code:
subq $8192, %rsp
movq $0, 0(%rsp)
addq $8192, %rsp

Then if you don't have at least 8k left on the stack, you'll get a SIGSEGV at the 'movq', which will be considered as still part of the OCaml code, and you get a nice Stack_Overflow instead of a segfault.
You could emit this code only when '-unsafe' is not specified. If -unsafe is specified ocamlopt could generate code just like now.

I tested this modification, and it works!
After modification:
$ as -o po/y.o y.s
$ gcc -o a.out -L/usr/lib/ocaml po/camlstartupaa0114.o po/std_exit.o po/y.o /usr/lib/ocaml/stdlib.a /usr/lib/ocaml/libasmrun.a -lm -ldl
Fatal error: exception Stack_overflow

Before the modification to y.s I got the segfault. (I captured the .o files by using a bash script that I passed as -cc to ocamlopt).

File attachments

@vicuna
Copy link
Author

vicuna commented Jun 4, 2010

Comment author: @edwintorok

I have measured the performance and code size impact of implementing my suggestion #2.

There is no measurable performance difference (all differences are within the standard deviation, and are <1%)

  1. time make one DIR=tests/misc
    without patch: 80.91s avg, 0.04s stdev
    with patch: 80.99s avg, 0.3s stdev
    +0.07s, +0.09%
  2. time make one DIR=tests/misc-unsafe:
    without patch: 74.68s avg, 0.76s stdev
    with patch: 74.98avg, 0.24s stdev
    +0.28s, +0.38%
  3. time make world.opt:
    without patch: 188.73s, 1.43s stdev
    with patch: 187.661s, 1.55s stdev
    -1.12s, -0.59%

Code size for ocamlopt.opt:
without patch: 2481760
with patch: 2684640
+202880, +8.1%

I think this stack check could be eliminated when -unsafe is specified, and then you'd get back the original code size.

Measurements done on Debian unstable, x86_64, Intel(R) Core(TM)2 Quad CPU Q9550 @ 2.83GHz, 4GB RAM.

@vicuna
Copy link
Author

vicuna commented Jun 4, 2010

Comment author: @edwintorok

Attached the patch I used to test performance/codesize.
It is for amd64 only, I attached it only as an example.
It should be extended for other architectures, and take -unsafe into consideration.

@vicuna
Copy link
Author

vicuna commented Jun 1, 2011

Comment author: @damiendoligez

Using CPS is a bad idea here, you don't want to bog down the machine by filling the whole memory just because of one buggy program. Stacks are limited for a good reason.

@vicuna
Copy link
Author

vicuna commented Aug 16, 2011

Comment author: @ygrek

Here is another example (from caml-list, "Segmentation faults in unexpected places" by Peter Frey at Tue, 9 Aug 2011 18:16:00 -0400) :

$ cat a.ml

let stol len = (* create some bogus int list of length len *)
let head = ref [] in for i = 0 to pred len do head := i :: !head
done; !head

let () =
let limit = int_of_string (Sys.argv.(1)) in
let str_list = stol limit in
let stupid_copy = List.fold_right (fun x l -> x :: l) str_list [] in
Printf.printf"List copied with fold_right; len:%i\n" (List.length stupid_copy)

$ uname -rms
Linux 2.6.38-bpo.2-amd64 x86_64
$ ulimit -s
8192
$ ./a 261900
List copied with fold_right; len:261900
$ ./a 261900
Segmentation fault (core dumped)
$ ./a 261900
List copied with fold_right; len:261900
$ ./a 261900
Fatal error: exception Stack_overflow
$ ./a 261900
List copied with fold_right; len:261900
$ ./a 261900
List copied with fold_right; len:261900

Note the random crashing.

In this case offending C function is caml_call_gc and descendants :

Core was generated by `./a 261900'.
Program terminated with signal 11, Segmentation fault.
#0 0x000000000041f097 in caml_fl_allocate ()
(gdb) bt
#0 0x000000000041f097 in caml_fl_allocate ()
#1 0x0000000000413319 in caml_alloc_shr ()
#2 0x0000000000412538 in caml_oldify_one ()
#3 0x0000000000410d0c in caml_oldify_local_roots ()
#4 0x0000000000412750 in caml_empty_minor_heap ()
#5 0x0000000000412879 in caml_minor_collection ()
#6 0x0000000000411891 in caml_garbage_collection ()
#7 0x000000000041df16 in caml_call_gc ()
#8 0x00000000004061b4 in camlA__fun_1042 ()
#9 0x0000000000408fe2 in camlList__fold_right_1084 ()
#10 0x0000000000408fe2 in camlList__fold_right_1084 ()
#11 0x0000000000408fe2 in camlList__fold_right_1084 ()

@vicuna
Copy link
Author

vicuna commented Feb 16, 2012

Comment author: gerd

This seems to be difficult to solve. I think a strategy could be based on two ideas:

  • Edwin's idea of guaranteeing a certain minimum of free stack. I like
    this idea. This can even be optimized - you don't need to do the stack
    check before every call to C, but only at the beginning of every recursion,
    assuming that only a let rec can grow the stack too much. Even more
    fine-grained schemes could be worked out (e.g. do the check only at every
    entry or recursion point of a function, and only if the function contains
    calls to C).

    This measure would already protect the vast majority of C stubs.

  • C functions needing more stack space have to protect themselves. If the
    function is part of the Ocaml runtime, one could do stack checks at
    safe points where raising an exception does not hurt (we would need a
    global flag for marking such safe points, so that the sigsegv handler
    knows that). This would probably work well for compare, marshal, hash.
    I'm a bit wondering what's going wrong in the GC. AFAIK there are no
    unlimited recursions.

@vicuna
Copy link
Author

vicuna commented Feb 17, 2012

Comment author: @xavierleroy

I implemented a 90% complete fix (commits r12159 and r12160) whereas the stack "touch", checking that 4K words of stack are available and raising Stack_overflow otherwise, is performed in the asm glue between ocamlopt-generated code and the OCaml runtime system, namely in glue functions caml_c_call and caml_call_gc.

A good thing about this solution is that correct stack backtraces are generated, since caml_bottom_of_stack and caml_last_return_address are properly set up before the stack touch. Another good thing is that no extra code is generated by ocamlopt.

A weakness, compared with the original proposal, is that no stack touch is performed (yet) before calls to "noalloc" C functions, which do not go through caml_c_call. I need to run more experiments to assess the overhead of this solution, esp. when calling small functions from the math library.

The fix is currently implemented for x86 (32 and 64 bits) under Linux, BSD and MacOS X, as well as PowerPC under MacOS X. Testing on production code is most welcome.

The 4K words should be enough to run the GC and most of the OCaml runtime functions, with the notable exception of hashing, marshaling and unmarshaling, which are recursive. An iterative reimplementation of these functions can be considered but is nontrivial work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant