Mantis Bug Tracker

View Issue Details Jump to Notes ] Issue History ] Print ]
IDProjectCategoryView StatusDate SubmittedLast Update
0005064OCaml-OCaml generalpublic2010-05-30 20:382013-08-31 12:46
Assigned To 
PlatformOSOS Version
Product Version3.11.2 
Target VersionFixed in Version3.13.0+dev 
Summary0005064: ocamlopt generated code: segmentation fault instead of stack overflow
DescriptionThis is related to [^] , however I didn't find a way to reopen that bug. Please consider fixing this, see a possible solution in the 'Additional Information' section of this bug.

I encountered a segmentation fault when running menhir (see [^]), and I reduced the problem down to the attached testcase.

Problem is that there is not enough stack space available to invoke the C function caml_equal, and ocaml's sigsegv handler only raises Stack_overflow for OCaml code. This is generally a good idea, since the C code could itself be responsible for the overflow, and there is no way to know that raising an OCaml exception (instead of segfaulting) won't result in inconsistent state for the C part.

However the equality comparison is part of the OCaml language (and any native C code invoked by standard libraries would be too), and I think it is important to have a robust protection against stack overflows. Functional programs are recursive by their nature, and getting a segfault due to a simple runaway recursion is not acceptable.
One might need to run some cleanup code in all cases (like releasing a file lock), having the Stack_overflow exception ensures that, a segfault doesn't.
Additional InformationHere are some suggestions:
1. check that there is always enough stack space left before invoking a C function. "enough" is tricky to define here, but I think we could have something minimal, like 8k or 16k ensured at all times.

2. implement Continuation Passing Style transform. that way the OCaml code could run in constant system stack space, and the C part would have plenty of stack left always. Sure the continuation will eventually grow too big and allocation will fail. Recovering from a failed allocation is safer than recovering from a segfault

Implementing 1. is very simple, and doesn't require much additional code:
Just add code like this before every call to non-OCaml code:
    subq $8192, %rsp
    movq $0, 0(%rsp)
        addq $8192, %rsp

Then if you don't have at least 8k left on the stack, you'll get a SIGSEGV at the 'movq', which will be considered as still part of the OCaml code, and you get a nice Stack_Overflow instead of a segfault.
You could emit this code only when '-unsafe' is not specified. If -unsafe is specified ocamlopt could generate code just like now.

I tested this modification, and it works!
After modification:
$ as -o po/y.o y.s
$ gcc -o a.out -L/usr/lib/ocaml po/camlstartupaa0114.o po/std_exit.o po/y.o /usr/lib/ocaml/stdlib.a /usr/lib/ocaml/libasmrun.a -lm -ldl
Fatal error: exception Stack_overflow

Before the modification to y.s I got the segfault. (I captured the .o files by using a bash script that I passed as -cc to ocamlopt).
TagsNo tags attached.
Attached Files? file icon [^] (165 bytes) 2010-05-30 20:38 [Show Content]
patch file icon amd_checkstack.patch [^] (2,023 bytes) 2010-06-04 12:30 [Show Content]

- Relationships
duplicate of 0004297closedertai Crash/segmentation fault on x86_32 for a simple program 
related to 0005485closed Reducing the risk of segfault due to stack overflow 

-  Notes
edwin (reporter)
2010-06-04 12:29

I have measured the performance and code size impact of implementing my suggestion #2.

There is no measurable performance difference (all differences are within the standard deviation, and are <1%)
1. time make one DIR=tests/misc
  without patch: 80.91s avg, 0.04s stdev
  with patch: 80.99s avg, 0.3s stdev
  +0.07s, +0.09%
2. time make one DIR=tests/misc-unsafe:
  without patch: 74.68s avg, 0.76s stdev
  with patch: 74.98avg, 0.24s stdev
  +0.28s, +0.38%
3. time make world.opt:
  without patch: 188.73s, 1.43s stdev
  with patch: 187.661s, 1.55s stdev
  -1.12s, -0.59%

Code size for ocamlopt.opt:
  without patch: 2481760
  with patch: 2684640
  +202880, +8.1%

I think this stack check could be eliminated when -unsafe is specified, and then you'd get back the original code size.

Measurements done on Debian unstable, x86_64, Intel(R) Core(TM)2 Quad CPU Q9550 @ 2.83GHz, 4GB RAM.
edwin (reporter)
2010-06-04 12:32

Attached the patch I used to test performance/codesize.
It is for amd64 only, I attached it only as an example.
It should be extended for other architectures, and take -unsafe into consideration.
doligez (administrator)
2011-06-01 23:09

Using CPS is a bad idea here, you don't want to bog down the machine by filling the whole memory just because of one buggy program. Stacks are limited for a good reason.
ygrek (reporter)
2011-08-16 23:33

Here is another example (from caml-list, "Segmentation faults in unexpected places" by Peter Frey at Tue, 9 Aug 2011 18:16:00 -0400) :

$ cat

let stol len = (* create some bogus int list of length len *)
  let head = ref [] in for i = 0 to pred len do head := i :: !head
  done; !head

let () =
  let limit = int_of_string (Sys.argv.(1)) in
  let str_list = stol limit in
  let stupid_copy = List.fold_right (fun x l -> x :: l) str_list [] in
  Printf.printf"List copied with fold_right; len:%i\n" (List.length stupid_copy)

$ uname -rms
Linux 2.6.38-bpo.2-amd64 x86_64
$ ulimit -s
$ ./a 261900
List copied with fold_right; len:261900
$ ./a 261900
Segmentation fault (core dumped)
$ ./a 261900
List copied with fold_right; len:261900
$ ./a 261900
Fatal error: exception Stack_overflow
$ ./a 261900
List copied with fold_right; len:261900
$ ./a 261900
List copied with fold_right; len:261900

Note the random crashing.

In this case offending C function is caml_call_gc and descendants :

Core was generated by `./a 261900'.
Program terminated with signal 11, Segmentation fault.
#0 0x000000000041f097 in caml_fl_allocate ()
(gdb) bt
#0 0x000000000041f097 in caml_fl_allocate ()
#1 0x0000000000413319 in caml_alloc_shr ()
#2 0x0000000000412538 in caml_oldify_one ()
0000003 0x0000000000410d0c in caml_oldify_local_roots ()
0000004 0x0000000000412750 in caml_empty_minor_heap ()
0000005 0x0000000000412879 in caml_minor_collection ()
0000006 0x0000000000411891 in caml_garbage_collection ()
0000007 0x000000000041df16 in caml_call_gc ()
0000008 0x00000000004061b4 in camlA__fun_1042 ()
0000009 0x0000000000408fe2 in camlList__fold_right_1084 ()
0000010 0x0000000000408fe2 in camlList__fold_right_1084 ()
0000011 0x0000000000408fe2 in camlList__fold_right_1084 ()
gerd (reporter)
2012-02-16 14:35

This seems to be difficult to solve. I think a strategy could be based on two ideas:

- Edwin's idea of guaranteeing a certain minimum of free stack. I like
  this idea. This can even be optimized - you don't need to do the stack
  check before every call to C, but only at the beginning of every recursion,
  assuming that only a let rec can grow the stack too much. Even more
  fine-grained schemes could be worked out (e.g. do the check only at every
  entry or recursion point of a function, and only if the function contains
  calls to C).

  This measure would already protect the vast majority of C stubs.

- C functions needing more stack space have to protect themselves. If the
  function is part of the Ocaml runtime, one could do stack checks at
  safe points where raising an exception does not hurt (we would need a
  global flag for marking such safe points, so that the sigsegv handler
  knows that). This would probably work well for compare, marshal, hash.
  I'm a bit wondering what's going wrong in the GC. AFAIK there are no
  unlimited recursions.
xleroy (administrator)
2012-02-17 17:01

I implemented a 90% complete fix (commits r12159 and r12160) whereas the stack "touch", checking that 4K words of stack are available and raising Stack_overflow otherwise, is performed in the asm glue between ocamlopt-generated code and the OCaml runtime system, namely in glue functions caml_c_call and caml_call_gc.

A good thing about this solution is that correct stack backtraces are generated, since caml_bottom_of_stack and caml_last_return_address are properly set up before the stack touch. Another good thing is that no extra code is generated by ocamlopt.

A weakness, compared with the original proposal, is that no stack touch is performed (yet) before calls to "noalloc" C functions, which do not go through caml_c_call. I need to run more experiments to assess the overhead of this solution, esp. when calling small functions from the math library.

The fix is currently implemented for x86 (32 and 64 bits) under Linux, BSD and MacOS X, as well as PowerPC under MacOS X. Testing on production code is most welcome.

The 4K words should be enough to run the GC and most of the OCaml runtime functions, with the notable exception of hashing, marshaling and unmarshaling, which are recursive. An iterative reimplementation of these functions can be considered but is nontrivial work.

- Issue History
Date Modified Username Field Change
2010-05-30 20:38 edwin New Issue
2010-05-30 20:38 edwin File Added:
2010-06-04 12:29 edwin Note Added: 0005542
2010-06-04 12:30 edwin File Added: amd_checkstack.patch
2010-06-04 12:32 edwin Note Added: 0005543
2011-06-01 23:07 doligez Relationship added duplicate of 0004297
2011-06-01 23:09 doligez Note Added: 0005956
2011-06-01 23:09 doligez Severity crash => minor
2011-06-01 23:09 doligez Status new => acknowledged
2011-08-16 23:33 ygrek Note Added: 0006091
2012-02-16 11:16 xleroy Relationship added related to 0005485
2012-02-16 14:35 gerd Note Added: 0006931
2012-02-17 17:01 xleroy Note Added: 0006934
2012-02-17 17:01 xleroy Status acknowledged => resolved
2012-02-17 17:01 xleroy Resolution open => fixed
2012-02-17 17:01 xleroy Fixed in Version => 3.13.0+dev
2013-08-31 12:46 xleroy Status resolved => closed
2017-02-23 16:36 doligez Category OCaml general => -OCaml general

Copyright © 2000 - 2011 MantisBT Group
Powered by Mantis Bugtracker