Mantis Bug Tracker

View Issue Details Jump to Notes ] Issue History ] Print ]
IDProjectCategoryView StatusDate SubmittedLast Update
0005471OCamlOCaml generalpublic2012-01-10 14:422013-08-31 12:44
Reporteredwin 
Assigned Toprotz 
PrioritynormalSeverityminorReproducibilityalways
StatusclosedResolutionfixed 
PlatformLinuxOSDebian GNU/LinuxOS VersionLinux 3.2.0
Product Version3.12.1 
Target VersionFixed in Version3.13.0+dev 
Summary0005471: [PATCH] handle SIGBUS on shrinked Bigarray mapped files
DescriptionAccesses to bigarrays are bounds-checked, but the bounds for Bigarray.*.map_file
is established when the file is mapped.
Later the file may shrink or grow, if another (or the same) process modifies it.
If the file shrinks, and you attempt to read/write past the new end-of-file
the OS may raise SIGBUS (for example Linux does).

It is not possible to handle SIGBUS purely on OCaml side, because
attempting to set an OCaml signal handler would cause SIGBUS to be
re-delivered infinitely.

Attached patch (for ocaml trunk) implements SIGBUS handling in the byterun/asmrun runtimes.
It also introduces a new page flag In_mapped_area, that is set by bigarray.
This way we know the SIGBUS came from our mapping, and not from some other
native code mapping.

Patch note: The stdlib/printexc.ml change require an 'make coreboot' run.
If you 'git apply attachedpatch.patch' then it'll update boot/ for you.
Steps To Reproduce1. Compile the below code with ocamlc and ocamlopt
2. Run xbyte and xnat
3. Current results: both crash
$ ./xbyte (or ./xnat)
trying to read invalid mem area
Bus error

4. Uncomment the Sys.set_signal line, and run again. Now you got an infinite loop (SIGBUS delivered over and over again and the OCaml signal handle never reached).

5. Expected result: no crash, an OCaml exception is raised (like with stack overflow, bounds errors, etc.):

Expected xbyte output:

trying to read invalid mem area
exception caught: Bus error
Raised by primitive operation at unknown location
trying to write invalid mem area
Raised by primitive operation at unknown location
exception caught: Bus error

Expected xnat output:
trying to read invalid mem area
exception caught: Bus error
Raised by primitive operation at file "pervasives.ml", line 250, characters 21-28
trying to write invalid mem area
Raised by primitive operation at file "pervasives.ml", line 250, characters 21-28
exception caught: Bus error
Additional Information(* test code
 * ocamlc bigarray.cma unix.cma x.ml -o xbyte; ./xbyte
 * ocamlopt bigarray.cmxa unix.cmxa x.ml -o xnat; ./xnat
 *)

open Bigarray
open Unix

exception Sigbus

let sigbus_handler _ =
  raise Sigbus;;

let _ =
  Printexc.record_backtrace true;
(* Sys.set_signal 7 (Sys.Signal_handle sigbus_handler);*)
    close (openfile "/tmp/x" [O_CREAT] 0o644);
    truncate "/tmp/x" 1024;
    let fd = openfile "/tmp/x" [ O_RDWR ] 0 in
    let m = Array1.map_file fd char c_layout false (-1) in
    Printf.printf "%c" m.{0};
    ftruncate fd 0;
    begin try
      Printf.printf "trying to read invalid mem area\n";
        flush_all ();
      Printf.printf "%c" m.{0}
    with
    | e ->
        Printf.printf "exception caught: %s\n" (Printexc.to_string e);
        Printexc.print_backtrace Pervasives.stdout;
        flush_all ();
    end;
    begin try
      Printf.printf "trying to write invalid mem area\n";
      Printexc.print_backtrace Pervasives.stdout;
      flush_all ();
      m.{0} <- 'a';
    with
    | e ->
        Printf.printf "exception caught: %s\n" (Printexc.to_string e);
        flush_all ();
    end;;
TagsNo tags attached.
Attached Filespatch file icon 0001-Handle-SIGBUS-when-underlying-mmap-ed-file-is-shrink.patch [^] (525,743 bytes) 2012-01-10 14:42
patch file icon Handle-SIGBUS-when-underlying-mmap-ed-file-is-shrink.patch [^] (13,793 bytes) 2012-01-10 14:47 [Show Content]

- Relationships

-  Notes
(0006637)
edwin (reporter)
2012-01-10 14:44

P.S.: the patch doesn't raise Bus_error on Win32, but AFAIK Win32 locks mapped files, and you wouldn't be able to shrink it anyway.
(0006638)
edwin (reporter)
2012-01-10 14:48

Added a patch without the boot/ changes for easier review:
http://caml.inria.fr/mantis/file_download.php?file_id=565&type=bug [^]
(0006640)
gerd (reporter)
2012-01-10 21:21

Interesting idea, and I also ran already into this problem, so I'd really appreciate a solution.

Edwin, I don't understand why it is always safe to raise an exception when the bus error occurs. Even if we only look at the case this patch made for, namely trapping illegal bigarray accesses. If Ocaml code is running at this time, the code may be wrong for allowing exceptions at this point (probably easy to fix). If C code is running, everything can go wrong that may go wrong (e.g. memory may remain uninitialized). If we could recognize the latter case, it would be better.

There are also other reasons for SIGBUS, and they are quite platform-dependent. We should exclude these in the signal handler (somehow).

A safe solution would be probably to enable this special sigbus handler immediately before the bigarray access, and to disable it after the access. But this would make bigarray access a bit more expensive.
(0006641)
edwin (reporter)
2012-01-10 21:52

btw for OCamlNet you'd need to set the In_mapped_area flag (perhaps wrapped
in an #ifdef In_mapped_area) for the file when mmap/munmap and you would get same behaviour as Bigarray (see patch comments).

I agree that we should check (in a platform specific way?) the SIGBUS reason, as it can also be unaligned access (and other, like stack overflow/underflow).
The asmrun/ patch already checks that the fault came from a Bigarray mapped file
(In_mapped_area page flag), so that only leaves two other fault possibilities:
 - unaligned access, but the compiler should already prevent this
 - permission problem, i.e. write to a PROT_READ page, or read from PROT_NONE page. Don't know if this can generate a SIGBUS on some platforms.

The byterun/ code only checks that the fault comes from Bigarray mapped file (In_mapped_area flag), as I don't know a way to check that the fault came from OCaml code (its the caml_ba_get_N and caml_ba_set_generic C functions that die in this case, or the copy* functions that they call). If this is too unsafe,
then we could set the sigbus handler only for native code.

Instead of raising the Bus_error (or Failure, or other exception) we could stat() the file to determine the new bounds, and update Bigarray's idea of what the bounds should be. And then raise a bounds violation error. But that seems like overkill for an otherwise rare condition.
(0006673)
xleroy (administrator)
2012-01-14 10:00

Thanks for the suggestion and the clever patch. I like the idea of recording mmap-ed areas in Caml's master page table. I see a problem, though:

On ports that cache runtime variables in registers (like the amd64 port, the one we care most about), turning a signal into a synchronous exception is possible only if the signal occurs while executing ocamlopt-generated code. It's not possible if we're inside C code. Your patch correctly accounts for this. However, it means that invalid bigarray accesses performed from C will still crash the program on an unhandled SIGBUS.

Bytecode programs always go through the C functions caml_ba_{get,set}_* to access bigarrays. Native-code programs also go through these functions in a number of cases (high-dimension bigarrays and bigarrays whose type isn't fully statically known). It's only in a few special cases that the bigarray access can be inlined by ocamlopt.

I don't think there is any easy solution to this issue. The most reasonable thing might be to document the issue and tell users "don't do that" (memory-mapping a file whose size can change asynchronously).
(0006676)
edwin (reporter)
2012-01-14 11:41
edited on: 2012-01-14 11:42

Most of the time the "don't do that" is outside the control of the OCaml application though, so it'd be nice if a solution can be found on the OCaml side.
Its a rare situation though - for a C application (ClamAV) it took about 6 years until the first SIGBUS crash got reported - so if a nice OCaml solution can't be found just documenting the issue would be fine.

I think we could set/reset a flag in caml_ba_{get,set} as gerd suggested,
and change the signal handler to check for that flag.
Then raise the exception even if we are not in OCaml code, but the flag is set. Except in that case it should also leave caml_young_ptr/caml_exception_pointer untouched. Same flag could be used in bytecode runtime to make sure the SIGBUS came from bigarray.

Actually all the signal handler setup and signal handling could be moved to Bigarray (as ocp suggested), leaving only the page-table changes in the byte/asm runtimes.

This should be safe:
 - caml_ba_{get,set} doesn't enter blocking section, so we know only one
thread runs it, and that no OCaml code is running at that time
 - caml_ba_{get,set} is not noalloc, so it is called via caml_c_call and the register-cached variables are saved to global variables (caml_young_ptr/exception_pointer)
 - all we have to do is to leave the global pointers alone in the signal handler as they already have the correct values, and just raise the exception

(0006721)
protz (manager)
2012-01-18 11:05

Hi edwin,

We had a lengthy discussion about that yesterday, and we feel like there's no good solution to this; trying to make things rights would take us too far.

Therefore, I have implemented you first suggestion, namely, documenting this behavior clearly, so that users are aware of this "shortcoming". This is r12042

Thanks,

jonathan

- Issue History
Date Modified Username Field Change
2012-01-10 14:42 edwin New Issue
2012-01-10 14:42 edwin File Added: 0001-Handle-SIGBUS-when-underlying-mmap-ed-file-is-shrink.patch
2012-01-10 14:44 edwin Note Added: 0006637
2012-01-10 14:47 edwin File Added: Handle-SIGBUS-when-underlying-mmap-ed-file-is-shrink.patch
2012-01-10 14:48 edwin Note Added: 0006638
2012-01-10 21:21 gerd Note Added: 0006640
2012-01-10 21:52 edwin Note Added: 0006641
2012-01-14 10:00 xleroy Note Added: 0006673
2012-01-14 10:00 xleroy Status new => feedback
2012-01-14 11:41 edwin Note Added: 0006676
2012-01-14 11:41 edwin Status feedback => new
2012-01-14 11:42 edwin Note Edited: 0006676 View Revisions
2012-01-18 11:05 protz Note Added: 0006721
2012-01-18 11:05 protz Status new => resolved
2012-01-18 11:05 protz Fixed in Version => 3.13.0+dev
2012-01-18 11:05 protz Resolution open => fixed
2012-01-18 11:05 protz Assigned To => protz
2013-08-31 12:44 xleroy Status resolved => closed


Copyright © 2000 - 2011 MantisBT Group
Powered by Mantis Bugtracker