[PATCH] handle SIGBUS on shrinked Bigarray mapped files #5471

vicuna · 2012-01-10T13:42:13Z

Original bug ID: 5471
Reporter: @edwintorok
Assigned to: @protz
Status: closed (set by @xavierleroy on 2013-08-31T10:44:26Z)
Resolution: fixed
Priority: normal
Severity: minor
Platform: Linux
OS: Debian GNU/Linux
OS Version: Linux 3.2.0
Version: 3.12.1
Fixed in version: 3.13.0+dev
Category: ~DO NOT USE (was: OCaml general)
Monitored by: mehdi

Bug description

Accesses to bigarrays are bounds-checked, but the bounds for Bigarray.*.map_file
is established when the file is mapped.
Later the file may shrink or grow, if another (or the same) process modifies it.
If the file shrinks, and you attempt to read/write past the new end-of-file
the OS may raise SIGBUS (for example Linux does).

It is not possible to handle SIGBUS purely on OCaml side, because
attempting to set an OCaml signal handler would cause SIGBUS to be
re-delivered infinitely.

Attached patch (for ocaml trunk) implements SIGBUS handling in the byterun/asmrun runtimes.
It also introduces a new page flag In_mapped_area, that is set by bigarray.
This way we know the SIGBUS came from our mapping, and not from some other
native code mapping.

Patch note: The stdlib/printexc.ml change require an 'make coreboot' run.
If you 'git apply attachedpatch.patch' then it'll update boot/ for you.

Steps to reproduce

Compile the below code with ocamlc and ocamlopt
Run xbyte and xnat
Current results: both crash
$ ./xbyte (or ./xnat)
trying to read invalid mem area
Bus error
Uncomment the Sys.set_signal line, and run again. Now you got an infinite loop (SIGBUS delivered over and over again and the OCaml signal handle never reached).
Expected result: no crash, an OCaml exception is raised (like with stack overflow, bounds errors, etc.):

Expected xbyte output:

trying to read invalid mem area
exception caught: Bus error
Raised by primitive operation at unknown location
trying to write invalid mem area
Raised by primitive operation at unknown location
exception caught: Bus error

Expected xnat output:
trying to read invalid mem area
exception caught: Bus error
Raised by primitive operation at file "pervasives.ml", line 250, characters 21-28
trying to write invalid mem area
Raised by primitive operation at file "pervasives.ml", line 250, characters 21-28
exception caught: Bus error

Additional information

(* test code

ocamlc bigarray.cma unix.cma x.ml -o xbyte; ./xbyte
ocamlopt bigarray.cmxa unix.cmxa x.ml -o xnat; ./xnat
*)

open Bigarray
open Unix

exception Sigbus

let sigbus_handler _ =
raise Sigbus;;

File attachments

vicuna · 2012-01-10T13:44:25Z

Comment author: @edwintorok

P.S.: the patch doesn't raise Bus_error on Win32, but AFAIK Win32 locks mapped files, and you wouldn't be able to shrink it anyway.

vicuna · 2012-01-10T13:48:11Z

Comment author: @edwintorok

Added a patch without the boot/ changes for easier review:
http://caml.inria.fr/mantis/file_download.php?file_id=565&type=bug

vicuna · 2012-01-10T20:21:30Z

Comment author: gerd

Interesting idea, and I also ran already into this problem, so I'd really appreciate a solution.

Edwin, I don't understand why it is always safe to raise an exception when the bus error occurs. Even if we only look at the case this patch made for, namely trapping illegal bigarray accesses. If Ocaml code is running at this time, the code may be wrong for allowing exceptions at this point (probably easy to fix). If C code is running, everything can go wrong that may go wrong (e.g. memory may remain uninitialized). If we could recognize the latter case, it would be better.

There are also other reasons for SIGBUS, and they are quite platform-dependent. We should exclude these in the signal handler (somehow).

A safe solution would be probably to enable this special sigbus handler immediately before the bigarray access, and to disable it after the access. But this would make bigarray access a bit more expensive.

vicuna · 2012-01-10T20:52:49Z

Comment author: @edwintorok

btw for OCamlNet you'd need to set the In_mapped_area flag (perhaps wrapped
in an #ifdef In_mapped_area) for the file when mmap/munmap and you would get same behaviour as Bigarray (see patch comments).

I agree that we should check (in a platform specific way?) the SIGBUS reason, as it can also be unaligned access (and other, like stack overflow/underflow).
The asmrun/ patch already checks that the fault came from a Bigarray mapped file
(In_mapped_area page flag), so that only leaves two other fault possibilities:

unaligned access, but the compiler should already prevent this
permission problem, i.e. write to a PROT_READ page, or read from PROT_NONE page. Don't know if this can generate a SIGBUS on some platforms.

The byterun/ code only checks that the fault comes from Bigarray mapped file (In_mapped_area flag), as I don't know a way to check that the fault came from OCaml code (its the caml_ba_get_N and caml_ba_set_generic C functions that die in this case, or the copy* functions that they call). If this is too unsafe,
then we could set the sigbus handler only for native code.

Instead of raising the Bus_error (or Failure, or other exception) we could stat() the file to determine the new bounds, and update Bigarray's idea of what the bounds should be. And then raise a bounds violation error. But that seems like overkill for an otherwise rare condition.

vicuna · 2012-01-14T09:00:08Z

Comment author: @xavierleroy

Thanks for the suggestion and the clever patch. I like the idea of recording mmap-ed areas in Caml's master page table. I see a problem, though:

On ports that cache runtime variables in registers (like the amd64 port, the one we care most about), turning a signal into a synchronous exception is possible only if the signal occurs while executing ocamlopt-generated code. It's not possible if we're inside C code. Your patch correctly accounts for this. However, it means that invalid bigarray accesses performed from C will still crash the program on an unhandled SIGBUS.

Bytecode programs always go through the C functions caml_ba_{get,set}_* to access bigarrays. Native-code programs also go through these functions in a number of cases (high-dimension bigarrays and bigarrays whose type isn't fully statically known). It's only in a few special cases that the bigarray access can be inlined by ocamlopt.

I don't think there is any easy solution to this issue. The most reasonable thing might be to document the issue and tell users "don't do that" (memory-mapping a file whose size can change asynchronously).

vicuna · 2012-01-14T10:41:42Z

Comment author: @edwintorok

Most of the time the "don't do that" is outside the control of the OCaml application though, so it'd be nice if a solution can be found on the OCaml side.
Its a rare situation though - for a C application (ClamAV) it took about 6 years until the first SIGBUS crash got reported - so if a nice OCaml solution can't be found just documenting the issue would be fine.

I think we could set/reset a flag in caml_ba_{get,set} as gerd suggested,
and change the signal handler to check for that flag.
Then raise the exception even if we are not in OCaml code, but the flag is set. Except in that case it should also leave caml_young_ptr/caml_exception_pointer untouched. Same flag could be used in bytecode runtime to make sure the SIGBUS came from bigarray.

Actually all the signal handler setup and signal handling could be moved to Bigarray (as ocp suggested), leaving only the page-table changes in the byte/asm runtimes.

This should be safe:

caml_ba_{get,set} doesn't enter blocking section, so we know only one
thread runs it, and that no OCaml code is running at that time
caml_ba_{get,set} is not noalloc, so it is called via caml_c_call and the register-cached variables are saved to global variables (caml_young_ptr/exception_pointer)
all we have to do is to leave the global pointers alone in the signal handler as they already have the correct values, and just raise the exception

vicuna · 2012-01-18T10:05:44Z

Comment author: @protz

Hi edwin,

We had a lengthy discussion about that yesterday, and we feel like there's no good solution to this; trying to make things rights would take us too far.

Therefore, I have implemented you first suggestion, namely, documenting this behavior clearly, so that users are aware of this "shortcoming". This is r12042

Thanks,

jonathan

vicuna closed this as completed Aug 31, 2013

vicuna added the bug label Mar 20, 2019

gadmm mentioned this issue Oct 21, 2022

Add BigArray.Genarray.free #389

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[PATCH] handle SIGBUS on shrinked Bigarray mapped files #5471

[PATCH] handle SIGBUS on shrinked Bigarray mapped files #5471

vicuna commented Jan 10, 2012

vicuna commented Jan 10, 2012

vicuna commented Jan 10, 2012

vicuna commented Jan 10, 2012

vicuna commented Jan 10, 2012

vicuna commented Jan 14, 2012

vicuna commented Jan 14, 2012

vicuna commented Jan 18, 2012

[PATCH] handle SIGBUS on shrinked Bigarray mapped files #5471

[PATCH] handle SIGBUS on shrinked Bigarray mapped files #5471

Comments

vicuna commented Jan 10, 2012

Bug description

Steps to reproduce

Additional information

File attachments

vicuna commented Jan 10, 2012

vicuna commented Jan 10, 2012

vicuna commented Jan 10, 2012

vicuna commented Jan 10, 2012

vicuna commented Jan 14, 2012

vicuna commented Jan 14, 2012

vicuna commented Jan 18, 2012