map_file fails if requested length is larger than RAM+swap #5466

Closed
vicuna opened this issue Jan 5, 2012 · 4 comments

Comments


vicuna commented Jan 5, 2012

Original bug ID: 5466
Reporter: @mjambon
Status: closed (set by @xavierleroy on 2013-08-31T10:46:24Z)
Resolution: suspended
Priority: normal
Severity: feature
OS: Linux
OS Version: 2.6.34
Version: 3.12.1
Category: ~DO NOT USE (was: OCaml general)
Monitored by: Camarade_Tux @hcarty

Bug description

The map_file function as implemented for the Bigarray module always passes the PROT_WRITE flag in its call to mmap (otherlibs/bigarray/mmap_unix.c).

This causes the call to mmap to fail with the error "Cannot allocate memory" when the requested length exceeds a certain limit and the file is opened in read-only mode (O_RDONLY), in which case the mapping cannot be shared and must be private (MAP_PRIVATE). This limit seems to be slightly under the combined total of RAM and swap. I don't know whether there is a good reason for such a limit on writable mappings, but it should not be a problem with read-only mappings.

Using PROT_READ instead of PROT_READ|PROT_WRITE solves the problem, i.e. it allows mapping the entirety of a large file into a single array even if the file is larger than the available memory.

A workaround is to open the file in read-write mode and to enable sharing, but the file permissions may not always allow it.

Steps to reproduce

(*
Create a file test.dat that is larger than total memory+swap.
ocamlopt -o testmap unix.cmxa bigarray.cmxa testmap.ml

$ ./testmap
Fatal error: exception Sys_error("Cannot allocate memory")
*)
(* Read-only descriptor with a private (non-shared) mapping: fails. *)
let map_file fname =
  let fd = Unix.openfile fname [Unix.O_RDONLY] 0o600 in
  Bigarray.Array1.map_file fd Bigarray.char Bigarray.c_layout false (-1)

let () = ignore (map_file "test.dat")

(* Workaround: read-write descriptor with a shared mapping. *)
let map_file fname =
  let fd = Unix.openfile fname [Unix.O_RDWR] 0o600 in
  Bigarray.Array1.map_file fd Bigarray.char Bigarray.c_layout true (-1)

let () = ignore (map_file "test.dat")
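
The reproduction above needs a test.dat that is larger than RAM plus swap. One way to get such a file without actually writing that much data is to create a sparse file. The following is only a sketch: the helper name create_sparse_file is made up for illustration, and it assumes a filesystem that supports sparse files and files of the requested size.

```ocaml
(* Hypothetical helper, not part of any library: give [fname] an apparent
   size of [size] bytes by seeking to the last byte and writing one zero.
   Assumes the filesystem supports sparse files, so that the skipped
   range does not consume disk space. *)
let create_sparse_file fname size =
  if Int64.compare size 1L < 0 then
    invalid_arg "create_sparse_file: size must be at least 1";
  let oc = open_out_bin fname in
  (try
     (* LargeFile.seek_out takes an int64 offset, so sizes beyond the
        range of a 32-bit int are accepted. *)
     LargeFile.seek_out oc (Int64.pred size);
     output_char oc '\000'
   with e ->
     close_out_noerr oc;
     raise e);
  close_out oc

(* Example call (not executed here): a file with an apparent size of 1 TiB.
   create_sparse_file "test.dat" (Int64.shift_left 1L 40) *)
```

The size to request depends on the machine; it only has to exceed the total of RAM and swap mentioned in the bug description.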


vicuna commented Jan 5, 2012

Comment author: gerd

This looks like a bug in Linux or libc. SUS specifies: "If PROT_WRITE is specified, the application must have opened the file descriptor fildes with write permission unless MAP_PRIVATE is specified in the flags parameter as described below." I guess Linux silently assumes MAP_PRIVATE if the write permission is not given instead of returning EACCES (which is even documented in the Linux man page).


vicuna commented Jan 14, 2012

Comment author: @xavierleroy

The Bigarray library could support mapping files without PROT_WRITE, either via an optional parameter to the map_file functions, or automatically if the file is opened read-only. The problem I have with this solution is that any assignment to such a bigarray would crash the program with a SEGV. This doesn't bode well for a language that claims to be type-safe.


vicuna commented Jan 17, 2012

Comment author: @mjambon

I can imagine an "unsafe_map_file" function that would allow a call to mmap without PROT_WRITE as you suggest. Such a function would be clearly marked as unsafe and possibly hidden from the official documentation.

On top of that, some library could provide read-only access to bigarrays via a read-only (but not immutable) abstract type and the corresponding safe map_file function. It could be an extension of the Bigarray module (Bigarray.Array1.Read_only.t), it could be a new module Bigarray_read_only shipping with the bigarray library, or the task could be left to third-party libraries.
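
To make the idea of a read-only (but not immutable) abstract type concrete, here is a minimal sketch. The module name Read_only and the functions of_array1, dim and get are hypothetical, chosen only for illustration. The sketch wraps an ordinary in-memory bigarray, because the unsafe mapping primitive discussed above does not exist; it assumes an OCaml version where Bigarray is part of the standard library (otherwise link bigarray.cmxa).

```ocaml
(* Hypothetical read-only view over a one-dimensional bigarray.
   The signature hides the representation, so clients of this module
   get no operation that writes to the underlying memory. *)
module Read_only : sig
  type ('a, 'b, 'c) t
  (* For illustration only: wraps an existing, still writable array. *)
  val of_array1 : ('a, 'b, 'c) Bigarray.Array1.t -> ('a, 'b, 'c) t
  val dim : ('a, 'b, 'c) t -> int
  val get : ('a, 'b, 'c) t -> int -> 'a
end = struct
  type ('a, 'b, 'c) t = ('a, 'b, 'c) Bigarray.Array1.t
  let of_array1 a = a
  let dim a = Bigarray.Array1.dim a
  let get a i = Bigarray.Array1.get a i
end

let () =
  let a = Bigarray.Array1.create Bigarray.char Bigarray.c_layout 3 in
  Bigarray.Array1.fill a 'x';
  let ro = Read_only.of_array1 a in
  assert (Read_only.dim ro = 3);
  assert (Read_only.get ro 0 = 'x')
```

The view is read-only rather than immutable: whoever still holds the original array can modify it, and the change is visible through the view. A safe map_file along these lines would presumably perform the mapping inside the module and return only the abstract type, so that no writable handle to a PROT_READ mapping escapes.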


vicuna commented Feb 5, 2012

Comment author: @xavierleroy

At this point, I don't quite know what the best way to address this feature wish is, nor how far we should go to support read-only file mappings in Bigarray. So, I'm putting this PR in the "suspended" state. But feel free to experiment with some of the approaches mentioned and let us know of your findings.
