|View Issue Details|
|ID||Project||Category||View Status||Date Submitted||Last Update|
|0007100||OCaml||otherlibs||public||2015-12-18 11:24||2018-11-09 14:31|
|Platform||OS||Linux and Mirage||OS Version|
|Target Version||Fixed in Version||4.08.0+dev|
|Summary||0007100: Bigarray's caml_ba_alloc doesn't try GC if malloc fails|
|Description||If there happens to be no memory available when allocating a bigarray because a GC is due then it raises Out_of_memory, even if memory would be available after GC.|
|Steps To Reproduce||This program crashes with "Fatal error: exception Out_of_memory" if run in an environment with limited memory (so that malloc may return null; tested with "ulimit -Sv 52000"):|
open Bigarray

let () =
  let rec loop () =
    let x = Array1.create Char c_layout 102400 in
    loop () in
  loop ()
However, it works with an explicit call to the GC:
let () =
  let rec loop () =
    let x =
      try Array1.create Char c_layout 102400
      with Out_of_memory ->
        Gc.full_major ();
        Array1.create Char c_layout 102400 in
    loop () in
  loop ()
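The retry pattern above can be factored into a small helper. This is only a sketch: `with_gc_retry` is a hypothetical name, not part of the OCaml standard library, and it retries once, so a second failure still raises Out_of_memory.

```ocaml
(* Hypothetical helper: run [alloc]; if it raises Out_of_memory, force a
   full major collection and try exactly once more. *)
let with_gc_retry (alloc : unit -> 'a) : 'a =
  try alloc ()
  with Out_of_memory ->
    Gc.full_major ();
    alloc ()
```

It would be used as `with_gc_retry (fun () -> Array1.create Char c_layout 102400)`, assuming Bigarray is opened as in the reproduction program.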
|Additional Information||MirageOS uses bigarrays extensively (via Cstruct), and this causes MirageOS unikernels to crash from time to time.|
|Tags||No tags attached.|
|The problem with "triggering a GC" is that you can easily get into a state where every allocation triggers a GC and the program gets bogged down to the speed of a snail, which is worse than crashing.|
Isn't that just how GC works? You run out of memory and then run a GC. Not running a GC because it *might* not free memory makes no sense to me (crashing afterwards if it fails might be OK though).
If OCaml doesn't run the GC when it runs out of memory, then applications have to do it instead. For example, we currently have:
Each time we get a network packet, we check the memory situation. If less than 10% is free, we Gc.full_major. Compared to having OCaml do it, this means:
1. We become slow at close to 90% used, rather than close to 100%.
2. Sometimes we still crash (more margin => less chance of crash, but more RAM wasted).
3. Every input event (incoming packet, user commands, etc) needs to run the check.
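A minimal sketch of that kind of application-level check, not the actual Mirage code: `gc_if_low`, `on_packet` and the `free_fraction` callback are hypothetical names, and how free memory is measured is platform-specific and left to the caller.

```ocaml
(* Hypothetical per-event guard. [free_fraction] is an assumed callback
   returning the share of memory currently free, between 0.0 and 1.0. *)
let gc_if_low ?(threshold = 0.10) (free_fraction : unit -> float) : bool =
  if free_fraction () < threshold then begin
    Gc.full_major ();
    true   (* a full major collection was forced *)
  end else
    false

(* Every input handler has to remember to run the guard first. *)
let on_packet ~free_fraction handle packet =
  ignore (gc_if_low free_fraction);
  handle packet
```

The sketch makes the three drawbacks visible: the threshold is a guess, an allocation between two checks can still fail, and every entry point must call the guard.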
I think the problem with MirageOS's use of bigarrays is more the value of CAML_BA_MAX_MEMORY (1 GB), which is very far from the amount of memory one typically wants to spend on bigarrays in a microkernel (which would often run with 256 MB of RAM or even less).
Especially given the terrible page allocator of minios, which can only allocate a power-of-two number of pages for large allocations.
Therefore, MirageOS is going to malloc, say, 32 KiB for a 20 KiB cstruct, and tell the GC that "unless I have allocated 50000 such blocks there is no need to run garbage collection".
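The arithmetic behind that figure can be checked directly. This sketch takes the comment's simplified model at face value (the GC counts the 20 KiB requested, the allocator hands out 32 KiB) and assumes a 64-bit OCaml, where these values fit in an int; all names are illustrative.

```ocaml
let kib = 1024
let ba_max_memory = 1024 * 1024 * 1024   (* the 1 GB limit quoted above *)
let requested = 20 * kib   (* size the program asks for *)
let rounded = 32 * kib     (* size consumed after power-of-two rounding *)

(* Number of blocks needed to reach the limit, counted at requested size. *)
let blocks_before_gc = ba_max_memory / requested

(* Memory really consumed by that many blocks. *)
let real_bytes = blocks_before_gc * rounded
```

Here `blocks_before_gc` is 52428, in line with the "50000" estimate, and `real_bytes` is about 1.6 GiB, several times the 256 MB such a system might have.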
I think we or the Mirage people need to do something to address this issue, it's just unclear to me what needs to be done. @doligez could you please restart the discussion?
@talex, OCaml's GC doesn't work like that because it's incremental: it tries to do enough work, as the program is running, to make sure it won't ever run out of memory. When the program does run out of memory, we assume it means the program is allocating faster than it is dropping objects, which means its memory needs are increasing, so we increase the heap size.
For the CAML_BA_MAX_MEMORY problem, I have a possible solution: instead of using a constant, use a proportion of the heap size, set by the user or by the program. For example, if you set it at 100%, it means you are allocating half your memory to the heap, and the other half to bigarrays (along with other external data, if you use other libraries with custom objects).
Would that be a workable solution?
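To make the proposal concrete, here is a sketch of how a proportional budget could be computed. The function names and the percentage parameter are hypothetical and do not describe any actual runtime setting; only `Gc.quick_stat` and `Sys.word_size` are real standard-library calls.

```ocaml
(* Hypothetical: external-memory budget as a percentage of the heap size. *)
let external_budget_bytes ~heap_bytes ~ratio_percent =
  heap_bytes * ratio_percent / 100

(* Fraction of total memory (heap + external budget) left to the heap. *)
let heap_share ~ratio_percent =
  100. /. (100. +. float_of_int ratio_percent)

(* Current major heap size in bytes, from the runtime's own statistics. *)
let current_heap_bytes () =
  (Gc.quick_stat ()).Gc.heap_words * (Sys.word_size / 8)
```

With `ratio_percent` set to 100, `heap_share` is 0.5, matching the "half to the heap, half to bigarrays" example, and the budget grows and shrinks with the heap instead of staying fixed at 1 GB.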
doligez: yes, I was confused when I wrote this. I was imagining that OCaml's GC worked like Java's.
rixed pointed out in https://github.com/mirage/io-page/issues/38 that Mirage's io-page does not tell the GC how much memory a collection could free, so that could be a big part of the problem. We should probably fix that and reopen this issue if that doesn't fix it.
I think there is a real problem here: caml_alloc_custom does not have any specific logic to trigger a minor GC when too much "external" memory is used by custom blocks in the minor heap. One can thus easily allocate a lot of memory with e.g. bigarrays, and reach an OOM, before the GC even triggers. This does not even depend on the value of CAML_BA_MAX_MEMORY.
It seems one would need some logic to keep track of the "size" of external memory used by custom blocks in the minor heap (i.e. the mem/max arguments to caml_alloc_custom) and force a minor GC when a given threshold is reached.
|Alternatively, one could put a limit to the "external size" of custom blocks allocated in the minor heap. For instance, it makes sense to allocate "small float bigarrays" in the minor heap, but for large ones, the benefit is less clear.|
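The threshold idea can be modelled at user level as follows. This is only an illustration with hypothetical names: the real bookkeeping would live in the C runtime, inside caml_alloc_custom, and this sketch is not the code of any actual patch.

```ocaml
(* Hypothetical model of tracking external memory held by young blocks. *)
type tracker = {
  mutable pending : int;  (* external bytes attributed to young custom blocks *)
  threshold : int;        (* force a minor GC once [pending] exceeds this *)
  mutable forced : int;   (* number of minor collections forced so far *)
}

let make_tracker ~threshold = { pending = 0; threshold; forced = 0 }

(* Record [bytes] of external memory; collect when over the threshold. *)
let note_external t bytes =
  t.pending <- t.pending + bytes;
  if t.pending > t.threshold then begin
    Gc.minor ();
    (* After a minor GC the minor heap is empty, so the count restarts. *)
    t.pending <- 0;
    t.forced <- t.forced + 1
  end
```

With a threshold of 100 bytes, two calls of 60 bytes each force one minor collection on the second call. The alternative of a size limit would instead bypass the minor heap for any block whose external size exceeds a cutoff.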
|https://github.com/ocaml/ocaml/pull/1738 seems to fix the problem.|
|2015-12-18 11:24||talex||New Issue|
|2016-01-22 17:25||doligez||Note Added: 0015266|
|2016-01-22 17:25||doligez||Severity||crash => major|
|2016-01-22 17:25||doligez||Target Version||=> 4.03.1+dev|
|2016-01-23 18:28||talex||Note Added: 0015268|
|2016-11-30 21:27||rixed||Note Added: 0016616|
|2017-02-16 11:03||xleroy||Note Added: 0017277|
|2017-02-16 11:03||xleroy||Status||new => acknowledged|
|2017-02-16 11:03||xleroy||Target Version||4.03.1+dev => 4.06.0 +dev/beta1/beta2/rc1|
|2017-02-23 16:42||doligez||Category||OCaml otherlibs => otherlibs|
|2017-03-10 11:22||shinwell||Assigned To||=> doligez|
|2017-03-10 11:22||shinwell||Status||acknowledged => assigned|
|2017-10-05 17:55||doligez||Note Added: 0018488|
|2017-10-05 17:55||doligez||Target Version||4.06.0 +dev/beta1/beta2/rc1 =>|
|2017-10-05 18:07||doligez||Relationship added||related to 0007198|
|2017-10-05 18:07||doligez||Relationship added||related to 0007180|
|2017-10-05 18:09||doligez||Relationship added||related to 0007158|
|2017-10-06 16:47||talex||Note Added: 0018495|
|2017-11-13 14:52||yallop||Relationship added||has duplicate 0007670|
|2017-11-13 15:10||frisch||Note Added: 0018654|
|2017-11-13 15:12||frisch||Note Added: 0018655|
|2017-11-13 15:52||frisch||Note Added: 0018656|
|2017-11-13 16:15||frisch||Relationship added||related to 0007671|
|2018-11-05 16:45||doligez||Note Added: 0019435|
|2018-11-09 14:31||frisch||Status||assigned => resolved|
|2018-11-09 14:31||frisch||Fixed in Version||=> 4.08.0+dev|
|2018-11-09 14:31||frisch||Resolution||open => fixed|