Mantis Bug Tracker

ID: 0007185
Project: OCaml
Category: runtime system and C interface
View Status: public
Date Submitted: 2016-03-17 15:13
Last Update: 2018-07-25 23:48
Reporter: mlasson
Assigned To: doligez
Priority: low
Severity: feature
Reproducibility: always
Status: assigned
Resolution: open
Summary: 0007185: Is it possible to get rid of the "fatal" out of memory?
Description:
As you all know, there are essentially two ways a caml program can fail due to a lack of memory.

Either
  - it raises a catchable Out_of_memory exception; this is usually raised by a C binding that wants to report an allocation failure, or by the runtime when caml_alloc_shr fails while we are not in a minor collection,

  - or it stops with a fatal error that terminates the program, if caml_alloc_shr is not able to expand the heap in the middle of a minor collection.

The first situation can be exercised by allocating big chunks of memory:
  let _ = try
    ignore (Array.init Sys.max_array_length
       (fun k -> Array.init
         Sys.max_array_length (fun _ -> k)))
  with Out_of_memory -> Printf.printf "I've survived\n%!" (* Printed *)

whereas the second is likely to happen while reaching the limit with small increments of memory:
  let rec init acc n =
    if n >= 0 then init (n :: acc) (n - 1)
    else acc

  let _ = try
    ignore (init [] max_int)
  with Out_of_memory -> Printf.printf "I've survived\n%!" (* Not printed *)

In some applications, reaching the memory limits is a normal way to use a program (e.g. in scientific or financial computing it is natural to push a system to its limits, and these limits are often quite difficult to estimate without actually running the computation). In such cases, reporting the failure and its cause to the user in a decent way is much harder when the error is fatal.

Would it really be impossible to hack the GC to avoid fatal errors and always raise Out_of_memory?
While we are in the middle of a minor collection, could we somehow undo the unfinished job of the GC (i.e. put everything we've just copied into the major heap back into the minor heap) in order to raise the exception as if we had never tried to collect? Or has some information been definitively lost in the process?

Notes
(0015534)
frisch (developer)
2016-03-17 15:29

It seems we could indeed scan the minor heap linearly, detect blocks which have been copied to the major heap in order to copy them back. The hard part might be to undo the rewriting of pointers from other blocks into these moved blocks, since the information is not kept, AFAICT. Perhaps one could instead scan from the roots again.

Damien: do you think it is somehow doable?

While the problem is most noticeable on 32-bit architectures, it is also applicable to other systems with constraints on memory usage.
(0015535)
jacques-henri.jourdan (manager)
2016-03-17 15:48

Another solution, I think, would be to pre-allocate a block of memory for the major heap that is never touched, except when getting out of memory.

When encountering an out of memory, we:
1- use this "emergency" block to empty the minor heap
2- unwind the stack to find the exception handler, thus releasing local roots
3- re-run the major GC (and the compactor), until freeing memory
4- re-allocate the emergency region
5- launch the exception handler
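
The same idea can be approximated in user code, which may help to see the shape of the steps above. This is only a sketch of a user-level analogue, not the runtime change being proposed: it can only react to the catchable Out_of_memory exception, and the names reserve, reserve_size and with_reserve are hypothetical.

```ocaml
(* User-level analogue of the "emergency block" idea. All names here are
   hypothetical; nothing below is part of the OCaml runtime. *)

(* Size of the reserve in bytes; 1 MiB is an arbitrary, assumed value. *)
let reserve_size = 1 lsl 20

(* The reserve stays reachable through this reference until memory runs out. *)
let reserve : Bytes.t option ref = ref (Some (Bytes.create reserve_size))

let with_reserve (f : unit -> 'a) (on_oom : unit -> 'a) : 'a =
  try f ()
  with Out_of_memory ->
    (* Analogue of step 1: give up the reserve so later code has some room.
       Reaching this handler already unwound the stack (step 2). *)
    reserve := None;
    (* Analogue of step 3: full major collection followed by compaction. *)
    Gc.compact ();
    (* Analogue of step 4: try to rebuild the reserve; if that fails, carry
       on without one instead of failing again. *)
    (try reserve := Some (Bytes.create reserve_size)
     with Out_of_memory -> ());
    (* Analogue of step 5: run the handler. *)
    on_oom ()
```

A call would look like with_reserve compute (fun () -> report_failure ()), where compute and report_failure stand for the application's own functions. The fatal error raised in the middle of a minor collection still bypasses this wrapper entirely, which is the point of the request.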
(0015537)
frisch (developer)
2016-03-17 22:38

This is a very interesting approach. Damien: do you see any obstacle or downside to this approach (except that one "wastes" the equivalent of the minor heap size)?
(0015539)
lpw25 (developer)
2016-03-18 01:38

What happens if you can't reallocate the emergency buffer in step 4?
(0015540)
frisch (developer)
2016-03-18 09:02

> What happens if you can't reallocate the emergency buffer in step 4?

I can see several variants:

 - Immediately try to reallocate the buffer in step 4; if it fails, simply continue unwinding the stack until the next handler, and iterate. The downside is that this can silently drop handlers that would for instance log the error, or restore some invariants. Of course, if they need to allocate, the fact that they cannot be guaranteed to be executed in case of OOM condition is clear enough, but in case they manage to do their job without allocating, one can do better:

 - Instead of reallocating the buffer immediately, keep the allocation pointer to the top of the minor heap (even though it is empty) so that the next allocation will trigger the GC. Only reallocate at this point (and re-raise OOM if not possible). This preserves the expected semantics that the OOM exception is only raised at allocation points; and it guarantees that e.g. a try..finally block that does not allocate can always do its job.

 - Immediately try to reallocate the buffer in step 4; if it fails, abort the process, as is done today.

 - Immediately try to reallocate the buffer in step 4; if it fails, split the minor heap in two equal halves (the new minor heap, and the new emergency buffer). Back to previous step if the minor heap becomes too small.
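
The last variant can be written down as a small decision function. This is a toy model under stated assumptions, not runtime code: the type outcome, the function refill_emergency, the threshold min_minor_words and the try_alloc callback are all hypothetical, and reading "back to previous step" as "abort" is an interpretation.

```ocaml
(* Toy model of the "split the minor heap" variant. Sizes are in words.
   Every name below is hypothetical. *)

type outcome =
  | Reallocated of int   (* a fresh emergency buffer of this size was obtained *)
  | Split of int * int   (* new minor heap size, new emergency buffer size *)
  | Abort                (* minor heap too small to split: give up *)

(* Smallest minor heap we accept after a split; an assumed threshold. *)
let min_minor_words = 4096

(* [try_alloc n] stands for an attempt to obtain [n] words of major heap. *)
let refill_emergency ~minor_words ~try_alloc =
  if try_alloc minor_words then Reallocated minor_words
  else
    let half = minor_words / 2 in
    if half >= min_minor_words then Split (half, half)
    else Abort
```

For example, with a minor heap of 262144 words and an allocator that always fails, the function returns Split (131072, 131072); each further failure halves the minor heap again until the threshold is reached.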
(0016826)
shinwell (developer)
2016-12-08 10:13

As far as I recall we were considering turning _all_ out-of-memory errors into fatal errors to reduce the number of possibilities for asynchronous exceptions. (cf. GPR#852).

@doligez Can you comment?

Issue History
Date Modified     Username               Field                 Change
2016-03-17 15:13  mlasson                New Issue
2016-03-17 15:29  frisch                 Note Added: 0015534
2016-03-17 15:48  jacques-henri.jourdan  Note Added: 0015535
2016-03-17 22:38  frisch                 Note Added: 0015537
2016-03-18 01:38  lpw25                  Note Added: 0015539
2016-03-18 09:02  frisch                 Note Added: 0015540
2016-12-08 10:13  shinwell               Note Added: 0016826
2016-12-08 10:13  shinwell               Assigned To           => doligez
2016-12-08 10:13  shinwell               Status                new => assigned
2016-12-08 10:13  shinwell               Status                assigned => acknowledged
2016-12-08 10:13  shinwell               Status                acknowledged => assigned
2017-02-23 16:43  doligez                Category              OCaml runtime system => runtime system
2017-03-03 17:45  doligez                Category              runtime system => runtime system and C interface