Mantis Bug Tracker

View Issue Details Jump to Notes ] Issue History ] Print ]
IDProjectCategoryView StatusDate SubmittedLast Update
0007676OCamlruntime system and C interfacepublic2017-11-22 16:002018-11-13 13:31
Reporterfrisch 
Assigned Todoligez 
PrioritynormalSeverityminorReproducibilityhave not tried
StatusresolvedResolutionfixed 
PlatformOSOS Version
Product Version 
Target VersionFixed in Version4.08.0+dev 
Summary0007676: Allocating custom blocks become expensive with large major heaps
DescriptionCustom blocks allocated in or moved to the major heap add some "pressure" to trigger a major GC slice. The current criterion is such that a GC slice is triggered more often if the size of the major heap grows. This can result is noticeable slowdown.

The function implementing this strategy is:

CAMLexport void caml_adjust_gc_speed (mlsize_t res, mlsize_t max)
{
  if (max == 0) max = 1;
  if (res > max) res = max;
  caml_extra_heap_resources += (double) res / (double) max;
  if (caml_extra_heap_resources > 1.0){
    CAML_INSTR_INT ("request_major/adjust_gc_speed_1@", 1);
    caml_extra_heap_resources = 1.0;
    caml_request_major_slice ();
  }
  if (caml_extra_heap_resources
           > (double) caml_minor_heap_wsz / 2.0
             / (double) caml_stat_heap_wsz) {
    CAML_INSTR_INT ("request_major/adjust_gc_speed_2@", 1);
    caml_request_major_slice ();
  }
}


The second criterion becomes easier and easier to satisfy as the size of the major heap grows. It is not clear to me why the ratio of the size of the minor and major heaps is taken into account, nor why the constant 2 was chosen.

See below for a program that illustrates the behavior. It runs two computations starting from different heap sizes. The first one allocates a large block (in the major heap), then does many small allocations (to trigger a minor GC). The second one create an in_channel (which is a custom block, with a "pressure ratio" of 1/1000, i.e. one asks to force a full GC cycle every 1000 allocations). The timings are:

n =     2000 => 0.22s / 0.23s
n =     4000 => 0.22s / 0.35s
n =     6000 => 0.23s / 0.47s
n =     8000 => 0.22s / 0.73s
n =    10000 => 0.26s / 0.88s
n =    12000 => 0.23s / 1.06s
n =    14000 => 0.23s / 1.02s
n =    16000 => 0.24s / 1.19s
n =    18000 => 0.23s / 1.35s
n =    20000 => 0.25s / 1.30s
n =    22000 => 0.28s / 1.55s
n =    24000 => 0.25s / 1.83s
n =    26000 => 0.26s / 1.60s
n =    28000 => 0.23s / 1.88s
n =    30000 => 0.23s / 1.82s
n =    32000 => 0.25s / 2.79s
n =    34000 => 0.25s / 2.12s
n =    36000 => 0.23s / 2.66s
n =    38000 => 0.21s / 2.52s
n =    40000 => 0.25s / 2.38s


n is proportional to the size of the initial heap. One sees clearly that the speed of the first computation does not depend on the heap size, while the speed of the second one is directly impacted.

The marginal cost of opening a channel raises to about 2ms and would continue to grow with larger major heaps.

This specific problem with in_channel could be addressed in part by reducing the ratio 1/1000, but the real issue here might be with the criterion above, which impacts all kinds of custom objects (that add "pressure").
Additional Information
let a = ref [||]

let test n =
  (* Force the major heap to grow *)
  Gc.compact ();
  a := Array.init n (fun _ -> Array.make 10000 0);
  Gc.compact ();
  let t0 = Sys.time () in
  (* First, try with allocating an array -> the speed of
     this is independent of the initial size of the major heap. *)
  for i = 1 to 1000 do
    let a = Array.make 4096 0 in
    (* Enough allocation to trigger a minor GC *)
    for _j = 1 to 1024 do ignore (Array.make 256 0) done;
    a.(0) <- i
  done;
  let t1 = Sys.time () in
  Gc.compact ();
  let t2 = Sys.time () in
  (* Second, try with allocating a custom (here, an in_channel) ->
     this becomes slower as the heap grows. *)
  for i = 1 to 1000 do
    let ic = open_in "foobar" in
    (* Enough allocation to trigger a minor GC *)
    for _j = 1 to 1024 do ignore (Array.make 256 0) done;
    close_in ic;
  done;
  let t3 = Sys.time () in
  Printf.printf "n = % 8i => %.02fs / %.02fs\n%!" n (t1 -. t0) (t3 -. t2)


let () =
  let oc = open_out "foobar" in close_out oc;
  for i = 1 to 20 do
    test (i * 2000)
  done
TagsNo tags attached.
Attached Files

- Relationships
related to 0006294assignedgarrigue Poor tracking of extra heap resources 
related to 0004108assigneddoligez Memory-mapping of bigarrays may exhaust address space 
related to 0006962acknowledged bigarray lacks a free-like call. 
related to 0006133acknowledged Wish: make max size of custom space runtime-configurable 

-  Notes
(0018761)
xleroy (administrator)
2017-12-20 18:06

Is this another incarnation of the familiar "allocation of custom objects mess up the speed of the major GC" problem? Or a distinct, new problem?
(0018765)
frisch (developer)
2017-12-21 08:59

> Is this another incarnation of the familiar "allocation of custom objects mess up the speed of the major GC" problem? Or a distinct, new problem?

Of course it's in the same family of problems, but I believe it's distinct from other reports. What I find specifically weird here is the second condition:

   (caml_extra_heap_resources > (double) caml_minor_heap_wsz / 2.0 / (double) caml_stat_heap_wsz)

which can ultimately lead to a major GC for every allocation of a custom block. I don't understand the rationale behind computing the ratio of the size of the two heaps and comparing it with "caml_extra_heap_resources".

The first condition, OTOH, seems well understood (including its problems), and corresponds to the documented behavior.

Other tickets are about:

  - tracking custom blocks in the minor heap
  - allowing to specify the pressure on the GC as a ratio of the major heap size
  - making it easy for libraries to give control to their client code about the ratio to be used for its custom blocks
(0019440)
frisch (developer)
2018-11-09 14:20

The sample program now exihibits a much nicer behavior thanks to https://github.com/ocaml/ocaml/pull/1738 [^] . That said, I believe it would still be worth revisiting the condition

  if (caml_extra_heap_resources
           > (double) caml_minor_heap_wsz / 2.0
             / (double) caml_stat_heap_wsz)

which looks super weird to me, so I'm leaving this ticket open.
(0019457)
doligez (administrator)
2018-11-13 13:30

https://github.com/ocaml/ocaml/pull/2144 [^]

- Issue History
Date Modified Username Field Change
2017-11-22 16:00 frisch New Issue
2017-12-20 18:03 xleroy Status new => acknowledged
2017-12-20 18:03 xleroy Relationship added related to 0006294
2017-12-20 18:03 xleroy Relationship added related to 0004108
2017-12-20 18:04 xleroy Relationship added related to 0006962
2017-12-20 18:04 xleroy Relationship added related to 0006133
2017-12-20 18:06 xleroy Note Added: 0018761
2017-12-21 08:59 frisch Note Added: 0018765
2018-11-09 14:20 frisch Note Added: 0019440
2018-11-09 14:21 frisch Assigned To => doligez
2018-11-09 14:21 frisch Status acknowledged => assigned
2018-11-13 13:30 doligez Note Added: 0019457
2018-11-13 13:31 doligez Status assigned => resolved
2018-11-13 13:31 doligez Resolution open => fixed
2018-11-13 13:31 doligez Fixed in Version => 4.08.0+dev


Copyright © 2000 - 2011 MantisBT Group
Powered by Mantis Bugtracker