Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

minor_words in gc stats sometimes double counts allocations #7798

Closed
vicuna opened this issue May 24, 2018 · 2 comments · Fixed by #8619
Closed

minor_words in gc stats sometimes double counts allocations #7798

vicuna opened this issue May 24, 2018 · 2 comments · Fixed by #8619
Assignees

Comments

@vicuna
Copy link

vicuna commented May 24, 2018

Original bug ID: 7798
Reporter: @sliquister
Assigned to: @damiendoligez
Status: assigned (set by @damiendoligez on 2018-06-20T13:43:00Z)
Resolution: open
Priority: normal
Severity: minor
Version: 4.06.1
Category: back end (clambda to assembly)
Monitored by: @nojb @gasche @dbuenzli

Bug description

The symptom is simple: sometimes you allocate n words and the gc stats increase by 2n.

Here is a program that shows this behavior [1]:
$ ocamlopt a.ml && ./a.out
minor heap size: 1048576 = (2 * 524288)
allocated: 524289, gc stats error: 3
allocated: 1048575, gc stats error: 6
allocated: 1572861, gc stats error: 9
allocated: 2097147, gc stats error: 12
allocated: 2621433, gc stats error: 18
...

This has been causing some very hard to understand spurious failures in allocation tests where the reported allocation is too high by a few words, despite measuring allocation like so (no signals/ocaml finalizers in this context):

let minor_words_allocated f =
Gc.minor ();
let before = truncate (Gc.minor_words ()) in
Sys.opaque_identity (f ());
let after = truncate (Gc.minor_words ()) in
after - before

This is because if you're unlucky and the call to Gc.minor finishes the major cycle, the major gc goes into Phase_idle and call caml_request_minor_gc (). Which causes the next minor allocation to trigger a minor gc and be double counted because of the bug above.

The double counting does not happen all the time. caml_alloc is correct. The inline assembly is wrong. The assembly functions used with -compact are also wrong.

After having spent hours figuring out what the problem is, the workaround is easy enough: do one allocation after Gc.minor () to set off the requested minor gc, if any.
I also see that the documentation for minor_words says "this is an approximation in native code", but who knows what that emcompasses? Given that experimentally Gc.minor_words () works perfectly (or so it seemed until now), and that allocation tests are reasonable code to wrote, it's only natural to rely on Gc.minor_words ().

So I'd like to ask for the stats to be fixed (and the documentation to be updated, unless it's referring to something else).


The fix doesn't seem too hard: before calling caml_call_gc, undo the subtraction to the minor heap frontier. In the non inlined-assembly, I think this is not contentious. In the inlined assembly, code size might be a concern? But even that seems fixable (shouldn't matter for big allocations, and for small allocations, one can use a family of caml_call_gc_X functions that change back the heap frontier by X bytes and jump to caml_call_gc).

[1]

let rec zero_alloc_bytes_of_int buf n i =
Bytes.set buf i (Char.unsafe_chr (Char.code '0' + (n mod 10)));
if n / 10 = 0
then (
for j = 0 to i / 2; do
let tmp = Bytes.get buf j in
Bytes.set buf j (Bytes.get buf (i - j));
Bytes.set buf (i - j) tmp;
done;
i + 1
)
else zero_alloc_bytes_of_int buf (n / 10) (i + 1)
;;

let zero_alloc_string_of_int buf n =
zero_alloc_bytes_of_int buf n 0
let minor_words () = truncate (Gc.minor_words ())

let () =
let gc = Gc.get () in
Printf.printf "minor heap size: %d = (2 * %d)\n%!"
gc.minor_heap_size (gc.minor_heap_size / 2)
let l = ref []
let buf = Bytes.create 100
let () = Gc.compact ()
let () = Gc.compact ()
let () = Gc.compact ()
let () = Gc.minor ()

let () =
let minor_words_start = minor_words () in
let minor_words_allocated = ref 0 in
let stats_error = ref 0 in
while true do
l := () :: !l;
minor_words_allocated := !minor_words_allocated + 3;
let minor_words_from_stats = minor_words () - minor_words_start in
let stats_error_now = minor_words_from_stats - !minor_words_allocated in
if stats_error_now <> !stats_error
then (stats_error := stats_error_now;
output_string stdout "allocated: ";
let len = zero_alloc_string_of_int buf !minor_words_allocated in
output_substring stdout (Obj.magic (buf : Bytes.t) : string) 0 len;
output_string stdout ", gc stats error: ";
let len = zero_alloc_string_of_int buf !stats_error in
output_substring stdout (Obj.magic (buf : Bytes.t) : string) 0 len;
output_string stdout "\n";
flush stdout;
)
done

@vicuna
Copy link
Author

vicuna commented May 25, 2018

Comment author: @stedolan

We ran into this bug on the multicore branch a while ago, where it caused crashes since we assume that the minor heap does not have holes in it. I didn't realise that this affected trunk as well! The fix is as you say (see commit 8ceec49 on github.com/ocamllabs/ocaml-multicore).

By the way, Alloc_small in caml/memory.h (used for allocations in the bytecode interpreter) already undoes the subtraction before entering the GC, which is why this bug only affects native code.

@vicuna
Copy link
Author

vicuna commented May 25, 2018

Comment author: @gasche

For reference: the commit mentioned by stedolan above can be found at the URL

ocaml-multicore/ocaml-multicore@8ceec49

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants