|Anonymous | Login | Signup for a new account||2018-04-21 08:07 CEST|
|Main | My View | View Issues | Change Log | Roadmap|
|View Issue Details|
|ID||Project||Category||View Status||Date Submitted||Last Update|
|0007180||OCaml||runtime system and C interface||public||2016-03-14 21:53||2018-03-08 11:03|
|Target Version||Fixed in Version|
|Summary||0007180: Change of behavior in Gc.major_slice in 4.03|
|Description||I am trying to understand the changes in the Gc heuristics in the upcoming 4.03 w.r.t custom blocks. More specifically, I looked at how the value of caml_extra_heap_resources in major_gc.c contributes to make the Gc run major collection more often. |
I am puzzled by the output of the attached test program. It's a loop that allocates k custom blocks with used = 1, max = 1000. The payload of the blocks is a pointer to (big) heap allocated region (to detect issues). The main loop calls `Gc.major_slice i` every so often (where i is a small int).
In 4.02.3, the program reaches a steady state: the custom blocks are deallocated as expected, Gc.major_slice returns a non-zero integer.
In todays's 4.03 (or 4.03.0+beta1), depending on the value of k and i, it's possible to get a steady state, or to eat all the available memory. If Gc.major_slice is called with i = 0 or a small int, its return value is 0. If Gc.major_slice is called with i = 10, the return value of Gc.major_slice is still 0, and the program quickly eats all available memory.
Returning 0 seems to violate the documentation for 4.02.3 "In all cases, the result is the computed slice size." (It would be nice to document the unit of the size here.)
The fact that the program eats all memory seems to violate this part of the manual (which has not changed between 4.02.3 and 4.03.0+beta1)
The way I read it is that I should expect one full major collection every N allocations (when used/max = 1 / N). It seems that giving "bad" slice sizes to Gc.major_slice confuses the speed heuristic.
I tried to have a look at the Gc code to understand better what was going on, and I am confused by the code that was added by this commit https://github.com/ocaml/ocaml/commit/0225ca01e39289ce1801fb09cd011cdbfb542b8d [^]
if p is capped at 0.3, why do we reset caml_extra_heap_resources to 0.0? Shouldn't it be decreased by the actual amount of work that was done to copy with the steady state assumption?
|Steps To Reproduce||opam switch 4.03.0+beta1|
make test # build the test binary
make success # call Gc.major_slice 0, reach a steady state.
make failure # call Gc.major_slice 10, runs out of memory
opam switch 4.02.3
make test # build the test binary
make success # call Gc.major_slice 0, reach a steady state
make failure # call Gc.major_slice 10, reach a steady state
|Tags||No tags attached.|
|Attached Files|| test-gc-major-slice.tar.gz [^] (994 bytes) 2016-03-14 21:53|
test-gc-minor-slice-4-03.ml [^] (581 bytes) 2016-03-22 17:35 [Show Content]
test-gc-minor-slice.tar.gz [^] (1,053 bytes) 2016-03-29 10:24
Looking at the changes introduced by the aforementioned commit, the return type of
caml_major_collection_slice was changed to void, and Gc.major_slice has been explicitly changed to return 0. If this change is intentional, I would be happy to edit the relevant entry to mark it as potentially breaking in the changelog and update the documentation in gc.mli to explain the change.
If the other change (the fact that a valid program in 4.02.3 runs out of memory in 4.03) is considered a bug, I think I understand enough to propose a patch, but I would like a confirmation that this is indeed a bug before spending time on this.
|Yes, the change to Gc.major_slice is intentional, and yes the running out of memory is very probably a bug. I'm looking into it.|
Here's the bug: Gc.major_slice triggers a minor collection before the major slice. This was needed by the GC invariants up to 4.02.3 but not for 4.03.0, and I forgot to remove it.
Triggering the minor collection like that means (in this program) that the minor heap is never half-full, so we never get an automatic-mode major slice, and the slice size computation is all wrong.
When I fixed this bug, I got another problem: your program does so many major slices between minor collections that it would do more major cycles than minor collections. But we need an empty minor heap to start the major GC, so a lot of these slices just end up doing nothing.
I solved this second problem by requesting a minor GC as soon as the major cycle is finished, so the next cycle can start right away.
Both changes are in commit 84b86b5a163e397f2aaa1590e09b14b183219331 and they resolve the problem in your example program.
For the return value, I wanted to return something in the same unit as the input, but that would involve a backward calculation from the internal p parameter and I don't like that. Do you really need this output? I'm not sure the change has enough breaking potential to warrant a mark in the changelog.
|My approach to breaking-star marks in the changelog is that the entry containing them should be clear enough for users to be able to know whether they are affected just by reading it. If you follow this policy, the fatigue cost of an additional mark can remain modest. I would rather make more people read an irrelevant entry than have some careful people uninformed about code breakage despite their carefulness.|
Thanks for the commit. I think it would still be nice to update the documentation to reflect the fact that the return value is now always 0. I am not sure to what extent this is a breaking change in itself.
I looked at the fix, and I am still confused by the output of the Gc log for the new attached test program. It seems to me that (for this program) extra_heap_resources is always 0 for 4.03, while it takes the expected value (1) in 4.02.3. Running the said test program works "fine" in 4.02.3, but runs out of memory in 4.03.
It seems that https://github.com/ocaml/ocaml/commit/8851e6b7d5850ff5d7dab6971f3a1bc26455b1f4 [^]
changed the code for caml_alloc_custom but the branch that allocates the custom block on the minor heap does not adjust the Gc speed (and hence, does not change the value of extra_heap_resources). Is this expected?
result = caml_alloc_custom(&test_ops, sizeof(value) * Max_young_wosize, 1, 1000);
to make sure that the said custom blocks are allocated on the major heap makes extra_heap_resources non zero in 4.03, and makes the whole test case not run out of memory in 4.03.
edited on: 2016-03-25 12:31
Thanks. This is a nasty bug in the commit you referenced. [edit: in fact the bug was already present before, but this commit made it much more likely to trigger]
About updating the documentation, I'll do that if you don't need the return value. Otherwise I will implement what the documentation says.
@gasche, I'll make the entry more explicit and add the mark.
|I think I fixed it in commit c95ccf7e9b3df76fb159113c36312a4e9914f627 but I can't reproduce the bug with your second program. Can you check the fix (and if you can review the patch that would be great) ?|
|I will have a look either today or tuesday, and will come back to you. Just to be sure, when you say that you cannot reproduce with my test program, is it because it fails out of memory also in 4.02.3 ? (I have quite a bit of RAM, and the apparent steady state was quite high, so it might the reason for the issue.)|
I reuploaded the test for the gc minor slice. On my system, the program seems to have a steady state memory consumption at around 16gb in 4.02.3, and a steady state at ~22gb in latest 4.03, but I think this increase is irrelevant.
To some extent, this is a really bad test case, because it's really not precise and changing a few constants would change the behavior quite a bit. On the other hand, it's quite hard to come up with an example with precise claims about how much memory should be used in 4.02.3, and how much it uses in 4.03. Now, the numbers of "extra_heap_resources" seem to be correct, and that's probably what we should look at.
I have made some comments on your commit on github
My main question regarding the current patch is whether or not one should address the issue of counting the extra_resources on the minor heap and triggering the gc automatically when it reaches some level.
>My main question regarding the current patch is whether or not one should address the issue of counting the extra_resources on the minor heap and triggering the gc automatically when it reaches some level.
We'll do that for 4.04.
For the test, I meant it didn't run out of memory on my machine with any OCaml version.
|This is marked as major with target version = 4.04. Does there remain something to be done for 4.04?|
|Nothing has been done to count the amount of extra_resources on the minor heap. I would argue to change the target version for succ(4.04)|
|This PR is still open because I got a report that jenga's memory behavior changed quite a lot because of this issue with extra_resources on the minor heap.|
> Nothing has been done to count the amount of extra_resources on the minor heap.
|2016-03-14 21:53||braibant||New Issue|
|2016-03-14 21:53||braibant||File Added: test-gc-major-slice.tar.gz|
|2016-03-21 11:38||braibant||Note Added: 0015561|
|2016-03-21 15:22||doligez||Note Added: 0015562|
|2016-03-21 15:23||doligez||Status||new => acknowledged|
|2016-03-21 15:23||doligez||Target Version||=> 4.03.0+dev / +beta1|
|2016-03-21 15:23||doligez||Assigned To||=> doligez|
|2016-03-21 15:23||doligez||Status||acknowledged => assigned|
|2016-03-22 15:43||doligez||Note Added: 0015572|
|2016-03-22 15:43||doligez||Status||assigned => resolved|
|2016-03-22 15:43||doligez||Resolution||open => fixed|
|2016-03-22 15:43||doligez||Fixed in Version||=> 4.03.0+dev / +beta1|
|2016-03-22 16:06||gasche||Note Added: 0015573|
|2016-03-22 17:35||braibant||File Added: test-gc-minor-slice-4-03.ml|
|2016-03-22 17:47||braibant||Note Added: 0015576|
|2016-03-22 18:01||braibant||Note Added: 0015578|
|2016-03-22 18:07||braibant||Note Added: 0015579|
|2016-03-22 18:07||braibant||Status||resolved => feedback|
|2016-03-22 18:07||braibant||Resolution||fixed => reopened|
|2016-03-23 00:13||braibant||Note Deleted: 0015579|
|2016-03-24 15:17||doligez||Note Added: 0015597|
|2016-03-25 12:31||doligez||Note Edited: 0015597||View Revisions|
|2016-03-25 14:47||doligez||Note Added: 0015616|
|2016-03-25 15:09||braibant||Note Added: 0015617|
|2016-03-25 15:09||braibant||Status||feedback => assigned|
|2016-03-29 10:24||braibant||File Added: test-gc-minor-slice.tar.gz|
|2016-03-29 11:02||braibant||Note Added: 0015640|
|2016-04-04 16:31||doligez||Note Added: 0015672|
|2016-04-04 16:31||doligez||Target Version||4.03.0+dev / +beta1 => 4.03.1+dev|
|2016-07-13 10:17||frisch||Fixed in Version||4.03.0+dev / +beta1 =>|
|2016-07-13 10:17||frisch||Target Version||4.03.1+dev => 4.04.0 +dev / +beta1 / +beta2|
|2016-08-30 11:05||frisch||Note Added: 0016251|
|2016-09-01 21:26||braibant||Note Added: 0016260|
|2016-09-02 17:39||doligez||Target Version||4.04.0 +dev / +beta1 / +beta2 => 4.05.0 +dev/beta1/beta2/beta3/rc1|
|2017-02-21 14:22||doligez||Note Added: 0017391|
|2017-02-22 15:12||xleroy||Priority||high => normal|
|2017-02-22 15:12||xleroy||Severity||major => minor|
|2017-02-22 15:12||xleroy||Target Version||4.05.0 +dev/beta1/beta2/beta3/rc1 => 4.06.0 +dev/beta1/beta2/rc1|
|2017-02-23 16:43||doligez||Category||OCaml runtime system => runtime system|
|2017-03-03 17:45||doligez||Category||runtime system => runtime system and C interface|
|2017-10-05 18:07||doligez||Relationship added||related to 0007100|
|2017-10-05 18:08||doligez||Target Version||4.06.0 +dev/beta1/beta2/rc1 =>|
|2017-11-14 11:38||frisch||Note Added: 0018661|
|2018-03-08 11:03||gasche||Relationship added||child of 0007750|
|Copyright © 2000 - 2011 MantisBT Group|