Poor tracking of extra heap resources #6294

Closed

vicuna opened this issue Jan 13, 2014 · 21 comments

vicuna commented Jan 13, 2014

Original bug ID: 6294
Reporter: gmelquiond
Assigned to: @garrigue
Status: assigned (set by gmelquiond on 2014-01-15T16:15:41Z)
Resolution: open
Priority: normal
Severity: major
Version: 4.01.0
Target version: later
Fixed in version: 4.08.0+dev/beta1/beta2
Category: runtime system and C interface
Related to: #4108 #6133 #7676
Monitored by: braibant @diml @ygrek bobot gerd @yakobowski

Bug description

Currently, when a custom block is allocated, the caller passes an arbitrary ratio between 0 and 1. This ratio is added to the caml_extra_heap_resources accumulator. Each time the accumulator reaches 1, a garbage collection is triggered and the accumulator is reset to 0.
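For concreteness, here is a minimal sketch of how a C binding reports this ratio (illustrative only; the names handle_ops and wrap_handle are made up, not taken from Lablgtk or the runtime). The last two arguments of caml_alloc_custom, mem and max, make each allocation add mem/max to caml_extra_heap_resources:

#include <caml/custom.h>
#include <caml/mlvalues.h>

/* Illustrative custom_operations; a real binding would install a finalizer
   that releases the underlying C resource. */
static struct custom_operations handle_ops = {
  "example.handle",
  custom_finalize_default,
  custom_compare_default,
  custom_hash_default,
  custom_serialize_default,
  custom_deserialize_default
};

/* Wrap a C pointer in a custom block.  Here mem = 1 and max = 1000, so each
   allocation adds 1/1000 to caml_extra_heap_resources; once the accumulator
   reaches 1 (roughly every 1000 such allocations), a major collection is
   triggered and the accumulator is reset to 0. */
value wrap_handle(void *h)
{
  value v = caml_alloc_custom(&handle_ops, sizeof(void *),
                              1 /* mem */, 1000 /* max */);
  *((void **) Data_custom_val(v)) = h;
  return v;
}

The difficulty discussed below is that, for a binding, there is no obviously correct choice of mem and max.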

This "feature" is breaking the usability of Why3ide badly. Consider the following numbers obtained by launching Why3ide on a large testcase.

946.39user 0.36system 16:13.65elapsed 97%CPU (0avgtext+0avgdata 658164maxresident)k
minor_collections: 113838
major_collections: 1424

As you can see, it takes more than 16 minutes before the user can interact with the program. Most of the time is wasted in the garbage collector. If we disable the accumulator, we get the following numbers:

18.98user 0.41system 1:26.67elapsed 22%CPU (0avgtext+0avgdata 761232maxresident)k
minor_collections: 8825
major_collections: 26

Now it takes only 19 seconds for the program to start. That is a 50x speedup!

By the way, this might seem like a degenerate case. But even on smaller testcases, the speedup is noticeable to the user, since each user interaction used to trigger a major collection.

The blame is not solely on OCaml. My opinion is that this is a misfeature of OCaml that is misused by Lablgtk, leading to a disastrous user experience in Lablgtk-based applications.

Follow this link if you want to see how we disabled this part of the GC without having to ask our users to recompile OCaml:
https://gforge.inria.fr/plugins/scmgit/cgi-bin/gitweb.cgi?p=why3/why3.git;a=commitdiff;h=2dbd9e653925b3991b99b0b10ba320b7850ae106
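The idea, roughly (the actual commit may differ in its details; the stub below is a hypothetical reconstruction, not the Why3 code), is a small C file linked into the application that clears the accumulator, called periodically, e.g. from an idle callback:

/* Hypothetical stub, not necessarily what the Why3 commit does: zero the
   accumulator so that custom-block "extra resource" pressure alone never
   triggers a major collection on the 4.x runtime.
   caml_extra_heap_resources is an internal global of the runtime (declared
   in the runtime's own major_gc.h, which is not an installed header), so we
   re-declare it by hand; this is obviously fragile. */
#include <caml/mlvalues.h>

extern double caml_extra_heap_resources;

CAMLprim value reset_extra_heap_resources(value unit)
{
  caml_extra_heap_resources = 0.0;
  return Val_unit;
}

On the OCaml side this would be bound as external reset_extra_heap_resources : unit -> unit = "reset_extra_heap_resources".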


vicuna commented Jan 15, 2014

Comment author: @garrigue

This is a very interesting test case.
I had already noticed slow behavior when using lablgtk from the top-level, but not in standalone applications.

I agree that this is a bad situation, but what you are doing is just disabling the feature.
This is going to work if your program triggers the GC by itself (i.e. if it allocates enough in the heap), but not if it only does user interaction while allocating little or nothing.
We do not want to run out of extra-heap resources in that case.

I think we need a new approach, but I'm not sure what.
One possibility would be not to trigger the GC when the accumulator reaches 1, but rather to wait some amount of time before doing it (if it doesn't happen before). This would reduce the significance of the fraction, which is almost impossible to choose correctly.


vicuna commented Jan 15, 2014

Comment author: gmelquiond

I don't think the GC needs to do anything about it. As you say, the fraction is impossible to choose correctly, which makes the mechanism rather pointless. If the GC needs to be run regularly, it doesn't have to be the GC's job to do it; it might just as well be done by the user program (especially if it is an interactive one).

Ideally, there should be a way to specify the actual footprint of each custom block and the threshold at which a collection should start. (I thought that was already possible; I must have been mistaken.) Note that this can be accurately emulated in user space thanks to a global accumulator and some finalizer functions. I guess it would be cheaper inside the GC though, since no finalizer would be needed then.


vicuna commented Jan 16, 2014

Comment author: @ygrek

Ideally, there should be a way to specify the actual footprint of each custom block and the threshold at which a collection should start.

Probably per resource type?
On a side note, bindings should provide a way to explicitly destroy expensive resources.


vicuna commented Jul 11, 2014

Comment author: @damiendoligez

Ideally, there should be a way to specify the actual footprint of each custom block and the threshold at which a collection should start.

That won't work in the case where all the custom blocks are still alive and the total footprint is still above the threshold after the collection.


vicuna commented Sep 13, 2015

Comment author: gerd

What about a modification that works similarly to the way heap memory is managed: we keep track of the footprint of the custom blocks, aggregate it into alive_custom and total_custom, and trigger a GC when alive_custom drops below a percentage of total_custom. The percentage would be configurable.

With some more thought, this could maybe even be integrated into the normal GC trigger condition (i.e. a GC is triggered when alive_heap + alive_custom gets relatively too low).

I don't see big implementation problems. The only thing is that we need a place to store the footprint of the external part along with the custom block (maybe at the end of the heap block, to keep compatibility).


vicuna commented Mar 6, 2016

Comment author: @garrigue

For lablgtk2, I've just added a function that makes it possible to change the impact of custom allocation on the GC (including ignoring it completely). In general, I don't know what the best approach is.


vicuna commented Nov 9, 2018

Comment author: @alainfrisch

What about a modification that works similarly to the way heap memory is managed

Done in #1738

It would be good to get confirmation that the problem reported here is indeed fixed. Guillaume: if this is still of interest to you, could you try with the current OCaml trunk?


vicuna commented Nov 9, 2018

Comment author: gmelquiond

It surely is, as we are still shipping our ugly hack in Why3. That said, I am not sure I will experience any improvement by switching to OCaml trunk. If I understand correctly, I would also need a version of Lablgtk that calls the new GC interface, wouldn't I?


vicuna commented Nov 12, 2018

Comment author: @damiendoligez

That's correct. Now that OCaml provides a better API, we will need to change lablgtk to use it.

@mshinwell

There hasn't been any progress on this for 18 months, and the likelihood is that external libraries need fixing, not the compiler distribution, given the (fairly) recent changes to the management of external heap resources. I think this can be closed.

gasche reopened this Apr 17, 2020
@mshinwell

@gasche Can you explain what changes in the compiler distribution are still required that necessitate this issue remaining open? I don't think there is anything for us to do here.


gasche commented Apr 17, 2020

We have had gaping holes in our external-resource-accounting logic for years, which may have been fixed by the new iteration that Damien did in #1738. I think it makes sense to try to get a clear status report from the people who had issues before then, on the chance that their issues are not completely fixed, or at least to get a confirmation before closing. This sounds particularly worthwhile here, given that lablgtk is a relatively widely used library exhibiting tricky resource-usage patterns, and that all of the people involved (@garrigue and @silene) are still active in the OCaml community and could thus be persuaded to report on the current state -- instead of merely closing our dedicated communication channel with them, at least symbolically speaking.


gasche commented Apr 17, 2020

@garrigue: have you had a chance to try using the new caml_alloc_custom_mem function in lablgtk? (Is it in a lablgtk release?) Could @silene try that?


silene commented Apr 17, 2020

No problem. It is just a matter of commenting out a single line of Why3 to disable our workaround. As soon as there is something to test, I will do so. (As far as I know, lablgtk does not yet use caml_alloc_custom_mem.)


silene commented Apr 17, 2020

Actually, I cannot seem to reproduce the issue anymore. So, I guess it was fixed on lablgtk's side and our workaround is no longer useful.

@damiendoligez

I've searched for caml_alloc_custom_mem in the lablgtk sources and didn't find it. @garrigue can you tell us more about this?

@github-actions

This issue has been open one year with no activity. Consequently, it is being marked with the "stale" label. What this means is that the issue will be automatically closed in 30 days unless more comments are added or the "stale" label is removed. Comments that provide new information on the issue are especially welcome: is it still reproducible? did it appear in other contexts? how critical is it? etc.

github-actions bot added the Stale label Sep 13, 2021
@damiendoligez

ping @garrigue

@garrigue

Actually, caml_alloc_custom_mem is not used in lablgtk, but the following code is (in wrappers.c):

/* Reimplement the old behaviour of alloc_custom, to be sure that
   values do not move around */
CAMLexport value ml_alloc_custom(struct custom_operations * ops,
                                 uintnat size,
                                 mlsize_t mem,
                                 mlsize_t max)
{
  mlsize_t wosize;
  value result;

  wosize = 1 + (size + sizeof(value) - 1) / sizeof(value);  
  result = caml_alloc_shr(wosize, Custom_tag);
  Custom_ops_val(result) = ops;
  caml_adjust_gc_speed(mem*gc_speed, max*100);
  return caml_check_urgent_gc(result);
}

where gc_speed is a parameter set by the program.

This function is used every time a managed data structure is returned from C to OCaml.

Is there an easy way to switch to using caml_alloc_custom_mem ?

github-actions bot removed the Stale label Sep 15, 2021
@damiendoligez

Is there an easy way to switch to using caml_alloc_custom_mem ?

Not directly, since you still need to allocate directly in the major heap to avoid moving values.

What you are doing is quite similar to the mechanism implemented by caml_alloc_custom_mem, so I'm not surprised the bug seems to have disappeared. If you want to use the custom_mem machinery directly, you'll need to:

  • get rid of the gc_speed factor and the max argument
  • duplicate the computation of max_major from caml_alloc_custom_mem and use
  caml_adjust_gc_speed(mem, max_major);

The only problem is that caml_custom_major_ratio is needed for this computation, and it is
currently not exported by the runtime.
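
Concretely, that could look something like the sketch below. It is written against the 4.08/4.09 runtime, where caml_stat_heap_wsz is still a plain global; on later versions the heap size lives in the domain state, and both it and caml_custom_major_ratio have to be re-declared by hand because they are not in the installed headers, which is exactly the obstacle mentioned above. The function name ml_alloc_custom_mem is made up for illustration:

#include <caml/custom.h>
#include <caml/memory.h>
#include <caml/mlvalues.h>

/* Internal runtime globals, re-declared by hand because they are not
   exported in the public headers.  Fragile and version-dependent. */
extern uintnat caml_custom_major_ratio;
extern uintnat caml_stat_heap_wsz;

/* Variant of ml_alloc_custom that follows the accounting of
   caml_alloc_custom_mem: the caller passes only the out-of-heap footprint
   mem; the gc_speed factor and the max argument disappear. */
CAMLexport value ml_alloc_custom_mem(struct custom_operations * ops,
                                     uintnat size,
                                     mlsize_t mem)
{
  mlsize_t wosize, max_major;
  value result;

  wosize = 1 + (size + sizeof(value) - 1) / sizeof(value);
  result = caml_alloc_shr(wosize, Custom_tag);   /* still major-heap only */
  Custom_ops_val(result) = ops;
  /* Mirror the max_major computation of caml_alloc_custom_mem: roughly
     heap_size_in_bytes * (2/3) * caml_custom_major_ratio / 100. */
  max_major = Bsize_wsize(caml_stat_heap_wsz) / 150 * caml_custom_major_ratio;
  caml_adjust_gc_speed(mem, max_major);
  return caml_check_urgent_gc(result);
}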


github-actions bot commented Oct 5, 2022

This issue has been open one year with no activity. Consequently, it is being marked with the "stale" label. What this means is that the issue will be automatically closed in 30 days unless more comments are added or the "stale" label is removed. Comments that provide new information on the issue are especially welcome: is it still reproducible? did it appear in other contexts? how critical is it? etc.

github-actions bot added the Stale label Oct 5, 2022
github-actions bot closed this as completed Nov 7, 2022