caml_alloc_custom/caml_alloc_final API easily leads to GC performance issues #7198

Closed
vicuna opened this issue Mar 27, 2016 · 3 comments

vicuna commented Mar 27, 2016

Original bug ID: 7198
Reporter: mfp
Assigned to: @damiendoligez
Status: resolved (set by @alainfrisch on 2018-11-09T13:24:29Z)
Resolution: fixed
Priority: normal
Severity: minor
Version: 4.02.3
Target version: 4.07.0+dev/beta2/rc1/rc2
Category: runtime system and C interface
Related to: #7100 #7158
Monitored by: @braibant @diml @hcarty @alainfrisch

Bug description

I've found a number of instances of the same pattern: a library allocates custom blocks with caml_alloc_custom/caml_alloc_final and hardcodes "reasonable" used,max parameters, which later lead to excessive GC work when an application uses more than max/used "handles" at a time. This can happen because the used,max parameters were fine many years ago on computers with less memory, or because the application's usage is not the one the library's author anticipated.
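To make the cost concrete, here is a minimal sketch of the accounting behind used,max. It is a toy model in plain C, not the runtime's actual code: it assumes the rule described in the OCaml manual, namely that allocating custom blocks with a fixed used,max forces roughly one full major cycle every max/used allocations. The names toy_gc, toy_alloc_custom and toy_cycles_for are made up for this illustration.

```c
#include <assert.h>

/* Toy model only (hypothetical names, not the OCaml runtime):
   every custom block of one handle type adds `used` to a pending total,
   and each time the total reaches `max` one full major cycle is counted. */
typedef struct {
    long used;         /* hardcoded `used` parameter of the handle type */
    long max;          /* hardcoded `max` parameter of the handle type */
    long pending;      /* sum of `used` since the last forced cycle */
    long full_cycles;  /* full major cycles forced so far */
} toy_gc;

void toy_alloc_custom(toy_gc *gc)
{
    assert(gc->used >= 0 && gc->max > 0);
    gc->pending += gc->used;
    while (gc->pending >= gc->max) {
        gc->pending -= gc->max;
        gc->full_cycles++;
    }
}

/* Full cycles forced by n allocations with fixed used,max. */
long toy_cycles_for(long n, long used, long max)
{
    toy_gc gc = { used, max, 0, 0 };
    for (long i = 0; i < n; i++)
        toy_alloc_custom(&gc);
    return gc.full_cycles;
}
```

In this model, one million handles allocated with used,max = 1,1000 force 1000 full cycles regardless of how large the heap is, while 1,10000 forces 100. The work per cycle grows with the heap, which is why parameters chosen for small machines hurt on large ones.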

Some examples: the Event module in the (sys)threads library, regexp handles in ocaml-pcre, statement and DB handles in ocaml-sqlite3.

I expanded on some possible ways to address this in

#7158#c15410

In short, it'd be nice to have one of the following:

  • caml_alloc_custom/final-like functions where the memory footprint of the out-of-heap resource can be given so that the GC can adjust its speed using a mechanism similar to the one for in-heap blocks (this would be directly usable in all stubs where the custom blocks represent chunks of memory and not scarce resources)

  • a means to (1) register and (2) list and modify "build-time constants" (or more precisely, variables with build-time defaults). This would make it possible to tweak parameters without patching the upstream libs (after the one-time change to use this new API, that is).

The latter would make it possible to address the caml_alloc_custom issue on a per-application basis, and should also be useful for other "build-time constants" where it is hard to provide a value suitable for all application domains. It could be argued that each library could provide such functionality on its own, but the caml_alloc_custom issues I found (and the others likely to lie dormant, waiting to be stepped on) show this doesn't happen in practice. Having an official C + OCaml API for this would lower the barrier to adoption.
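As a rough illustration of the second suggestion, the C side of such a registry could look like the sketch below. Nothing here exists in the OCaml runtime: tunable_register, tunable_set, tunable_count, tunable_at and the fixed capacity are all hypothetical, and an OCaml-facing binding is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TUNABLE_CAPACITY 64   /* arbitrary limit for this sketch */

typedef struct {
    const char *name;   /* e.g. "somelib.handle_max" (hypothetical) */
    long value;         /* current value, initially the build-time default */
} tunable;

static tunable registry[TUNABLE_CAPACITY];
static size_t registry_len = 0;

/* (1) Register a variable with its build-time default.
   Returns a pointer to the live value, or NULL if the table is full.
   Registering an existing name returns its slot and keeps any override. */
long *tunable_register(const char *name, long default_value)
{
    for (size_t i = 0; i < registry_len; i++)
        if (strcmp(registry[i].name, name) == 0)
            return &registry[i].value;
    if (registry_len == TUNABLE_CAPACITY)
        return NULL;
    registry[registry_len].name = name;
    registry[registry_len].value = default_value;
    return &registry[registry_len++].value;
}

/* (2) Modify a registered variable. Returns 0 on success, -1 if unknown. */
int tunable_set(const char *name, long value)
{
    for (size_t i = 0; i < registry_len; i++)
        if (strcmp(registry[i].name, name) == 0) {
            registry[i].value = value;
            return 0;
        }
    return -1;
}

/* (2) List registered variables by index. */
size_t tunable_count(void) { return registry_len; }

const tunable *tunable_at(size_t i)
{
    return i < registry_len ? &registry[i] : NULL;
}
```

A stub would call tunable_register once at initialization and read the returned pointer each time it passes max to caml_alloc_custom, so an application could raise the value with tunable_set without patching the library.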

vicuna commented Nov 15, 2017

Comment author: @alainfrisch

It's true that even the 1/1000 ratio used for Pervasives channels can be suboptimal. Since the GC doesn't even close the underlying file descriptor anyway, one cannot even argue that this is to avoid fd leaks. Should we use a (default) value more representative of the actual memory usage for the channel structure itself?

vicuna commented Nov 22, 2017

Comment author: @mmottl

Since Alain has just contacted me about this issue, which I'm fixing right now in my bindings, I agree that Pervasives channels shouldn't be collected so aggressively. They are somewhat larger than most values (4K+), but modern machines also have crazy amounts of memory compared to "the old days". I guess a 1/10000 ratio is fine. If your application processes 10000 files, it had better run on a machine with more than 40MB of available RAM.
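The 40MB figure can be checked with a back-of-the-envelope helper. This is only a sketch under two assumptions: each channel costs exactly 4096 bytes (the comment says "4K+"), and about max/used dead blocks can accumulate before a full cycle reclaims them. The function name floating_bytes is made up for this example.

```c
#include <assert.h>

/* Rough upper bound, in bytes, on dead-but-uncollected out-of-heap memory
   for one handle type, assuming about max/used blocks can pile up between
   full major cycles (hypothetical helper, not a runtime function). */
unsigned long floating_bytes(unsigned long used, unsigned long max,
                             unsigned long bytes_per_block)
{
    assert(used > 0);
    return (max / used) * bytes_per_block;
}
```

With used,max = 1,10000 and 4096 bytes per channel this gives 40,960,000 bytes, about 40MB; the current 1/1000 ratio gives about 4MB.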

Choosing the right ratio is a bit of an art, especially if the underlying value can vary unpredictably in size, which is the case for e.g. database query results. C libraries do not typically provide functions for easily calculating the size of values.

vicuna commented Nov 9, 2018

Comment author: @alainfrisch

Addressed by #1738 which implements the first suggestion from the issue description.
