Mantis Bug Tracker

ID: 0007158
Project: OCaml
Category: otherlibs
View Status: public
Date Submitted: 2016-02-28 23:44
Last Update: 2017-02-16 14:12
Reporter: mfp
Assigned To:
Priority: normal
Severity: major
Reproducibility: always
Status: resolved
Resolution: fixed
Platform:
OS:
OS Version:
Product Version: 4.02.3
Target Version: 4.05.0 +dev/beta1/beta2/beta3/rc1
Fixed in Version: 4.05.0 +dev/beta1/beta2/beta3/rc1
Summary: 0007158: Event.sync forces a full major GC cycle every 5000 calls at most
Description: Event.sync uses condition variables, which are represented with custom blocks.

The parameters to alloc_custom are used=1, max=Max_condition_number=5000 (raised in 2010 from the original 1000 set back in 1996).

The end result is that a full major GC cycle is completed after at most 5000 calls to Event.sync, which can represent a considerable GC load. This is triggered for example by Lwt_preemptive.
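
To make the arithmetic concrete, here is a minimal C sketch of the accounting described above. It is a simplified model, not the runtime's code: the names gc_model, model_alloc_custom and forced_cycles_after are made up for illustration, the counter is kept in integer units of 1/max, and every allocation is assumed to use the same max.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified model (an assumption, not the OCaml runtime's actual code):
   each custom-block allocation adds used/max to a counter, and a major
   GC cycle is counted as forced once the counter reaches 1.
   The counter is stored in units of 1/max to keep the arithmetic exact. */
typedef struct {
    unsigned long pending;  /* sum of `used` since the last forced cycle */
    size_t forced_cycles;   /* major cycles forced so far */
} gc_model;

static void model_alloc_custom(gc_model *gc, unsigned long used,
                               unsigned long max)
{
    assert(max > 0);
    gc->pending += used;
    if (gc->pending >= max) {   /* counter reached used/max * n >= 1 */
        gc->forced_cycles++;
        gc->pending = 0;
    }
}

/* Number of cycles forced by `calls` allocations with the given used/max. */
static size_t forced_cycles_after(size_t calls, unsigned long used,
                                  unsigned long max)
{
    gc_model gc = { 0, 0 };
    for (size_t i = 0; i < calls; i++)
        model_alloc_custom(&gc, used, max);
    return gc.forced_cycles;
}
```

Under this model, forced_cycles_after(1000000, 1, 5000) returns 200: one forced major cycle per 5000 allocations, regardless of how little memory those allocations actually hold.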
Additional Information: References:
https://github.com/ocsigen/lwt/issues/218
https://github.com/mfp/ocaml-sqlexpr/issues/13
Tags: No tags attached.
Attached Files:

Relationships
related to 0007198 (assigned, doligez): caml_alloc_custom/caml_alloc_final API easily leads to GC performance issues

Notes
(0015410)
mfp (reporter)
2016-02-29 13:52

Some further considerations:

This is not the first time I have run into such a thing (caml_alloc_custom parameters triggering too-frequent GC): it also happened with Pcre regexps (at most 500(!) unreclaimed at any time) until recently.

I'd say there's an underlying API issue with caml_alloc_custom here: the used/max limits are application-dependent and not future-proof, so any library not using used=0,max=n risks causing performance trouble for its users and/or being rendered comically outdated when the "acceptable" limits rise exponentially.

When the custom block does not represent scarce resources (like file descriptors or things with attached kernel structures), but only out-of-(OCaml-)heap memory, it would be preferable to have the GC adjust its speed based on the memory footprint relative to the current heap size.

caml_alloc_custom currently increases an internal value by used/max, and a full GC cycle is completed by the time it exceeds 1 or 0.5 * minor / major.

Would it be possible to have a new caml_alloc_custom-like function for "(extra-heap) memory only" structures which increased the internal value by something proportional to, say, resource_size / major or to piggy-back on the GC's speed control system to the same effect?
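
A rough C sketch of the idea, to show how the trigger would scale with the heap instead of with a hardcoded count. This is a hypothetical model, not an existing runtime function: footprint_model, model_alloc_mem_only and forced_by_footprint are invented names, and "proportional to resource_size / major" is simplified here to "force a cycle once the out-of-heap bytes allocated since the last cycle reach the major heap size".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the proposed "(extra-heap) memory only" accounting:
   each allocation contributes resource_bytes / major_heap_bytes, so a cycle
   is forced once the accumulated bytes reach the major heap size. */
typedef struct {
    size_t pending_bytes;   /* out-of-heap bytes since the last forced cycle */
    size_t forced_cycles;   /* major cycles forced so far */
} footprint_model;

static void model_alloc_mem_only(footprint_model *gc, size_t resource_bytes,
                                 size_t major_heap_bytes)
{
    assert(major_heap_bytes > 0);
    gc->pending_bytes += resource_bytes;
    if (gc->pending_bytes >= major_heap_bytes) {
        gc->forced_cycles++;
        gc->pending_bytes = 0;
    }
}

/* Cycles forced by `calls` allocations of `resource_bytes` each,
   with a major heap of fixed size `major_heap_bytes`. */
static size_t forced_by_footprint(size_t calls, size_t resource_bytes,
                                  size_t major_heap_bytes)
{
    footprint_model gc = { 0, 0 };
    for (size_t i = 0; i < calls; i++)
        model_alloc_mem_only(&gc, resource_bytes, major_heap_bytes);
    return gc.forced_cycles;
}
```

With 64-byte resources, this model forces a cycle every 16384 allocations on a 1 MiB heap but only every 4194304 allocations on a 256 MiB heap, so the library author no longer has to guess a limit that suits every application.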

Many C libraries using custom blocks could use such a function, becoming both future-proof and usable across very different applications.

Having custom blocks that represent scarce resources is arguably a bad idea (or more precisely, leaving their disposal up to the GC is), but there is indeed some value in having a safety net like the one caml_alloc_custom offers at present. It would be nice to have a way to expose all those "runtime parameters" and make them more visible, so that different applications can adjust them without patching all their dependencies, and so that they are easier to locate and increase in the future.

This could be as simple as a registry of (mutable) "build-time constants" in the runtime, along with a trivial module in the stdlib, providing 3 operations: (1) register a value associated to a unique name (like the custom_ops identifier), (2) find a value and (3) list all values.

(3) would be useful to future developers to get a comprehensive list of build-time constants they might want to tweak for their specific applications or review as the resources become more abundant.
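
For illustration, the three operations could look roughly like this on the C side. Everything here is hypothetical: runtime_param, param_register, param_find, param_list and the fixed table size are invented for the sketch, and no such registry exists in the runtime.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_PARAMS 64   /* arbitrary table size chosen for this sketch */

/* One named, mutable "build-time constant". */
typedef struct {
    const char *name;   /* unique name, e.g. a custom_ops identifier */
    long value;         /* current value; callers may overwrite it */
} runtime_param;

static runtime_param params[MAX_PARAMS];
static size_t param_count = 0;

/* (2) find: returns the entry for `name`, or NULL if it is not registered. */
static runtime_param *param_find(const char *name)
{
    for (size_t i = 0; i < param_count; i++)
        if (strcmp(params[i].name, name) == 0)
            return &params[i];
    return NULL;
}

/* (1) register: returns 0 on success, -1 if the name is already taken
   or the table is full. The name string must outlive the registry. */
static int param_register(const char *name, long value)
{
    assert(name != NULL);
    if (param_find(name) != NULL || param_count == MAX_PARAMS)
        return -1;
    params[param_count].name = name;
    params[param_count].value = value;
    param_count++;
    return 0;
}

/* (3) list: copies up to `cap` entry pointers into `out`
   and returns how many were written. */
static size_t param_list(const runtime_param **out, size_t cap)
{
    size_t n = param_count < cap ? param_count : cap;
    for (size_t i = 0; i < n; i++)
        out[i] = &params[i];
    return n;
}
```

A library would call param_register once at initialisation with its default limit, and an application could then raise it through param_find(name)->value before the limit is first used. The stdlib module mentioned above would be a thin wrapper over these three calls.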


(BTW, Lwt_preemptive didn't need first-class sync communication, so I proposed to replace Event with a trivial mutex + CV combo: https://github.com/ocsigen/lwt/pull/219 . I wonder how many actual users of Event's full capabilities there are.)
(0015621)
mfp (reporter)
2016-03-27 16:11
edited on: 2016-03-27 16:11

I have located yet another instance of the hardcoded limit issue: sqlite3-ocaml's database and statement handles (both with used=1, max=100), resulting in the GC taking >70% of the CPU time. I suspect a systematic search for more caml_alloc_custom/caml_alloc_final uses would yield several results.

Edit: should I open a new PR for the caml_alloc_custom/caml_alloc_final API issue?

(0015623)
gasche (developer)
2016-03-27 16:16

> should I open a new PR for the caml_alloc_custom/caml_alloc_final API issue?

Do as you feel is best. Damien's triaging of the issue indicates that he considers it a major issue, but that he probably won't be working on it before the 4.03 release. I guess there is not enough time left to design and test more GC control mechanisms, as those tend to need a lot of testing on production workloads.
(0015625)
mfp (reporter)
2016-03-27 16:49

Indeed, it's way too late to fiddle with the GC and introduce a new C API. There's more than enough GC work going on with the ephemerons and the low-latency stuff :)

On further reflection, the Event.sync performance bug is an instance of the broader API issue, so the latter definitely deserves a PR of its own; I'm posting it in a minute.
(0017283)
xleroy (administrator)
2017-02-16 14:12

After some thought and discussion, it appears that good C implementations of mutexes and condition variables do not consume kernel resources, so we can allocate as many as will fit in memory. Hence we now call caml_alloc_custom with cost 0/1 instead of 1/N. This will be in release 4.05.

Commits: [trunk 84be1bc] and [4.05 16ade59]

Issue History
Date Modified Username Field Change
2016-02-28 23:44 mfp New Issue
2016-02-29 13:52 mfp Note Added: 0015410
2016-02-29 16:21 doligez Severity minor => major
2016-02-29 16:21 doligez Status new => confirmed
2016-02-29 16:21 doligez Target Version => 4.03.1+dev
2016-03-27 16:11 mfp Note Added: 0015621
2016-03-27 16:11 mfp Note Edited: 0015621
2016-03-27 16:16 gasche Note Added: 0015623
2016-03-27 16:49 mfp Note Added: 0015625
2016-04-06 13:07 doligez Relationship added related to 0007198
2017-02-16 14:00 doligez Target Version 4.03.1+dev => undecided
2017-02-16 14:12 xleroy Note Added: 0017283
2017-02-16 14:12 xleroy Status confirmed => resolved
2017-02-16 14:12 xleroy Resolution open => fixed
2017-02-16 14:12 xleroy Fixed in Version => 4.05.0 +dev/beta1/beta2/beta3/rc1
2017-02-16 14:12 xleroy Target Version undecided => 4.05.0 +dev/beta1/beta2/beta3/rc1
2017-02-23 16:42 doligez Category OCaml otherlibs => otherlibs

