New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
multiple "open" can become expensive in memory #5877
Comments
Comment author: @alainfrisch Shouldn't the two memoization tables be cleared between compilation units (e.g. in Env.reset_cache)? I'm a little bit concerned by the structural equality on signatures, since they contain mutable components. I don't see an actual problem, but are you convinced that your approach is sound? |
Comment author: @alainfrisch I've uploaded a new version of your patch which applies to the current trunk (and in universal format, which I find more readable). |
Comment author: @lefessan For Env.reset_cache, it should be done, indeed, but it is not a problem if it is not done (since the comparison is done with physical equality and caches of persistent signatures are cleared, there is no risk to use a signature from an older compilation). For the mutable parts, I think they either relate to lazy evaluation, or they are only mutated when the signature is being built, and a signature is not mutated afterwards. So there should be no problem with sharing these mutable components. |
Comment author: @alainfrisch I've uploaded a new version, which merges the two memoization tables. Fabrice: do you agree with this change? |
Comment author: @alainfrisch
Each open still involves adding all toplevel elements of the opened structure to the environment, which is not a no-op. I've create a source file with 10000 lines equal to "open Hashtbl". The maximum resident set size for compiling this module with ocamlc is reduced by a factor about 4 with the patch (but not more!), and compilation time by a factor of 7. This is good, but I would not go as far as to say that multiple opens become not more expensive than one open. |
Comment author: @lefessan It would be possible to save some more space (a lot in your example) by removing redundant entries from the environment, i.e. [store_value env x y] would check first if [x] is not already associated (physically) with [y] before adding it. Actually, it would even be possible to use such checks to display warnings when opening a module hides a value that was already bound in the environement. |
Comment author: @alainfrisch
The real use case is rather when the code has a lot of non-nested local opens, so this would not apply. |
Comment author: @lefessan Maybe you can memoize on Ident.hide ? |
Comment author: @lefessan Or check if a type before a substitution on a value is not polymorphic, in which case there is probably no need to do the substitution and thus, the value_description can be shared. That sounds complex for a problem that has not yet been exhibited in practice on real code. |
Comment author: @alainfrisch
Agreed! I just wanted to make sure that the patch indeed solves the problem it is intended to solve (repeated local opens taking too much memory). |
Comment author: @alainfrisch I've committed the revised patch to the trunk (r13236). |
Comment author: @alainfrisch Just to drop a note that #834 should fully address the problem reported here (repeated opening of the same module becomes really cheap). |
Original bug ID: 5877
Reporter: @lefessan
Assigned to: @alainfrisch
Status: closed (set by @alainfrisch on 2016-10-04T08:10:58Z)
Resolution: not a bug
Priority: normal
Severity: minor
Version: 4.00.1
Category: ~DO NOT USE (was: OCaml general)
Duplicate of: #5916
Monitored by: @bobzhang @hcarty gerd @alainfrisch
Bug description
The "open" construct has a non-negligeable cost, as it applies a substitution to a signature before including it in the environment. With new syntax sugar like M.(x) and "let open X in", opening several times the same signature has become easy. However, if the signature is big (for example, if it is a pack of many interfaces), memory can become a problem.
For example, with Core, "open Core.Std" costs 50 MB of memory, opening it 20 times is 1 GB.
The provided patch improves sharing of substituted signatures, making multiple opens not more expensive than one open.
File attachments
The text was updated successfully, but these errors were encountered: