Mantis Bug Tracker

View Issue Details Jump to Notes ] Issue History ] Print ]
IDProjectCategoryView StatusDate SubmittedLast Update
0005877OCamlOCaml generalpublic2013-01-08 12:032013-02-07 06:36
Reporterlefessan 
Assigned Tofrisch 
PrioritynormalSeverityminorReproducibilityalways
StatusresolvedResolutionfixed 
PlatformOSOS Version
Product Version4.00.1 
Target VersionFixed in Version 
Summary0005877: multiple "open" can become expensive in memory
DescriptionThe "open" construct has a non-negligeable cost, as it applies a substitution to a signature before including it in the environment. With new syntax sugar like M.(x) and "let open X in", opening several times the same signature has become easy. However, if the signature is big (for example, if it is a pack of many interfaces), memory can become a problem.

For example, with Core, "open Core.Std" costs 50 MB of memory, opening it 20 times is 1 GB.

The provided patch improves sharing of substituted signatures, making multiple opens not more expensive than one open.
TagsNo tags attached.
Attached Filespatch file icon ocaml-4.00.1-memo.patch [^] (5,597 bytes) 2013-01-08 12:03 [Show Content]
diff file icon patch_5877.diff [^] (4,855 bytes) 2013-01-08 15:31 [Show Content]

- Relationships
duplicate of 0005916acknowledged Compile time performance of opens 

-  Notes
(0008709)
frisch (developer)
2013-01-08 14:08

Shouldn't the two memoization tables be cleared between compilation units (e.g. in Env.reset_cache)?

I'm a little bit concerned by the structural equality on signatures, since they contain mutable components. I don't see an actual problem, but are you convinced that your approach is sound?
(0008710)
frisch (developer)
2013-01-08 14:11

I've uploaded a new version of your patch which applies to the current trunk (and in universal format, which I find more readable).
(0008711)
lefessan (developer)
2013-01-08 14:55

For Env.reset_cache, it should be done, indeed, but it is not a problem if it is not done (since the comparison is done with physical equality and caches of persistent signatures are cleared, there is no risk to use a signature from an older compilation).

For the mutable parts, I think they either relate to lazy evaluation, or they are only mutated when the signature is being built, and a signature is not mutated afterwards. So there should be no problem with sharing these mutable components.
(0008712)
frisch (developer)
2013-01-08 15:31

I've uploaded a new version, which merges the two memoization tables. Fabrice: do you agree with this change?
(0008713)
frisch (developer)
2013-01-08 15:51

> making multiple opens not more expensive than one open.

Each open still involves adding all toplevel elements of the opened structure to the environment, which is not a no-op. I've create a source file with 10000 lines equal to "open Hashtbl". The maximum resident set size for compiling this module with ocamlc is reduced by a factor about 4 with the patch (but not more!), and compilation time by a factor of 7. This is good, but I would not go as far as to say that multiple opens become not more expensive than one open.
(0008722)
lefessan (developer)
2013-01-09 00:27

It would be possible to save some more space (a lot in your example) by removing redundant entries from the environment, i.e. [store_value env x y] would check first if [x] is not already associated (physically) with [y] before adding it.

Actually, it would even be possible to use such checks to display warnings when opening a module hides a value that was already bound in the environement.
(0008723)
frisch (developer)
2013-01-09 08:37

> (a lot in your example)

The real use case is rather when the code has a lot of non-nested local opens, so this would not apply.
(0008724)
lefessan (developer)
2013-01-09 12:11

Maybe you can memoize on Ident.hide ?
(0008725)
lefessan (developer)
2013-01-09 12:17

Or check if a type before a substitution on a value is not polymorphic, in which case there is probably no need to do the substitution and thus, the value_description can be shared.

That sounds complex for a problem that has not yet been exhibited in practice on real code.
(0008727)
frisch (developer)
2013-01-09 16:49

> That sounds complex for a problem that has not yet been exhibited in practice on real code.

Agreed! I just wanted to make sure that the patch indeed solves the problem it is intended to solve (repeated local opens taking too much memory).
(0008752)
frisch (developer)
2013-01-15 13:55

I've committed the revised patch to the trunk (r13236).

- Issue History
Date Modified Username Field Change
2013-01-08 12:03 lefessan New Issue
2013-01-08 12:03 lefessan File Added: ocaml-4.00.1-memo.patch
2013-01-08 14:08 frisch Note Added: 0008709
2013-01-08 14:10 frisch File Added: patch_5877.diff
2013-01-08 14:11 frisch Note Added: 0008710
2013-01-08 14:55 lefessan Note Added: 0008711
2013-01-08 15:30 frisch File Deleted: patch_5877.diff
2013-01-08 15:31 frisch File Added: patch_5877.diff
2013-01-08 15:31 frisch Note Added: 0008712
2013-01-08 15:51 frisch Note Added: 0008713
2013-01-09 00:27 lefessan Note Added: 0008722
2013-01-09 08:37 frisch Note Added: 0008723
2013-01-09 12:11 lefessan Note Added: 0008724
2013-01-09 12:17 lefessan Note Added: 0008725
2013-01-09 16:49 frisch Note Added: 0008727
2013-01-15 13:55 frisch Note Added: 0008752
2013-01-15 13:55 frisch Status new => resolved
2013-01-15 13:55 frisch Resolution open => fixed
2013-01-15 13:55 frisch Assigned To => frisch
2013-02-07 06:36 frisch Relationship added duplicate of 0005916


Copyright © 2000 - 2011 MantisBT Group
Powered by Mantis Bugtracker