New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
creating module values may lead to memory leaks #5098
Comments
Comment author: @alainfrisch Note that there is no allocation occurring when evaluating (module T : S) (if T is already evaluated). In fact, the problem you mentions is not directly related to turning modules into first-class values, although I agree that the new feature makes the issue more visible. Before first-class modules, you could already capture a module in a closure and face the same problem: let f = I believe the issue is minor, though, and can be avoided by explicitly copying the fields you want to keep. Better, the "include" mechanism can do it for you: let cleanup x = This is not to say that the optimization which avoids truncating a structure at runtime couldn't be questioned. |
Comment author: @gasche Markus Mottl relayed this surprise and change request in the discussion of bug 0005931 ( #5931 ), and the duplicate issue 0005932. Interestingly, it reportedly also mirrors a recommendation resulting from the LAFOSEC security-oriented study of the OCaml programming language: as the polymorphic equality will still structurally compare the fields hidden from the signature, their non-erasure breaks type abstraction. We have no hard data to evaluate this, but a mild guess of the performance benefits of this optimization would be: "it doesn't change performances in practice". I think we could consider changing this (removing the optimization of order-relevant signature prefixes, to only avoid the copy for exact signature correspondence). Markus or anyone, would you care to submit a patch for this? No guarantee that it will be implemented in the end (someone may still come up with a deadly real-world example where it is application-critical that no copy is made, in which case Alain remark that enforcing a copy is still expressible in the current system might prevail), but otherwise it will be "when someone else that cares look at it". |
Comment author: @alainfrisch Even without the "optimization/bug", polymorphic equality still breaks type abstraction. module type S = sig type t val x : t end;;module type S = sig type t val x : t end (module struct type t = int let x = 1 end : S) = (module struct type t = bool let x = true end : S);;
But the same can be achieved with (i) exceptions and (ii) GADTs. |
Comment author: @gasche Indeed, but this is an orthogonal issue: it is the existential type that is being broken here, while the optimization/bug (nice name!) breaks signature ascription, even in absence of actually abstract types in the signature. (I do agree that those are related issues and that a real solution for the existential problem would possibly also solve the optimization/bug problem. My initial mention of this work was more a way to say "by the way, other people have been unhappy about this feature/bug for yet another reason, so it starts to look like we could make more people happy than unhappy with this change.".) |
Comment author: @mmottl It hasn't occurred to me that people might want to polymorphically compare modules, because they usually contain closures. But since they can also be used as extensible records with ordinary value fields, that's certainly a sane thing to do. This is indeed another argument against the current semantics, because it not only affects memory management but also program logic, e.g.: module type S = sig val x : int end let () =
|
Comment author: @gasche Let's fix this! Who volunteers to write the patch? |
Comment author: @Chris00 Does this also happens for objects? If an object t is coerced, (t :> s), and only this latter object is reachable, are the values only reachable from the hidden methods of t collected? |
Comment author: @xavierleroy @Christophe Troestler: I'm speaking from memory here, but I'm pretty sure that coercions (t :> s) generate no code and don't change the representation of t. (Just like an ordinary type constraint.) @ALL: yes, I agree this "prefix" optimization on structure thinning should go away, for reasons of space safety (this PR) and of security (as pointed out by the Lafosec study). The main reason for this optimization, as far as I can remember, was not so much to save execution time but to avoid generating big code to copy & thin the structure. I have no idea how often it fires in practice, though. |
Comment author: @alainfrisch A lightly tested patch is attached. I haven't even tried to evaluate the impact on performance / code size. |
Comment author: @gasche I find it a bit vexing that this patch makes the relevant part of the code slightly more complex, rather than less (which is why you would except after disabling an optimization except in its most trivial case). I have tried to get a better formulation, but failed. I think the reason why checking "subset with identity coercions" is easier than "bijection with identity coercions" is that the code is doing a subtyping test, so naturally considers only the fields of sig1 that also appear in sig2. In this context the optimization for prefix is actually more natural than the optimization for whole-structures. Count this as a token of support for the patch: I have reviewed it and believe it is correct and cannot be much improved. I think we should add a comment around the "if len1 = len2" test explaining why the heck we care about that, mentioning the memory leak issue. |
Comment author: @damiendoligez Patch applied and committed (rev 13735) |
Original bug ID: 5098
Reporter: milanst
Status: closed (set by @damiendoligez on 2013-06-03T14:47:06Z)
Resolution: fixed
Priority: normal
Severity: minor
Version: 3.12.0+beta1 or 3.12.0+rc1
Target version: 4.01.0+dev
Fixed in version: 4.01.0+dev
Category: ~DO NOT USE (was: OCaml general)
Has duplicate: #5932
Related to: #5931
Monitored by: @gasche "Pascal Cuoq" @ygrek sweeks "Julien Signoles" @hcarty yminsky @Chris00 @mmottl
Bug description
compiler optimizes expressions (module T : S) and avoids block creation if values in T line up with S. The consequence is that values in T that are hidden by S are kept around for as long as the resulting module value is not collected, although they can't be accessed.
Additional information
in the following code
module type S = sig
val x : int
end
let pack1 x =
(module struct
let x = x
let should_be_collected = "but it isn't"
end : S)
let pack2 x =
(module struct
let should_be_collected = "and this one is"
let x = x
end : S)
let () =
let a1 = pack1 0 in
let a2 = pack2 0 in
...
If we check representation of [a1] we can see that it is a block with 2 elements, int and a string should_be_collected. [a2] on the other had is a block with one element, just an int.
I think that optimization is case of [pack1] leads to a memory leak.
The optimization should make a stricter check, namely in (module T:S) the allocation should be omitted only if representation of T is exactly the same as S, which is probably the case in practice.
File attachments
The text was updated successfully, but these errors were encountered: