Mantis Bug Tracker

View Issue Details Jump to Notes ] Issue History ] Print ]
IDProjectCategoryView StatusDate SubmittedLast Update
0005098OCamlOCaml generalpublic2010-07-08 23:172013-06-03 16:47
Assigned To 
PlatformOSOS Version
Product Version3.12.0+beta1 or 3.12.0+rc1 
Target Version4.01.0+devFixed in Version4.01.0+dev 
Summary0005098: creating module values may lead to memory leaks
Descriptioncompiler optimizes expressions (module T : S) and avoids block creation if values in T line up with S. The consequence is that values in T that are hidden by S are kept around for as long as the resulting module value is not collected, although they can't be accessed.
Additional Informationin the following code
module type S = sig
  val x : int

let pack1 x =
  (module struct
     let x = x
     let should_be_collected = "but it isn't"
  end : S)

let pack2 x =
  (module struct
     let should_be_collected = "and this one is"
     let x = x
  end : S)

let () =
  let a1 = pack1 0 in
  let a2 = pack2 0 in

If we check representation of [a1] we can see that it is a block with 2 elements, int and a string should_be_collected. [a2] on the other had is a block with one element, just an int.
I think that optimization is case of [pack1] leads to a memory leak.

The optimization should make a stricter check, namely in (module T:S) the allocation should be omitted only if representation of T is exactly the same as S, which is probably the case in practice.
TagsNo tags attached.
Attached Filesdiff file icon pr5098.diff [^] (2,728 bytes) 2013-02-28 15:02 [Show Content]

- Relationships
has duplicate 0005932resolvedfrisch Modules can leak memory 
related to 0005931closed Inefficient record duplication 

-  Notes
frisch (developer)
2010-07-09 10:12

Note that there is no allocation occurring when evaluating (module T : S) (if T is already evaluated). In fact, the problem you mentions is not directly related to turning modules into first-class values, although I agree that the new feature makes the issue more visible.

Before first-class modules, you could already capture a module in a closure and face the same problem:

let f =
  let module M : sig val x : int end = struct
    let x = 1
    let s = "...."
  fun () -> M.x

I believe the issue is minor, though, and can be avoided by explicitly copying the fields you want to keep. Better, the "include" mechanism can do it for you:

let cleanup x =
  let module M = (val x : S) in
  let module M = struct include M end in
  (module M : S)

This is not to say that the optimization which avoids truncating a structure at runtime couldn't be questioned.
gasche (developer)
2013-02-25 15:21
edited on: 2013-02-25 15:22

Markus Mottl relayed this surprise and change request in the discussion of bug 0005931 ( [^] ), and the duplicate issue 0005932. Interestingly, it reportedly also mirrors a recommendation resulting from the LAFOSEC security-oriented study of the OCaml programming language: as the polymorphic equality will still structurally compare the fields hidden from the signature, their non-erasure breaks type abstraction.

We have no hard data to evaluate this, but a mild guess of the performance benefits of this optimization would be: "it doesn't change performances in practice".

I think we could consider changing this (removing the optimization of order-relevant signature prefixes, to only avoid the copy for exact signature correspondence). Markus or anyone, would you care to submit a patch for this? No guarantee that it will be implemented in the end (someone may still come up with a deadly real-world example where it is application-critical that no copy is made, in which case Alain remark that enforcing a copy is still expressible in the current system might prevail), but otherwise it will be "when someone else that cares look at it".

frisch (developer)
2013-02-25 15:31

Even without the "optimization/bug", polymorphic equality still breaks type abstraction.

# module type S = sig type t val x : t end;;
module type S = sig type t val x : t end
# (module struct type t = int let x = 1 end : S) = (module struct type t = bool let x = true end : S);;
- : bool = true

But the same can be achieved with (i) exceptions and (ii) GADTs.
gasche (developer)
2013-02-25 15:33
edited on: 2013-02-25 15:35

Indeed, but this is an orthogonal issue: it is the existential type that is being broken here, while the optimization/bug (nice name!) breaks signature ascription, even in absence of actually abstract types in the signature.

(I do agree that those are related issues and that a real solution for the existential problem would possibly also solve the optimization/bug problem. My initial mention of this work was more a way to say "by the way, other people have been unhappy about this feature/bug for yet another reason, so it starts to look like we could make more people happy than unhappy with this change.".)

mottl (reporter)
2013-02-25 15:47

It hasn't occurred to me that people might want to polymorphically compare modules, because they usually contain closures. But since they can also be used as extensible records with ordinary value fields, that's certainly a sane thing to do. This is indeed another argument against the current semantics, because it not only affects memory management but also program logic, e.g.:

module type S = sig val x : int end
module type T = sig include S val y : int end

let () =
  let module S = struct let x = 42 end in
  let t = (module struct include S let y = 3 end : S) in
  (* let t = (module struct include (struct include S let y = 3 end : S) end : S) in *)
  assert ((module S : S) = t)

This program should succeed, but it doesn't. Uncommenting the alternative definition of "t" will yield the solution that a programmer implementing the above would likely have intended.

I also agree that this "optimization" of not allocating a new module will likely not matter that much in practice anyway. If the source and target module signatures are the same, then, of course, it should be done and will be sound. But otherwise there is no guarantee anyway that module signatures will line up properly. So depending on the ordering of module entries (e.g. changing the order of an "include") we may get different semantics (or at least memory leaks).

If this issue came up for a vote, I'd clearly fall into the "lets fix this" camp.
gasche (developer)
2013-02-25 17:22

Let's fix this! Who volunteers to write the patch?
Christophe Troestler (reporter)
2013-02-25 19:03

Does this also happens for objects? If an object t is coerced, (t :> s), and only this latter object is reachable, are the values only reachable from the hidden methods of t collected?
xleroy (administrator)
2013-02-25 19:37

@Christophe Troestler: I'm speaking from memory here, but I'm pretty sure that coercions (t :> s) generate no code and don't change the representation of t. (Just like an ordinary type constraint.)

@all: yes, I agree this "prefix" optimization on structure thinning should go away, for reasons of space safety (this PR) and of security (as pointed out by the Lafosec study). The main reason for this optimization, as far as I can remember, was not so much to save execution time but to avoid generating big code to copy & thin the structure. I have no idea how often it fires in practice, though.
frisch (developer)
2013-02-28 15:03

A lightly tested patch is attached. I haven't even tried to evaluate the impact on performance / code size.
gasche (developer)
2013-03-01 20:43
edited on: 2013-03-01 20:44

I find it a bit vexing that this patch makes the relevant part of the code slightly more complex, rather than less (which is why you would except after disabling an optimization except in its most trivial case).

I have tried to get a better formulation, but failed. I think the reason why checking "subset with identity coercions" is easier than "bijection with identity coercions" is that the code is doing a subtyping test, so naturally considers only the fields of sig1 that also appear in sig2. In this context the optimization for prefix is actually more natural than the optimization for whole-structures.

Count this as a token of support for the patch: I have reviewed it and believe it is correct and cannot be much improved. I think we should add a comment around the "if len1 = len2" test explaining why the heck we care about that, mentioning the memory leak issue.

doligez (administrator)
2013-06-03 16:47

Patch applied and committed (rev 13735)

- Issue History
Date Modified Username Field Change
2010-07-08 23:17 milanst New Issue
2010-07-09 10:12 frisch Note Added: 0005596
2011-05-20 18:16 doligez Status new => acknowledged
2012-07-10 20:15 doligez Target Version => 4.01.0+dev
2012-07-31 13:36 doligez Target Version 4.01.0+dev => 4.00.1+dev
2012-09-18 15:23 doligez Target Version 4.00.1+dev => 4.00.2+dev
2013-02-25 09:42 frisch Relationship added has duplicate 0005932
2013-02-25 15:21 gasche Note Added: 0008917
2013-02-25 15:22 gasche Note Edited: 0008917 View Revisions
2013-02-25 15:23 gasche Relationship added related to 0005931
2013-02-25 15:31 frisch Note Added: 0008918
2013-02-25 15:33 gasche Note Added: 0008919
2013-02-25 15:35 gasche Note Edited: 0008919 View Revisions
2013-02-25 15:47 mottl Note Added: 0008920
2013-02-25 17:22 gasche Note Added: 0008921
2013-02-25 19:03 Christophe Troestler Note Added: 0008922
2013-02-25 19:37 xleroy Note Added: 0008923
2013-02-28 15:02 frisch File Added: pr5098.diff
2013-02-28 15:03 frisch Note Added: 0008937
2013-03-01 20:43 gasche Note Added: 0008943
2013-03-01 20:44 gasche Note Edited: 0008943 View Revisions
2013-06-03 16:47 doligez Note Added: 0009383
2013-06-03 16:47 doligez Status acknowledged => closed
2013-06-03 16:47 doligez Resolution open => fixed
2013-06-03 16:47 doligez Fixed in Version => 4.01.0+dev
2013-06-03 16:47 doligez Target Version 4.00.2+dev => 4.01.0+dev

Copyright © 2000 - 2011 MantisBT Group
Powered by Mantis Bugtracker