Mantis Bug Tracker

View Issue Details Jump to Notes ] Issue History ] Print ]
IDProjectCategoryView StatusDate SubmittedLast Update
0006307OCamlOCaml typingpublic2014-01-27 09:542014-03-10 11:50
Reporterfrisch 
Assigned Togarrigue 
PrioritynormalSeverityminorReproducibilityhave not tried
StatusresolvedResolutionfixed 
PlatformOSOS Version
Product Version 
Target VersionFixed in Version4.02.0+dev 
Summary0006307: Behavior of 'module type of' w.r.t. module aliases
DescriptionThe introduction of module aliases can break code which uses 'module type of'. Example:

===================================================================
module A1 = struct end
module A2 = struct end

module L1 = struct
  module X = A1
end

module L2 = struct
  module X = A2
end

module F (L : (module type of L1)) = struct end

module F1 = F(L1)
module F2 = F(L2)
=================================================================== 


Previously, the type of F's argument was "sig module X : sig end end", and now it is "sig module X = A1 end", making L2 incompatible with it.

A discussion about what to do started on caml-devel. Here is Jacques' reply to my email:



> What should we do about it?

An easy fix when the submodule contains no abstract types is to add an annotation on module X:

module L1 = struct
  module X : module type of A1 = A1
end

> I'm tempted to argue in favor of generalizing the special treatment of 'module type of' w.r.t. module aliases to submodules to keep backward compatibility. This seems also more coherent with the treatment of 'module type of' w.r.t. type aliases.

I'm not sure of what you mean by your second sentence.
'module type of' does keep type aliases.
The only specific behavior is that it does not strengthen the types, i.e. it does not add aliases to abstract types.
This is not what happens here for the module aliases: the alias comes from the definition in L1, not from module type of.
So this would not just mean not adding aliases, but actively removing them from the signature, recursively.

This said I agree that the choice not to strengthen abstract types had your use case in mind.
Since the problem we see here is again specific to this use case, it could be reasonable to solve it in the same direction.
However, there is also a new problem here: if we do not leave the alias by default, we need a way to reintroduce it.
Currently "(module type of L1) with module X = A1" does not reintroduce an alias.
I suppose it should introduce one, so this is rather a bug.

> What do you think?

I'm not sure what is the best solution.
In particular, experience has shown that not strengthening 'module type of' was actually a bad idea.
In at least 95% of cases you want the strengthened version, and doing the strengthening by hand is a pain.
The use case you describe here seems pretty rare, to the point I was even wondering wether we should not revert to the strengthening semantics. But, as rare as it is, you demonstrated that it happens in practice, so maybe we need two versions of 'module type of'.

I suppose it also relates to the expected semantics of 'module type of'.
Is it just something which looks up a module type in the environment,
or is it something more meta-level, which behaves like syntactically copying the module type.
PR#6305 is an example of that: there the intuition is that 'module type of A' should not introduce a dependency on A, since we do not access any internal part of A, but this would require some specific treatment.
http://caml.inria.fr/mantis/view.php?id=6305 [^]
Tagspatch
Attached Filesdiff file icon remove_aliases.diff [^] (2,098 bytes) 2014-01-31 07:47 [Show Content]
diff file icon alias_in_recmodule.diff [^] (586 bytes) 2014-03-05 06:52 [Show Content]
diff file icon remove_alias_correct.diff [^] (12,039 bytes) 2014-03-06 04:03 [Show Content]

- Relationships

-  Notes
(0010847)
gasche (developer)
2014-01-27 11:20

My gut feeling (with a healthy disrespect for what the theory or
implementation actually allow) is that the code example you show is
fundamentally wrong. One should not use (module type of L1) in
signature (contravariant) position after just experimenting with this
fragile construction and seeing that, at one point in time, it happens
to do more or less what you want to save you the burden of writing
a proper signature. Of course that was going to break as soon as we
refined the signature language in a way that distinguishes L1 and L2.

Instead, I would expect having to write something such as:

    module L1 = struct
      module X = A1
      type t = ...
      ...
    end
    
    module type S = sig
      module X : sig .. end
      type t
      include (module type of L1 with module X := X and type t := t)
    end
    
    module L2 (* : S with ... *) = struct
      module X = A2
    end
    
    module F (L : S) = ...

Where the S signature re-abstracts over all type and module components
of L1 that are specific, in a way that removes some boilerplate
(over giving the full signature) but protects again future
refinements.

I guess my message is that we should let "module type of" be used to
avoid repeating value components, but should discourage uses that
abusively capture (module) types, to avoid such identity issues in the
future.

I would be in favor of switching (module type) to the strengthened
interpretation as soon as possible -- if we can accept any break of an
experimental feature at all. Maybe there are language constructs that
we could add to simplify weakening strong signatures, or strengthen
weak ones.
(0010852)
frisch (developer)
2014-01-28 09:43

> An easy fix when the submodule contains no abstract types is to add an
> annotation on module X:
>
> module L1 = struct
> module X : module type of A1 = A1
> end

Another fix (which applies also if the submodule contains abstract types and we want to keep a strengthened version):

module L1 = struct
  module X = struct include A1 end
end

Actually, rewriting:

  module X = A

to

  module X = struct include A end

seems to be a way to recover the pre-alias semantics in a systematic way (it could be automated by a ppx filter).
(0010865)
frisch (developer)
2014-01-30 13:24

In which use cases would one prefer the current behavior of 'module type of' w.r.t. module aliases?
(0010871)
garrigue (manager)
2014-01-31 01:48

> In which use cases would one prefer the current behavior of 'module type of' w.r.t. module aliases?

In the case you want to keep the information that the submodules keep their identity.
For instance, if you want to use them as argument to applicative functors.

This said, this may be too restrictive even when you want to use this for an extended implementation, as this means that you cannot reimplement submodules.

So I tend to agree with you that 'module type of' should remove module aliases, independently of whether strengthening was a good idea or not.
(0010872)
garrigue (manager)
2014-01-31 02:25

OK, I added a patch that does what you want: it removes aliases in submodules too.

However, I need to think a bit more about the soundness aspect.
Indeed, this patch allows to create ill-typed signatures: since we remove aliases, some declaration lower in the signature, which depended on the alias, could be ill-typed.
If the signature contains no values, one can turn it into a module using 'module rec'.
However this does not seem to be the case here, as for this to make sense there should be values in some module.
So probably this only means that this signature would not match any module (included the original module), which is bad, but not unsound. Still, we need to be careful.

I we additionally have to check for the well-typedness of the signature, this is much more work :(
(0010877)
frisch (developer)
2014-02-04 11:07

Which case do you see of a signature depending on an alias (and which breaks when the alias is expanded)?
(0010882)
garrigue (manager)
2014-02-05 05:31

Here is an example:

module F (X : sig end) = struct type t = A end
module G (X : sig type t end) (Y : sig type t = X.t end) =
  struct type t = B end
module M1 = struct end
module N1 = F(M1)
module M = struct
  module M2 = M1
  module N2 = F(M2)
  module H = G(N1)(N2)
end

module type S = module type of M

The functor G only accepts modules containing the same type t, but in the signature S this is not the case, since F was applied on different modules. So S is invalid, and would be refused if input explicitly.
Note however that we can still coerce M to S:

module M' : S = M

I'm not yet sure whether this creates problems or not...
And there are other examples of signatures one cannot write directly.
(0010885)
lpw25 (developer)
2014-02-05 17:22

Personally, I think that `(module type of ...)` should be changed to give the strengthened type, and it should include module aliases.

For example, an obvious use for `(module type of ...)` is to allow you to write:

  include (module type of Foo)

as the interface for a module which had:

  include Foo

in its implementation. In this case it is clear that you want the strengthened version and the module aliases included.

> This said, this may be too restrictive even when you want to use this for an extended implementation, as this means that you cannot reimplement submodules.

If you want to reimplement submodules then you just need to use:

  module M : (module type of Foo.M)

  include (module type of Foo) with module M := M

This is probably a good thing as it makes your re-implementation obvious.

(An even better way would be to write:

  include (module type of Foo) with module M : (module type of Foo.M)

but for some reason we do not allow `with module` to specify the type of a module explicitly.)
(0010886)
lpw25 (developer)
2014-02-05 17:25
edited on: 2014-02-05 17:30

> If you want to reimplement submodules then you just need to use:
>
> module M : (module type of Foo.M)
>
> include (module type of Foo) with module M := M
>
>This is probably a good thing as it makes your re-implementation obvious.

Actually, maybe this doesn't work. I think that I am using "with module" backwards.

(0010890)
garrigue (manager)
2014-02-06 02:29

Indeed, the problem with strengthening by default is that 'with' is only allowed to make a signature more specific, not less.

Also, concerning module aliases, it seems that an expected property of 'module type of' is that it should generate a signature with as little dependencies as possible. If we keep aliases, we introduce a dependency.

So having a 'module type of' remove module aliases seems a good idea to me.
My only concerns are:
* soundness as we may create invalid signature
* robustness as one expects that if module M' = M, then one should have module M' : module type of M
* whether there is some need to add aliases to a signature using 'with'
   unfortunately using the current construct to do that would not be backward compatible, so we are probably stuck with writing

  module type T' = sig
    module M2 = M1
    include T with module M2 := M2
  end

An additional question is whether to remove aliases inside functors too (the current patch only does it for submodules).
(0010894)
lpw25 (developer)
2014-02-06 10:15

> whether there is some need to add aliases to a signature using 'with' unfortunately using the current construct to do that would not be backward compatible

Is this just because currently module aliases are not produced by functors? For example, I think on trunk the following is not allowed:

  module F (X : S) : sig module M = X end = struct
    module M = X
  end

Which means that many current uses of `with module` do not want aliases. Although I think we could support module aliases in the result of functors quite easily (I don't think it gets tricky until you allow functor application in the paths being aliased).

Or is there some other major cause of backwards incompatibility?
(0010905)
garrigue (manager)
2014-02-07 00:26

Well, actually you can add an alias to a signature.
The reason is the rather strange behavior of 'with module' (already reported a long time ago in a bug report by Yaron I think).
Namely, while the specification of 'with module M = Q' is to add to the type paths M.P.t aliases to Q.P.t, what actually happens is that the signature of M is replaced by the strengthened signature of P.
So you can do the following:

module type S1 = sig module M : sig type t end end;;
module M = struct type t = int let v = 1 end;;
module M' = M;;
module type S1'' = S1 with module M = M';;

You can also see the strangeness (which is not new) by trying:

module type S1' = S1 with module M = M;;

Concerning functors, this is completely unrelated.
The reason we do not allow taking aliased of functor parameters is that is complicates compilation.
A module alias behaves as a direct reference, so it needs to have the same representation as its target.
However, in the case of functor parameters, functor application also does a coercion, so in practice this cannot be a direct reference, at least not outside of the functor.
Another solution could be to allow aliases of functor parameters, while expanding them when exiting the body, but I don't see much use for that.
(0010911)
lpw25 (developer)
2014-02-07 14:52

> The reason we do not allow taking aliased of functor parameters is that is complicates compilation.
> A module alias behaves as a direct reference, so it needs to have the same representation as its target.
> However, in the case of functor parameters, functor application also does a coercion, so in practice this cannot be a direct reference, at least not outside of the functor.

I think that it actually *is* safe to allow module aliases of
functor parameters.

If we consider an example:

  (* Some arbitrary module types where S can be coerced to T
     which can be coerced to U *)
  module type S = sig type t val a : t val b : t end
  module type T = sig type t val a : t end
  module type U = sig type t end

  (* A functor expecting U whose result includes a (generative)
     type *)
  module G (Y : U) = struct type t = T end

  (* A module of type S *)
  module M : S = struct type t = T let a = T let b = T end

  (* A functor expecting T whose result includes an alias and
     some uses of that alias *)
  module F (X : S) = struct
    module I = X
    type t = I.t
    module J = G(I)
  end

  (* F applied to M *)
  module N = F(M)

As you say, it is obviously safe to have `I` be a module alias
within the body of `F`. The question is over the safety of
strengthening this module alias when `F` is applied. In other
words, is it safe for `N` to have the type:

  sig
    module I = M
    type t = I.t
    module J : sig
      type t = G(I).t = T
    end
  end

This means that whilst `I` was an alias to `M : S` within the
body it is an alias to `M` outside of it. However, I think that
this is in fact safe.

There are two places where `I` can appear in the output of `F`:
as a direct part of a path (e.g. `I.t`), or as the applicative
argument to a functor within a path (e.g. `G(I).t`).

When used directly in a path, this is clearly safe because the
current semantics of strengthening would add equalities for all
its type members anyway.

When used as the argument to a functor things are more subtle,
but it is still safe. The functor (in this case `G`) will expect
an argument with some type that `M` can be coerced to (in this
case `U`). The application within `F` is equivalent to `G(M : S :
U)` whilst the application outside of `F` is equivalent to `G(M :
U)`. However, `M : S : U` is equal to `M : U`, so the
applications are in fact the same.

Note that, in this discussion, when I have written `M : S` I mean
the strengthened module obtained by restricting `M` to the
members of `S` and strengthening all generative types to have
aliases to their counterparts in `M`. This is not the same as
what you get when you write `(M : S)` in OCaml, which is why
`G(M)` and `G((M : U))` have different module types in OCaml.

It is possible that I have missed something, but I think this
argument justifies the safety of allowing module aliases to
functor arguments.
(0010912)
lpw25 (developer)
2014-02-07 15:03

> Well, actually you can add an alias to a signature.

Does this mean that making `S with module X = Y` introduce a type alias *would* be (mostly) backwards compatible? It certainly seems that the semantics you described closely match what you would get by introducing a module alias.

In general, I have always thought that the feature:

  S with module X = Y

didn't really make sense. You couldn't write `module X = Y` inside `S`, so how could you possibly add it to `S`. The more sensible feature would be:

  S with module X : T

since you are then adding a statement which could actually have been in `S`.

However, with the arrival of module aliases, this can now make sense: you could indeed have written `module X = Y` within `S`.

Indeed, the main criticism in Yaron's bug report was that the behaviour of `with module` did not match something which could be produced by a functor. But if module aliases to functor arguments were allowed then this criticism would disappear (ignoring for now things like `S with module X = F(M)`).
(0010992)
garrigue (manager)
2014-03-05 06:51

It took me a long time, but I found an example showing that producing a malformed signature is dangerous in practice.
The conclusion is that if we want to remove aliases from signatures produced by "module type of" we need to check that the resulting signature is well-formed. I have no idea how much code this will require...

The example is below.
Note that running it in the toplevel requires a further patch, which allows using module aliases inside "pure recursive modules", i.e. recursive modules directly extracted from a signature without any implementation.
Without this patch, the signature must be put in a mli file, using the -trans-mod flag so that no implementation is required.
Note also that this is not strictly unsound, as all implementation types are kept, but it allows invariants being broken, for instance here the internal representation of sets can be corrupted.

(* Counter example *)
module Int = struct type t = int let compare = compare end
module SInt = Set.Make(Int)
type (_,_) eq = Eq : ('a,'a) eq
type wrap = W of (SInt.t, SInt.t) eq

module M = struct
  module I = Int
  type wrap' = wrap = W of (Set.Make(Int).t, Set.Make(I).t) eq
end

module type S = module type of M

module M' : S = M;;
module rec MR : S = MR;; (* not well defined *)
MR.I.compare 1 2;; (* delayed failure *)

module Int2 = struct type t = int let compare x y = compare y x end

module type S' = sig
  module I = Int2
  include S with module I := I
end

module rec M2 : S' = M2;; (* should succeed! (but this is bad) *)

let M2.W eq = W Eq;;

let s = List.fold_right SInt.add [1;2;3] SInt.empty;;
module SInt2 = Set.Make(Int2);;
let conv : type a b. (a,b) eq -> a -> b = fun Eq x -> x;;
let s' : SInt2.t = conv eq s;;
SInt2.elements s';;
SInt2.mem 2 s';; (* invariants are broken: 2 cannot be found *)
(0010993)
lpw25 (developer)
2014-03-05 09:45

> The conclusion is that if we want to remove aliases from signatures produced by "module type of" we need to check that the resulting signature is well-formed. I have no idea how much code this will require...

If you're going to check that a signature is well-formed after an operation then you might as well allow destructive substitution to weaken signatures and then check that the result is well-formed. Then `module type of` could be fully strengthened.
(0010996)
garrigue (manager)
2014-03-06 04:05

I added a new correct patch.
It checks whether an alias is used as argument in a path inside the signature, and does not remove it in that case.
(0010998)
garrigue (manager)
2014-03-06 07:41

A few answers to Leo:
> If you're going to check that a signature is well-formed after an operation then you might as well allow destructive substitution to weaken signatures and then check that the result is well-formed.

The approach in my patch is to ensure that the signature is well-formed, by adopting a conservative policy, but I do not check it after hand. This would be more complicated, as the code doing that is mixed with the translation from untyped syntax...

> Does this mean that making `S with module X = Y` introduce a type alias *would* be (mostly) backwards compatible? It certainly seems that the semantics you described closely match what you would get by introducing a module alias.

I just showed that you can introduce an alias in a signature if you want to.
The signature obtained in such a way is certainly not equivalent to the weaker one, since it contains a stronger requirement, and Alain showed how this can break code.

> I think that it actually *is* safe to allow module aliases of functor parameters.

We are clearly talking about different problems.
From a type theoretic point of view it is ok to allow module aliases of functor parameters.
However, this subtly change the semantics.
The current semantics of OCaml is that when a functor receives an argument, it is coerced to the parameter signature. If the functor returns this argument unchanged, it still returns the coerced version, with a smaller signature.
However, if you allow module aliases to propagate through functors, the argument would be returned with its original, unrestricted signature.
This is interesting, but incompatible.

let y = 3
module M = struct let x = 1 let y = 2 end
module F(X : sig val x : int end) = struct module N = X end
module FM = F(M)
open FM.N
let () = print_int y

With the current semantics, this would print 3.
If we allow propagation of module aliases (in its current form), this would print 2.

Note that we could also develop a conservative semantics, which would keep both the alias and the restricted signature. I tried, but this is a lot of work, as the two approaches have to be handled in parallel.
(0011000)
lpw25 (developer)
2014-03-06 10:00

> The approach in my patch is to ensure that the signature is well-formed, by adopting a conservative policy, but I do not check it after hand. This would be more complicated, as the code doing that is mixed with the translation from untyped syntax...

Couldn't you apply this conservative policy to destructive substitutions, so that:

  module M : (module type of Foo.M)

  include (module type of Foo) with module M := M

would allow people to remove the module aliases themselves if they needed to.

In general, I think it would be better for `module type of` to provide the most strengthened version of the signature possible, and then support a means to remove aliases when that is safe.

> let y = 3
> module M = struct let x = 1 let y = 2 end
> module F(X : sig val x : int end) = struct module N = X end
> module FM = F(M)
> open FM.N
> let () = print_int y

Is opening the only place where a larger signature can cause unexpected errors? If so, then perhaps we could provide a warning in such cases, on by default for the first few versions after the change.
(0011003)
frisch (developer)
2014-03-06 10:29

> Is opening the only place where a larger signature can cause unexpected errors?

Another case with "include" where a larger signature can cause an error:

module M = struct type t end;;

module N = struct include M type t end
  (* => "Multiple definition of type name t" *)

module N = struct include (M : sig end) type t end
  (* ok *)
(0011004)
lpw25 (developer)
2014-03-06 10:55

> Another case with "include" where a larger signature can cause an error:

Or even:

  # module X = struct let x = 3 end;;
  module X : sig val x : int end

  # let x = 1;;
  val x : int = 1

  # module Y = struct include X let y = x end;;
  module Y : sig val x : int val y : int end

  # Y.y;;
  - : int = 3

  # module Y = struct include (X : sig end) let y = x end;;
  module Y : sig val y : int end

  # Y.y;;
  - : int = 1

Although, I think these could also be detected and warned about.
(0011035)
garrigue (manager)
2014-03-10 11:50

Fixed in trunk at revision 14451, applying remove_alias_correct.diff.

- Issue History
Date Modified Username Field Change
2014-01-27 09:54 frisch New Issue
2014-01-27 11:20 gasche Note Added: 0010847
2014-01-28 09:43 frisch Note Added: 0010852
2014-01-30 13:24 frisch Note Added: 0010865
2014-01-31 01:48 garrigue Note Added: 0010871
2014-01-31 02:15 garrigue File Added: remove_aliases.diff
2014-01-31 02:25 garrigue Note Added: 0010872
2014-01-31 07:47 garrigue File Deleted: remove_aliases.diff
2014-01-31 07:47 garrigue File Added: remove_aliases.diff
2014-02-04 11:07 frisch Note Added: 0010877
2014-02-05 05:31 garrigue Note Added: 0010882
2014-02-05 17:22 lpw25 Note Added: 0010885
2014-02-05 17:25 lpw25 Note Added: 0010886
2014-02-05 17:26 lpw25 Note Edited: 0010886 View Revisions
2014-02-05 17:30 lpw25 Note Edited: 0010886 View Revisions
2014-02-06 02:29 garrigue Note Added: 0010890
2014-02-06 10:15 lpw25 Note Added: 0010894
2014-02-07 00:26 garrigue Note Added: 0010905
2014-02-07 14:52 lpw25 Note Added: 0010911
2014-02-07 15:03 lpw25 Note Added: 0010912
2014-02-19 20:09 doligez Tag Attached: patch
2014-02-19 20:30 doligez Relationship added child of 0006325
2014-02-19 20:30 doligez Relationship deleted child of 0006325
2014-02-19 20:30 doligez Relationship added related to 0006325
2014-03-05 06:51 garrigue Note Added: 0010992
2014-03-05 06:51 garrigue Assigned To => garrigue
2014-03-05 06:51 garrigue Status new => acknowledged
2014-03-05 06:52 garrigue File Added: alias_in_recmodule.diff
2014-03-05 09:45 lpw25 Note Added: 0010993
2014-03-06 04:03 garrigue File Added: remove_alias_correct.diff
2014-03-06 04:05 garrigue Note Added: 0010996
2014-03-06 07:41 garrigue Note Added: 0010998
2014-03-06 10:00 lpw25 Note Added: 0011000
2014-03-06 10:29 frisch Note Added: 0011003
2014-03-06 10:55 lpw25 Note Added: 0011004
2014-03-10 11:50 garrigue Note Added: 0011035
2014-03-10 11:50 garrigue Status acknowledged => resolved
2014-03-10 11:50 garrigue Fixed in Version => 4.02.0+dev
2014-03-10 11:50 garrigue Resolution open => fixed


Copyright © 2000 - 2011 MantisBT Group
Powered by Mantis Bugtracker