Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Behavior of 'module type of' w.r.t. module aliases #6307

Closed
vicuna opened this issue Jan 27, 2014 · 22 comments
Closed

Behavior of 'module type of' w.r.t. module aliases #6307

vicuna opened this issue Jan 27, 2014 · 22 comments
Assignees

Comments

@vicuna
Copy link

vicuna commented Jan 27, 2014

Original bug ID: 6307
Reporter: @alainfrisch
Assigned to: @garrigue
Status: closed (set by @xavierleroy on 2015-12-11T18:25:46Z)
Resolution: fixed
Priority: normal
Severity: minor
Fixed in version: 4.02.0+dev
Category: typing
Tags: patch
Related to: #6325
Monitored by: @gasche @diml @jmeber "Julien Signoles" @yallop @hcarty

Bug description

The introduction of module aliases can break code which uses 'module type of'. Example:

===================================================================
module A1 = struct end
module A2 = struct end

module L1 = struct
  module X = A1
end

module L2 = struct
  module X = A2
end

module F (L : (module type of L1)) = struct end

module F1 = F(L1)
module F2 = F(L2)
=================================================================== 

Previously, the type of F's argument was "sig module X : sig end end", and now it is "sig module X = A1 end", making L2 incompatible with it.

A discussion about what to do started on caml-devel. Here is Jacques' reply to my email:

What should we do about it?

An easy fix when the submodule contains no abstract types is to add an annotation on module X:

module L1 = struct
module X : module type of A1 = A1
end

I'm tempted to argue in favor of generalizing the special treatment of 'module type of' w.r.t. module aliases to submodules to keep backward compatibility. This seems also more coherent with the treatment of 'module type of' w.r.t. type aliases.

I'm not sure of what you mean by your second sentence.
'module type of' does keep type aliases.
The only specific behavior is that it does not strengthen the types, i.e. it does not add aliases to abstract types.
This is not what happens here for the module aliases: the alias comes from the definition in L1, not from module type of.
So this would not just mean not adding aliases, but actively removing them from the signature, recursively.

This said I agree that the choice not to strengthen abstract types had your use case in mind.
Since the problem we see here is again specific to this use case, it could be reasonable to solve it in the same direction.
However, there is also a new problem here: if we do not leave the alias by default, we need a way to reintroduce it.
Currently "(module type of L1) with module X = A1" does not reintroduce an alias.
I suppose it should introduce one, so this is rather a bug.

What do you think?

I'm not sure what is the best solution.
In particular, experience has shown that not strengthening 'module type of' was actually a bad idea.
In at least 95% of cases you want the strengthened version, and doing the strengthening by hand is a pain.
The use case you describe here seems pretty rare, to the point I was even wondering wether we should not revert to the strengthening semantics. But, as rare as it is, you demonstrated that it happens in practice, so maybe we need two versions of 'module type of'.

I suppose it also relates to the expected semantics of 'module type of'.
Is it just something which looks up a module type in the environment,
or is it something more meta-level, which behaves like syntactically copying the module type.
#6305 is an example of that: there the intuition is that 'module type of A' should not introduce a dependency on A, since we do not access any internal part of A, but this would require some specific treatment.
#6305

File attachments

@vicuna
Copy link
Author

vicuna commented Jan 27, 2014

Comment author: @gasche

My gut feeling (with a healthy disrespect for what the theory or
implementation actually allow) is that the code example you show is
fundamentally wrong. One should not use (module type of L1) in
signature (contravariant) position after just experimenting with this
fragile construction and seeing that, at one point in time, it happens
to do more or less what you want to save you the burden of writing
a proper signature. Of course that was going to break as soon as we
refined the signature language in a way that distinguishes L1 and L2.

Instead, I would expect having to write something such as:

module L1 = struct
  module X = A1
  type t = ...
  ...
end

module type S = sig
  module X : sig .. end
  type t
  include (module type of L1 with module X := X and type t := t)
end

module L2 (* : S with ... *) = struct
  module X = A2
end

module F (L : S) = ...

Where the S signature re-abstracts over all type and module components
of L1 that are specific, in a way that removes some boilerplate
(over giving the full signature) but protects again future
refinements.

I guess my message is that we should let "module type of" be used to
avoid repeating value components, but should discourage uses that
abusively capture (module) types, to avoid such identity issues in the
future.

I would be in favor of switching (module type) to the strengthened
interpretation as soon as possible -- if we can accept any break of an
experimental feature at all. Maybe there are language constructs that
we could add to simplify weakening strong signatures, or strengthen
weak ones.

@vicuna
Copy link
Author

vicuna commented Jan 28, 2014

Comment author: @alainfrisch

An easy fix when the submodule contains no abstract types is to add an
annotation on module X:

module L1 = struct
module X : module type of A1 = A1
end

Another fix (which applies also if the submodule contains abstract types and we want to keep a strengthened version):

module L1 = struct
module X = struct include A1 end
end

Actually, rewriting:

module X = A

to

module X = struct include A end

seems to be a way to recover the pre-alias semantics in a systematic way (it could be automated by a ppx filter).

@vicuna
Copy link
Author

vicuna commented Jan 30, 2014

Comment author: @alainfrisch

In which use cases would one prefer the current behavior of 'module type of' w.r.t. module aliases?

@vicuna
Copy link
Author

vicuna commented Jan 31, 2014

Comment author: @garrigue

In which use cases would one prefer the current behavior of 'module type of' w.r.t. module aliases?

In the case you want to keep the information that the submodules keep their identity.
For instance, if you want to use them as argument to applicative functors.

This said, this may be too restrictive even when you want to use this for an extended implementation, as this means that you cannot reimplement submodules.

So I tend to agree with you that 'module type of' should remove module aliases, independently of whether strengthening was a good idea or not.

@vicuna
Copy link
Author

vicuna commented Jan 31, 2014

Comment author: @garrigue

OK, I added a patch that does what you want: it removes aliases in submodules too.

However, I need to think a bit more about the soundness aspect.
Indeed, this patch allows to create ill-typed signatures: since we remove aliases, some declaration lower in the signature, which depended on the alias, could be ill-typed.
If the signature contains no values, one can turn it into a module using 'module rec'.
However this does not seem to be the case here, as for this to make sense there should be values in some module.
So probably this only means that this signature would not match any module (included the original module), which is bad, but not unsound. Still, we need to be careful.

I we additionally have to check for the well-typedness of the signature, this is much more work :(

@vicuna
Copy link
Author

vicuna commented Feb 4, 2014

Comment author: @alainfrisch

Which case do you see of a signature depending on an alias (and which breaks when the alias is expanded)?

@vicuna
Copy link
Author

vicuna commented Feb 5, 2014

Comment author: @garrigue

Here is an example:

module F (X : sig end) = struct type t = A end
module G (X : sig type t end) (Y : sig type t = X.t end) =
struct type t = B end
module M1 = struct end
module N1 = F(M1)
module M = struct
module M2 = M1
module N2 = F(M2)
module H = G(N1)(N2)
end

module type S = module type of M

The functor G only accepts modules containing the same type t, but in the signature S this is not the case, since F was applied on different modules. So S is invalid, and would be refused if input explicitly.
Note however that we can still coerce M to S:

module M' : S = M

I'm not yet sure whether this creates problems or not...
And there are other examples of signatures one cannot write directly.

@vicuna
Copy link
Author

vicuna commented Feb 5, 2014

Comment author: @lpw25

Personally, I think that (module type of ...) should be changed to give the strengthened type, and it should include module aliases.

For example, an obvious use for (module type of ...) is to allow you to write:

include (module type of Foo)

as the interface for a module which had:

include Foo

in its implementation. In this case it is clear that you want the strengthened version and the module aliases included.

This said, this may be too restrictive even when you want to use this for an extended implementation, as this means that you cannot reimplement submodules.

If you want to reimplement submodules then you just need to use:

module M : (module type of Foo.M)

include (module type of Foo) with module M := M

This is probably a good thing as it makes your re-implementation obvious.

(An even better way would be to write:

include (module type of Foo) with module M : (module type of Foo.M)

but for some reason we do not allow with module to specify the type of a module explicitly.)

@vicuna
Copy link
Author

vicuna commented Feb 5, 2014

Comment author: @lpw25

If you want to reimplement submodules then you just need to use:

module M : (module type of Foo.M)

include (module type of Foo) with module M := M

This is probably a good thing as it makes your re-implementation obvious.

Actually, maybe this doesn't work. I think that I am using "with module" backwards.

@vicuna
Copy link
Author

vicuna commented Feb 6, 2014

Comment author: @garrigue

Indeed, the problem with strengthening by default is that 'with' is only allowed to make a signature more specific, not less.

Also, concerning module aliases, it seems that an expected property of 'module type of' is that it should generate a signature with as little dependencies as possible. If we keep aliases, we introduce a dependency.

So having a 'module type of' remove module aliases seems a good idea to me.
My only concerns are:

  • soundness as we may create invalid signature

  • robustness as one expects that if module M' = M, then one should have module M' : module type of M

  • whether there is some need to add aliases to a signature using 'with'
    unfortunately using the current construct to do that would not be backward compatible, so we are probably stuck with writing

    module type T' = sig
    module M2 = M1
    include T with module M2 := M2
    end

An additional question is whether to remove aliases inside functors too (the current patch only does it for submodules).

@vicuna
Copy link
Author

vicuna commented Feb 6, 2014

Comment author: @lpw25

whether there is some need to add aliases to a signature using 'with' unfortunately using the current construct to do that would not be backward compatible

Is this just because currently module aliases are not produced by functors? For example, I think on trunk the following is not allowed:

module F (X : S) : sig module M = X end = struct
module M = X
end

Which means that many current uses of with module do not want aliases. Although I think we could support module aliases in the result of functors quite easily (I don't think it gets tricky until you allow functor application in the paths being aliased).

Or is there some other major cause of backwards incompatibility?

@vicuna
Copy link
Author

vicuna commented Feb 6, 2014

Comment author: @garrigue

Well, actually you can add an alias to a signature.
The reason is the rather strange behavior of 'with module' (already reported a long time ago in a bug report by Yaron I think).
Namely, while the specification of 'with module M = Q' is to add to the type paths M.P.t aliases to Q.P.t, what actually happens is that the signature of M is replaced by the strengthened signature of P.
So you can do the following:

module type S1 = sig module M : sig type t end end;;
module M = struct type t = int let v = 1 end;;
module M' = M;;
module type S1'' = S1 with module M = M';;

You can also see the strangeness (which is not new) by trying:

module type S1' = S1 with module M = M;;

Concerning functors, this is completely unrelated.
The reason we do not allow taking aliased of functor parameters is that is complicates compilation.
A module alias behaves as a direct reference, so it needs to have the same representation as its target.
However, in the case of functor parameters, functor application also does a coercion, so in practice this cannot be a direct reference, at least not outside of the functor.
Another solution could be to allow aliases of functor parameters, while expanding them when exiting the body, but I don't see much use for that.

@vicuna
Copy link
Author

vicuna commented Feb 7, 2014

Comment author: @lpw25

The reason we do not allow taking aliased of functor parameters is that is complicates compilation.
A module alias behaves as a direct reference, so it needs to have the same representation as its target.
However, in the case of functor parameters, functor application also does a coercion, so in practice this cannot be a direct reference, at least not outside of the functor.

I think that it actually is safe to allow module aliases of
functor parameters.

If we consider an example:

(* Some arbitrary module types where S can be coerced to T
which can be coerced to U *)
module type S = sig type t val a : t val b : t end
module type T = sig type t val a : t end
module type U = sig type t end

(* A functor expecting U whose result includes a (generative)
type *)
module G (Y : U) = struct type t = T end

(* A module of type S *)
module M : S = struct type t = T let a = T let b = T end

(* A functor expecting T whose result includes an alias and
some uses of that alias *)
module F (X : S) = struct
module I = X
type t = I.t
module J = G(I)
end

(* F applied to M *)
module N = F(M)

As you say, it is obviously safe to have I be a module alias
within the body of F. The question is over the safety of
strengthening this module alias when F is applied. In other
words, is it safe for N to have the type:

sig
module I = M
type t = I.t
module J : sig
type t = G(I).t = T
end
end

This means that whilst I was an alias to M : S within the
body it is an alias to M outside of it. However, I think that
this is in fact safe.

There are two places where I can appear in the output of F:
as a direct part of a path (e.g. I.t), or as the applicative
argument to a functor within a path (e.g. G(I).t).

When used directly in a path, this is clearly safe because the
current semantics of strengthening would add equalities for all
its type members anyway.

When used as the argument to a functor things are more subtle,
but it is still safe. The functor (in this case G) will expect
an argument with some type that M can be coerced to (in this
case U). The application within F is equivalent to G(M : S : U) whilst the application outside of F is equivalent to G(M : U). However, M : S : U is equal to M : U, so the
applications are in fact the same.

Note that, in this discussion, when I have written M : S I mean
the strengthened module obtained by restricting M to the
members of S and strengthening all generative types to have
aliases to their counterparts in M. This is not the same as
what you get when you write (M : S) in OCaml, which is why
G(M) and G((M : U)) have different module types in OCaml.

It is possible that I have missed something, but I think this
argument justifies the safety of allowing module aliases to
functor arguments.

@vicuna
Copy link
Author

vicuna commented Feb 7, 2014

Comment author: @lpw25

Well, actually you can add an alias to a signature.

Does this mean that making S with module X = Y introduce a type alias would be (mostly) backwards compatible? It certainly seems that the semantics you described closely match what you would get by introducing a module alias.

In general, I have always thought that the feature:

S with module X = Y

didn't really make sense. You couldn't write module X = Y inside S, so how could you possibly add it to S. The more sensible feature would be:

S with module X : T

since you are then adding a statement which could actually have been in S.

However, with the arrival of module aliases, this can now make sense: you could indeed have written module X = Y within S.

Indeed, the main criticism in Yaron's bug report was that the behaviour of with module did not match something which could be produced by a functor. But if module aliases to functor arguments were allowed then this criticism would disappear (ignoring for now things like S with module X = F(M)).

@vicuna
Copy link
Author

vicuna commented Mar 5, 2014

Comment author: @garrigue

It took me a long time, but I found an example showing that producing a malformed signature is dangerous in practice.
The conclusion is that if we want to remove aliases from signatures produced by "module type of" we need to check that the resulting signature is well-formed. I have no idea how much code this will require...

The example is below.
Note that running it in the toplevel requires a further patch, which allows using module aliases inside "pure recursive modules", i.e. recursive modules directly extracted from a signature without any implementation.
Without this patch, the signature must be put in a mli file, using the -trans-mod flag so that no implementation is required.
Note also that this is not strictly unsound, as all implementation types are kept, but it allows invariants being broken, for instance here the internal representation of sets can be corrupted.

(* Counter example *)
module Int = struct type t = int let compare = compare end
module SInt = Set.Make(Int)
type (,) eq = Eq : ('a,'a) eq
type wrap = W of (SInt.t, SInt.t) eq

module M = struct
module I = Int
type wrap' = wrap = W of (Set.Make(Int).t, Set.Make(I).t) eq
end

module type S = module type of M

module M' : S = M;;
module rec MR : S = MR;; (* not well defined )
MR.I.compare 1 2;; (
delayed failure *)

module Int2 = struct type t = int let compare x y = compare y x end

module type S' = sig
module I = Int2
include S with module I := I
end

module rec M2 : S' = M2;; (* should succeed! (but this is bad) *)

let M2.W eq = W Eq;;

let s = List.fold_right SInt.add [1;2;3] SInt.empty;;
module SInt2 = Set.Make(Int2);;
let conv : type a b. (a,b) eq -> a -> b = fun Eq x -> x;;
let s' : SInt2.t = conv eq s;;
SInt2.elements s';;
SInt2.mem 2 s';; (* invariants are broken: 2 cannot be found *)

@vicuna
Copy link
Author

vicuna commented Mar 5, 2014

Comment author: @lpw25

The conclusion is that if we want to remove aliases from signatures produced by "module type of" we need to check that the resulting signature is well-formed. I have no idea how much code this will require...

If you're going to check that a signature is well-formed after an operation then you might as well allow destructive substitution to weaken signatures and then check that the result is well-formed. Then module type of could be fully strengthened.

@vicuna
Copy link
Author

vicuna commented Mar 6, 2014

Comment author: @garrigue

I added a new correct patch.
It checks whether an alias is used as argument in a path inside the signature, and does not remove it in that case.

@vicuna
Copy link
Author

vicuna commented Mar 6, 2014

Comment author: @garrigue

A few answers to Leo:

If you're going to check that a signature is well-formed after an operation then you might as well allow destructive substitution to weaken signatures and then check that the result is well-formed.

The approach in my patch is to ensure that the signature is well-formed, by adopting a conservative policy, but I do not check it after hand. This would be more complicated, as the code doing that is mixed with the translation from untyped syntax...

Does this mean that making S with module X = Y introduce a type alias would be (mostly) backwards compatible? It certainly seems that the semantics you described closely match what you would get by introducing a module alias.

I just showed that you can introduce an alias in a signature if you want to.
The signature obtained in such a way is certainly not equivalent to the weaker one, since it contains a stronger requirement, and Alain showed how this can break code.

I think that it actually is safe to allow module aliases of functor parameters.

We are clearly talking about different problems.
From a type theoretic point of view it is ok to allow module aliases of functor parameters.
However, this subtly change the semantics.
The current semantics of OCaml is that when a functor receives an argument, it is coerced to the parameter signature. If the functor returns this argument unchanged, it still returns the coerced version, with a smaller signature.
However, if you allow module aliases to propagate through functors, the argument would be returned with its original, unrestricted signature.
This is interesting, but incompatible.

let y = 3
module M = struct let x = 1 let y = 2 end
module F(X : sig val x : int end) = struct module N = X end
module FM = F(M)
open FM.N
let () = print_int y

With the current semantics, this would print 3.
If we allow propagation of module aliases (in its current form), this would print 2.

Note that we could also develop a conservative semantics, which would keep both the alias and the restricted signature. I tried, but this is a lot of work, as the two approaches have to be handled in parallel.

@vicuna
Copy link
Author

vicuna commented Mar 6, 2014

Comment author: @lpw25

The approach in my patch is to ensure that the signature is well-formed, by adopting a conservative policy, but I do not check it after hand. This would be more complicated, as the code doing that is mixed with the translation from untyped syntax...

Couldn't you apply this conservative policy to destructive substitutions, so that:

module M : (module type of Foo.M)

include (module type of Foo) with module M := M

would allow people to remove the module aliases themselves if they needed to.

In general, I think it would be better for module type of to provide the most strengthened version of the signature possible, and then support a means to remove aliases when that is safe.

let y = 3
module M = struct let x = 1 let y = 2 end
module F(X : sig val x : int end) = struct module N = X end
module FM = F(M)
open FM.N
let () = print_int y

Is opening the only place where a larger signature can cause unexpected errors? If so, then perhaps we could provide a warning in such cases, on by default for the first few versions after the change.

@vicuna
Copy link
Author

vicuna commented Mar 6, 2014

Comment author: @alainfrisch

Is opening the only place where a larger signature can cause unexpected errors?

Another case with "include" where a larger signature can cause an error:

module M = struct type t end;;

module N = struct include M type t end
(* => "Multiple definition of type name t" *)

module N = struct include (M : sig end) type t end
(* ok *)

@vicuna
Copy link
Author

vicuna commented Mar 6, 2014

Comment author: @lpw25

Another case with "include" where a larger signature can cause an error:

Or even:

module X = struct let x = 3 end;;

module X : sig val x : int end

let x = 1;;

val x : int = 1

module Y = struct include X let y = x end;;

module Y : sig val x : int val y : int end

Y.y;;

  • : int = 3

module Y = struct include (X : sig end) let y = x end;;

module Y : sig val y : int end

Y.y;;

  • : int = 1

Although, I think these could also be detected and warned about.

@vicuna
Copy link
Author

vicuna commented Mar 10, 2014

Comment author: @garrigue

Fixed in trunk at revision 14451, applying remove_alias_correct.diff.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants