Add module _ = X syntax #6662

Closed
vicuna opened this issue Nov 18, 2014 · 36 comments

vicuna commented Nov 18, 2014

Original bug ID: 6662
Reporter: @whitequark
Status: acknowledged (set by @damiendoligez on 2014-12-24T15:46:14Z)
Resolution: open
Priority: normal
Severity: feature
Category: language features
Related to: #6362 #6821
Monitored by: @gasche @diml @ygrek @hcarty @Chris00

Bug description

This would be consistent with other bindings and useful e.g. for cstubs, where you often need to execute a functor only for side effects.

vicuna commented Nov 18, 2014

Comment author: @gasche

Also on the wishlist:

module X = M
and Y = N

which would be useful as a code preprocessing target (when you want to bind names to two module expressions given by the user, without the name of one risking shadowing free module variables in the other).

vicuna commented Nov 18, 2014

Comment author: @alainfrisch

you often need to execute a functor only for side effects.

This can be achieved with:

include (... : sig end)
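
A minimal sketch of this workaround, assuming a hypothetical functor `Register` whose body exists only for its side effect (the names `registered` and `Register` are illustrative and do not come from cstubs):

```ocaml
(* Hypothetical registry mutated by the functor body. *)
let registered : string list ref = ref []

(* Hypothetical functor used only for its side effect. *)
module Register (X : sig val name : string end) = struct
  let () = registered := X.name :: !registered
end

(* Evaluate the functor application. The coercion to [sig end] ensures
   that nothing is added to the enclosing structure. *)
include (Register (struct let name = "demo" end) : sig end)
```

After this `include`, `!registered` contains `"demo"` and no new module name is in scope.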

vicuna commented Nov 18, 2014

Comment author: @mshinwell

I'm not sure "module _ = X" makes sense in the context of -no-alias-deps, where the initializer for X would not be run. (-no-alias-deps is likely to become a prevalent mode of compilation, I suspect, since it gives arguably a more reasonable semantics for module aliases.)

vicuna commented Nov 18, 2014

Comment author: @lpw25

I'm not sure "module _ = X" makes sense in the context of -no-alias-deps, where the initializer for X would not be run.

This is true for regular paths, but they can't include side-effects anyway. If the path includes a functor application then it will always be executed.

vicuna commented Nov 18, 2014

Comment author: @lpw25

Since it is similar, I'll bring up a suggestion that I've made before: supporting () as the result type of a functor.

This would mean supporting:

module Foo (X : Bar) : () = struct ... end

and

module () = Foo(Baz)

which is useful for functors that are intended to be used entirely for their side-effects.

The idea (similar to generative functors) is that () is essentially the same as sig end, but it cannot be assigned to a named module, only to the result of a functor or to ().

vicuna commented Nov 18, 2014

Comment author: @gasche

I thought of the same today when seeing this request, but:

  1. what about paths that are simple module idents, do you propose to also accept
    module () = Uid
    and force linking even under -no-alias-deps?

  2. Should we allow this only when the signature is explicitly defined as () (which is not possible for a .mli, so that would rule out (1) above), or consider that other signatures can be subtyped into () ?

If, under -no-alias-deps, we don't have a way to force linking of compilation units (which are never functions) using the module language, we will have to use a way to do that in the term language: have the must-be-linked module expose a dummy init : unit -> unit function and use
let () = Uid.init()
instead of
module () = Uid
(... and fight for cross-module cmx optimization to never ever be allowed to remove a module from the must-be-linked list)

If the "good style" is to do this for compilation units whose linking we may want to force, it would seem natural to also follow it in functor bodies: export a (init : unit -> unit) function and do

let () = let module M = F(X) in M.init()

In short: if the proposed feature works only for non-alias paths, it's maybe not worth it.
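
The term-level style described in this comment can be sketched as follows. The functor `F`, the argument `X` and the `log` reference are hypothetical stand-ins chosen for illustration:

```ocaml
module type NAMED = sig val name : string end

(* Hypothetical log used to observe the side effect. *)
let log : string list ref = ref []

(* Hypothetical functor that exposes its side effect as an explicit [init]. *)
module F (X : NAMED) = struct
  let init () = log := ("init " ^ X.name) :: !log
end

module X = struct let name = "x" end

(* Term-level call: the local module name [M] does not leak into the
   enclosing structure. *)
let () = let module M = F (X) in M.init ()
```

The effect happens in `init` rather than in the functor body, so it is tied to an ordinary function call.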

vicuna commented Nov 18, 2014

Comment author: @whitequark

lpw25, without being able to subtype any module to (), this feature would be useless for my use case (cstubs).

vicuna commented Nov 18, 2014

Comment author: @lpw25

  2. Should we allow this only when the signature is explicitly defined as () (which is not possible for a .mli, so that would rule out (1) above), or consider that other signatures can be subtyped into () ?

As with generative functors, the point is to ensure that usage matches definition. So you should only be able to write:

module () = Foo(X)

if Foo is defined to return (). If you wanted to ignore the result of something you could use:

module () = Ignore(Foo(X))

where Ignore has type functor (_ : sig end) -> (). This is much the same as () and ignore in the core language.

Ignore can also be used to enforce the linking of a component:

module () = Ignore(Uid)

since it will coerce the module to sig end rather than alias it.

lpw25, without being able to subtype any module to (), this feature would be useless for my use case (cstubs).

Using Ignore with () would also work in your case, but I should have made clear that I think this would be in addition to supporting module _ = ... which should be supported for consistency with let.
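
A rough approximation of `Ignore` in the language as it stands. The `: ()` result type is only a proposal in this thread, so the sketch returns an ordinary empty structure and uses `include` where the proposal would write `module () = ...`. `Foo` and `runs` are hypothetical names:

```ocaml
let runs = ref 0

(* Hypothetical functor with a non-empty result and a side effect. *)
module Foo (X : sig val n : int end) = struct
  let () = runs := !runs + X.n
  let doubled = 2 * X.n
end

(* Approximation of the thread's [Ignore]: it accepts any module (every
   structure matches [sig end]) and returns an empty structure. *)
module Ignore = functor (_ : sig end) -> struct end

(* Runs the body of [Foo] and discards its result. *)
include Ignore (Foo (struct let n = 3 end))
```

This version does not enforce that the result is left unbound, which is the extra guarantee the `()` proposal aims for.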

vicuna commented Nov 18, 2014

Comment author: @gasche

Excellent!
Would it be possible to add Ignore in Pervasives for this purpose?

vicuna commented Nov 18, 2014

Comment author: @lpw25

Would it be possible to add Ignore in Pervasives for this purpose?

Seems reasonable to me.

vicuna commented Nov 18, 2014

Comment author: @yallop

I'm in favour of this proposal. It's an example of the general principle that you shouldn't be obliged to name something when you're not going to actually use the name.

vicuna commented Nov 19, 2014

Comment author: @alainfrisch

It's an example of the general principle that you shouldn't be obliged to name something when you're not going to actually use the name.

It feels strange to force using a binding construct to explicitly avoid giving a name to something.

Moreover, this is already possible without using the binding construct:

include (... : sig end)

or:

include Ignore(...)

vicuna commented Nov 19, 2014

Comment author: @yallop

I don't really see it as forcing the use of a binding construct: nothing's stopping you from using the 'include' forms, after all. If you're modifying code so that it no longer uses the module name (or starts using it) then it seems quite natural to move between 'module M = F(X)' and 'module _ = F(X)'.

vicuna commented Nov 19, 2014

Comment author: @gasche

(It's also strange to use the include construct to purposefully include nothing.)

vicuna commented Nov 19, 2014

Comment author: @alainfrisch

The case of a functor only used for its side-effect is quite rare (and thanks to first-class modules, it is also possible to use a regular function in that case), and there is already a way to support that (actually, such a functor would likely return an empty signature already, thus supporting "include F(X)" directly).

The proposed feature is only a convenience and I'm not sure that in this specific case, it's worth extending the language to provide this convenience, considering how light the existing alternative is.

At least, if one decides to support it, it could be done purely as syntactic sugar (mapping "module _ = ..." to "include (... : sig end)") to avoid changing the Parsetree.

If you're modifying code so that it no longer uses the module name (or starts using it) then it seems quite natural to move between 'module M = F(X)' and 'module _ = F(X)'.

How often does this happen? I can imagine frameworks that make heavy use of functors only for their side-effects, but then you know from the start that the result will always be ignored (and the resulting signature is empty). Are there indeed cases of functors with non-empty output signatures that can be used for their side-effects only in some cases, but not always?

vicuna commented Nov 19, 2014

Comment author: @whitequark

Alain, indeed there is at least one: cstubs. It has a signature which includes several callable functions in "run" mode, and it invokes side effects that generate stub C code in "binding generation" mode.

vicuna commented Mar 10, 2015

Comment author: @lpw25

Note that the include (... : sig end) trick does not work for local modules, which gives another reason for adding this feature.

vicuna commented Mar 10, 2015

Comment author: @garrigue

What's wrong with writing
let () = let module M = ... in ()

vicuna commented Mar 11, 2015

Comment author: @alainfrisch

If you want to avoid introducing a dummy name:

ignore (module ... : EMPTY);

or:

let _ = (module ... : EMPTY) in

(with "module type EMPTY = sig end" defined somewhere)
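
Both local-context workarounds from the last two comments can be sketched side by side. `Tick` and `count` are hypothetical names used only to make the side effect observable:

```ocaml
module type EMPTY = sig end

let count = ref 0

(* Hypothetical functor whose body performs a side effect. *)
module Tick (X : sig val step : int end) = struct
  let () = count := !count + X.step
end

let run () =
  (* Packing the module evaluates the functor application; [ignore]
     then discards the first-class module. *)
  ignore (module Tick (struct let step = 2 end) : EMPTY);
  (* Same effect with a local module bound to a throwaway name. *)
  let module M = Tick (struct let step = 5 end) in
  ()
```

The first form avoids inventing a module name at the cost of declaring `EMPTY`; the second needs no extra declaration but introduces the unused name `M`.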

vicuna commented Mar 11, 2015

Comment author: @yallop

Here's another argument in favour of this proposal. Since 4.02 you can write '_' in functor bindings

functor (_: S) -> ...

so it's natural to have the same binding syntax in other module binding contexts.

It's good that there are workarounds, but I don't see any drawbacks to simply making the language a little more uniform. Alain, is it the need to change Parsetree that you're concerned about (rather than the change to the language)?
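
For reference, a small illustration of the anonymous functor parameter mentioned here. `S`, `Const` and `C` are made-up names:

```ocaml
module type S = sig val x : int end

(* The parameter is never used in the body, so it is left unnamed. *)
module Const = functor (_ : S) -> struct let value = 42 end

(* The argument must still match [S], even though it is ignored. *)
module C = Const (struct let x = 0 end)
```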

vicuna commented Mar 11, 2015

Comment author: @alainfrisch

Alain, is it the need to change Parsetree that you're concerned about (rather than the change to the language)?

Yes, indeed (since I consider the feature to be quite rarely useful). I can think of several ways to support the feature:

  • Keep the Parsetree definition, and use "_" as the bound identifier name, with some special support in the type-checker.

  • Change "string" to "string option" in Parsetree.module_binding and Parsetree.Pexp_letmodule.

  • Add a new form of structure item "Pstr_ignore of include_declaration" (and something else for the local version?).

  • Treat it purely at the parser level, and map it to "include (... : sig end)" (for structure item, not clear what to do for the local version).

In addition to these technical considerations, I also have a general dislike for "_ bindings": I find "let _ = ..." very ugly. Either the thing to be ignored has type unit, and it's better to write "let () = ..." or simply use sequencing; or it doesn't, and I prefer to be explicit about it and use the ignore function.
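
To make the second option in the list above concrete, here is a toy model of what turning the bound name into an option could look like. These types are deliberately simplified stand-ins and are not the real compiler-libs `Parsetree` definitions:

```ocaml
(* Simplified stand-in for module expressions, NOT the real Parsetree. *)
type module_expr =
  | Ident of string
  | Apply of module_expr * module_expr

(* The bound name becomes optional; [None] models "module _ = ...". *)
type module_binding = { name : string option; expr : module_expr }

let rec show_expr = function
  | Ident s -> s
  | Apply (f, x) -> show_expr f ^ "(" ^ show_expr x ^ ")"

(* Print a binding back in concrete syntax. *)
let show_binding { name; expr } =
  let bound = match name with Some n -> n | None -> "_" in
  "module " ^ bound ^ " = " ^ show_expr expr

let anonymous = { name = None; expr = Apply (Ident "F", Ident "X") }
```

Every consumer that previously matched on a plain string has to handle the `None` case, which is the breakage cost discussed in this comment.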

vicuna commented Mar 11, 2015

Comment author: @lpw25

  • Change "string" to "string option" in Parsetree.module_binding and Parsetree.Pexp_letmodule.

This would be my personal preference. Using _ as an identifier is a bit hacky, treating it at the parser level increases the distance between the AST and the concrete syntax, and "Pstr_ignore" is inconsistent with how let _ is handled.

vicuna commented Mar 11, 2015

Comment author: @gasche

On the other hand, Alain's taste would seem to indicate that (module () = ..) is equally or more important than (module _ = ..), and just using an option would not cover that case.

vicuna commented Mar 11, 2015

Comment author: @lpw25

On the other hand, Alain's taste would seem to indicate that (module () = ..) is equally or more important than (module _ = ..), and just using an option would not cover that case.

Good point. I would very much like to add the module () = feature, although unlike module _ = it requires some support in the type system.

vicuna commented Mar 11, 2015

Comment author: @alainfrisch

If we start finding other useful forms of "module binding patterns" in addition to plain identifiers, I'd indeed be more inclined to "break" the Parsetree (which requires changing a lot of client code -- we don't provide any kind of backward compatibility guarantee, but it's better when breakages are justified by something with a clear benefit). With "_" and "()", the case is already stronger, and I'd rather avoid turning "string" into "string option" now and breaking it again later if/when "()" is introduced.

vicuna commented Mar 11, 2015

Comment author: @gasche

It is my understanding that 4.03 will break other things syntax-wise in any case; or did we manage to avoid that yet? (I'm personally interested in pushing the syntax changes #6662, #6800 and #6806, but I may wait until I get a clearer picture of the menhir-for-ocaml work.)

I would be in favor of having at least (module () = ...) in the next release that does open the Pandora's box of AST changes, whichever it is.

(Also, because I don't think we can stabilize the AST format, it might be good to get some actual experience of AST breakage for ppx users to understand the typical pain points and required changes, before trying to design/provide compatibility libraries to alleviate this issue.)

vicuna commented Mar 11, 2015

Comment author: @yallop

Yes, 4.03 changes the parsetree:

1584803

vicuna commented Mar 11, 2015

Comment author: @gasche

(Of course I don't mean that breaking the parsetree once means that "anything goes" until the next release, but rather that we have to think in terms of a batch of small changes rather than each change as a problematic breakage in isolation.)

vicuna commented Mar 12, 2015

Comment author: @garrigue

But we still fall back on the problem that none of these are really useful.
Arguably "functor (_ : S) -> ..." is useful since there are no workarounds, but all the other cases presented here have easy workarounds (either through first class modules or include).
Or do you want to go all the way to "module patterns" ?

vicuna commented Mar 12, 2015

Comment author: @lpw25

But we still fall back on the problem that none of these are really useful.

module _ = ... is useful because it enables programmers to express their intent more clearly. Without it, people do not fall back to the cumbersome workaround of include (.... : sig end); they use module UnusedName = ..., which does not make clear that the resulting module is not intended to be used.

functor F (X : S) : () = ... and module () = ... genuinely add expressivity, because they allow you to ensure that the intention of the functor writer and the functor user agree.

Adding module _ = ... would also make it more reasonable to add a warning for unused module definitions, since it gives an easy way to dismiss the warning.

vicuna commented Mar 12, 2015

Comment author: @garrigue

Is it not enough for the module to have an empty signature?
Also, functor F (X : S) : () = ... seems strictly equivalent to
functor F (X : S) : sig end = ..., so I do not see the point.
Or do you mean that () would actually not be subject to subtyping?
This would be rather confusing.

I find this discussion a bit strange, because this seems to assume that using functors only for their side effects is a common practice. I know that some people do that, but do we really want to promote that? Note that we do not prevent it in any way, since all this is already doable, without name space leaks.

vicuna commented Mar 12, 2015

Comment author: @lpw25

Is it not enough for the module to have an empty signature?
Also, functor F (X : S) : () = ... seems strictly equivalent to
functor F (X : S) : sig end = ..., so I do not see the point.
Or do you mean that () would actually not be subject to subtyping?
This would be rather confusing.

It is different in the same way that:

functor F (X : sig end)

is different from

functor F ()

The idea is as follows:

  • () can only be used as a module type in the return type of a functor.

  • Any structure can be subtyped to () as part of returning from a functor.

  • A module of type () cannot be bound to a name (or included within a
    structure). It can only be used with module () = ....

The point is to ensure that the intention of the functor's author and the intention of the functor's user are the same. If we have:

functor F (X : S) : () = ...

and

module () = F(M)

this clearly indicates that both the author and user of F intend that it is only used for its side-effects. If the author of F changes his mind and alters the definition to:

functor F(X : S) = ...

because F is no longer intended for side-effects, then the user of F will get an error message to notify them about the change in interface.

This is similar to the benefits of writing:

let () = ....

instead of

let _ = ....
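
The core-language analogy can be illustrated with a hypothetical API whose return type changed from unit to bool. `save_v2` is an invented name, and the "version 1" signature exists only in the comments:

```ocaml
(* Version 1 of this hypothetical function returned unit; version 2
   reports success as a bool. *)
let save_v2 (path : string) : bool = String.length path > 0

(* Still compiles after the change and silently drops the result. *)
let _ = save_v2 ""

(* [let () = save_v2 ""] would now be rejected by the type checker,
   which is how the caller learns about the interface change. The
   caller then has to handle the result explicitly: *)
let () = if not (save_v2 "out.txt") then failwith "save failed"
```

The proposed `module () = ...` form is meant to give the same kind of early error at the module level.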

vicuna commented Dec 28, 2016

Comment author: @yallop

This feature would also offer a convenient way of checking that a module satisfies a signature without introducing a binding into the environment. For example, when defining a module

module Num =
struct
...
end

it's sometimes useful to be able to check at the point of definition that the module satisfies one or more interfaces:

module _ : ORDERED = Num
module _ : PRINTABLE = Num
(* etc. *)
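
A filled-in version of this example, assuming a compiler that accepts `module _` (to our knowledge OCaml 4.10 and later; the thread itself predates the feature). The contents of `ORDERED`, `PRINTABLE` and `Num` are invented for illustration:

```ocaml
module type ORDERED = sig
  type t
  val compare : t -> t -> int
end

module type PRINTABLE = sig
  type t
  val to_string : t -> string
end

module Num = struct
  type t = int
  let compare = Int.compare
  let to_string = string_of_int
end

(* Compile-time checks only: no new names enter the environment, and
   [Num] keeps its full, unsealed signature. *)
module _ : ORDERED = Num
module _ : PRINTABLE = Num
```

If `Num` stopped providing `compare` or `to_string`, the corresponding line would fail to type-check at the point of definition.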

vicuna commented Dec 28, 2016

Comment author: @gasche

I'm not sure why we are arguing about this. Both (module _ = ...) and (module () = ...) are natural, consistent syntax with several proposed use-cases, and they seem fairly easy to implement.

(There is a cost to changing the parsetree definition, but hopefully by now a lot of the parsetree consumers have found a compatibility-story they liked?)

vicuna commented Jan 3, 2017

Comment author: @alainfrisch

(There is a cost to changing the parsetree definition, but hopefully by now a lot of the parsetree consumers have found a compatibility-story they liked?)

I don't think so.

nojb commented Feb 21, 2020

I believe this has been fixed by #8908
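
For readers arriving here later, a sketch of the requested syntax in use. It assumes a compiler that includes the change referenced above (to our knowledge OCaml 4.10 and later; that version number is an assumption, not something stated in this thread). `Gen` and `effects` are hypothetical names:

```ocaml
let effects = ref 0

(* Hypothetical functor with a non-empty result and a side effect. *)
module Gen (X : sig val n : int end) = struct
  let () = effects := !effects + X.n
  let value = X.n
end

(* Top-level form: the functor body runs, no module name is bound. *)
module _ = Gen (struct let n = 1 end)

(* Local form, the case the [include] trick did not cover. *)
let run () =
  let module _ = Gen (struct let n = 10 end) in
  ()
```

Only `module _ = ...` is shown; the `module () = ...` proposal discussed above is a separate idea and is not assumed to exist.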

@nojb nojb closed this as completed Feb 21, 2020