Mantis Bug Tracker

ID: 0006681
Project: OCaml
Category: ~DO NOT USE (was: OCaml general)
View Status: public
Date Submitted: 2014-11-28 17:50
Last Update: 2017-02-16 15:18
Reporter: drup
Assigned To: frisch
Priority: normal
Severity: minor
Reproducibility: have not tried
Status: closed
Resolution: fixed
Product Version: 4.02.1
Target Version: 4.03.0+dev / +beta1
Fixed in Version: 4.03.0+dev / +beta1
Summary: 0006681: Signatures as ppx payload
Description: Currently, the payload of an extension point is either
- a structure [%foo <struct> ] (this case also covers expressions)
- a type [%foo: <type> ]
- a pattern [%foo? <pat> ]

I would like to be able to have signatures as a payload too. Gabriel Scherer proposed the syntax [%foo sig <sig> end ].

The use case I have in mind is to create sections in a .mli, in particular in eliom.

[%%client sig
val foo : ...;
end ]

and the shorter version:
val%client foo : ....

Some people argued that a syntax [%%client start] is better, but it doesn't allow for the shortcut syntax, which I really would like to have.

As far as I can tell, this is the last piece of syntax that is not valid as a payload (you can manage to put most of the other syntaxes in, one way or another).
Tags: No tags attached.

- Relationships
related to 0006688 (closed, frisch): Accept value declarations as structure items

-  Notes
(0012623)
frisch (developer)
2014-12-03 10:25
edited on: 2014-12-03 10:36

Another direction would be to allow "val" items in structures at the level of the parser (and Parsetree). They would be rejected by the type-checker for now, although there are some interesting ways to use them in structures (as a nice way to specify the expected type for a function, or perhaps to support local forward declarations). Or do you have other kinds of signature items that are not also structure items that you'd like to use?

For now, you could also turn "val" into "external":

[%%client external foo : .... = ""]

(0012626)
drup (reporter)
2014-12-03 16:39
edited on: 2014-12-03 16:39

.eliomi are .mli where you can put client/shared/server annotations. I would like that anything valid in a .mli be valid in a .eliomi, annotated.

I'm pretty sure other people will have the same need (add %foo annotations on signature items).

I don't see adding val to structures as a valid solution for this problem.

(0012631)
lpw25 (developer)
2014-12-03 16:58

> I don't see adding val to structures as a valid solution for this problem.

But it does add support for your suggested short-cut version.
(0012632)
drup (reporter)
2014-12-03 17:08

Only because I gave a simple example. It wouldn't help as soon as we want other signature items:

[%client
module M : .....
]

(module%foo is not possible currently, unfortunately)
(0012634)
frisch (developer)
2014-12-03 17:58

If you don't use class-based features and with the addition of "val" to structures, you could mostly translate all signatures to structures:

[%%client module M : sig type t val x : .... end]

would be written:

[%%client module M = struct type t val x : .... end]

Alternatively, one could also consider adding "module M : S" to structures directly, but the case for it is less clear than for "val".
(0012638)
frisch (developer)
2014-12-03 18:41

> .eliomi are .mli where you can put client/shared/server annotations. I would like that anything valid in a .mli be valid in a .eliomi, annotated.

Did you consider using attributes instead of extension nodes:

  module M : ...
    [@@client]

  val foo : ..
    [@@server]

?
(0012640)
lpw25 (developer)
2014-12-03 18:54

> Only because I gave a simple example. It wouldn't help as soon we want other signature items:

Sorry, I was thinking that the proposal was for all signature items to be valid as structure items, but it is indeed only `val`s which are proposed.
(0012643)
drup (reporter)
2014-12-03 19:12

frisch: I could do that but:
1) It doesn't fit the semantics of attributes vs extensions.
2) It doesn't allow annotating several signature items at once, and I would like to allow both individual and grouped annotations (which is perfectly possible with extension nodes on structures).
3) The compiler will not reject the annotation if uninterpreted, see 1).

This seems to me a logical extension of how extension nodes behave on structures. It is the same behavior, but available on signatures (and without a workaround syntax whose main purpose will be to confuse the user).
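
To make point 3 concrete, here is a minimal sketch of the attribute-based style, assuming no ppx is installed. The names API, Impl, M, default and to_int are made up for illustration. The compiler does not know [@@client] or [@@server], so it silently ignores them and type-checks the signature as if they were absent:

```ocaml
(* [@@server] and [@@client] are hypothetical annotations: without a ppx
   that interprets them, the compiler just skips over them. *)
module type API = sig
  val foo : int -> int [@@server]

  module M : sig
    type t
    val default : t
    val to_int : t -> int
  end [@@client]
end

(* An ordinary implementation; the attributes played no role in typing. *)
module Impl : API = struct
  let foo x = x + 1

  module M = struct
    type t = int
    let default = 0
    let to_int x = x
  end
end

let () = Printf.printf "%d\n" (Impl.foo 41)
```

An extension node left uninterpreted in the same position is instead rejected at compile time, which is the behavior asked for here.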
(0012647)
frisch (developer)
2014-12-04 10:11

First, note that I'm not opposed in principle to adding signatures as a new valid form of attribute/extension payload. But the proposed syntax [%%client sig ... end] seems a bit heavy, and I'm trying to see if other solutions couldn't be preferred. Considering the large overlap between the syntax of structure and signature items, I've a natural tendency to try bringing the two syntactic classes closer to each other. Supporting both as payload will probably lead to cases where users write [%%client type t] instead of [%%client sig type t end]; and then you'll feel some pressure to support both forms in your ppx for the intersection of the two syntactic categories, leading to code duplication.

(For the same reason, I've also tried to merge expressions and patterns, but failed to do so; that just introduced too many conflicts in the grammar.)


> 1) It doesn't fit the semantic of attributes vs extensions.

That's a matter of interpretation. It seems to me that a signature item under a %client is interpreted in the normal way (when not filtered out) by the compiler. It has never been said that attributes don't change the (static or dynamic) meaning of a program: typically, attributes used to trigger deriving-like code generation add more declarations to the actual code, which of course affects its meaning. In your case, if I understand correctly, attributes would be used to drop items, but not really change their meaning.

Extension nodes are quite different: they are supposed to embed a "foreign" sub-language which happens to share the same concrete syntax as a fragment of OCaml, or to extend OCaml with "new" local features. Typical uses of extensions would be to define lexers/parsers, embed "data query sublanguages", non-standard forms of let-binding or pattern matching, etc.


But assuming you really don't want to use attributes, can we try to see what it would take to stick to structures as payload? I take your point about avoiding syntactic workaround for 'module X : S'. It's more intrusive than the proposed change for 'val', but it's not very difficult to support them as structure items as well. If we want to avoid extending the Parsetree, it's pretty easy to do by adding the following rule to module_binding_body in parser.mly:

  | COLON module_type
      {
        let abstr = ghmod(Pmod_extension (mkloc "MISSING" (symbol_rloc()) , PStr[])) in
        mkmod(Pmod_constraint(abstr, $2))
      }


or:

  | COLON module_type
      {
        let ext =
          Ast_mapper.extension_of_error
            (error ~loc:(symbol_rloc ())
               "Error: module declarations are not allowed in structures (a module expression must be provided)")
        in
        let abstr = ghmod(Pmod_extension ext) in
        mkmod(Pmod_constraint(abstr, $2))
      }


i.e. considering:

  module X : S

in structures to be equivalent to:

  module X : S = [%...]


(And same for classes.)

(A more ambitious approach would be to really merge structure and signature items both in the parser and in the Parsetree. This would reduce significantly code duplication in the parser and all syntactic boilerplate code (Ast_helper, Ast_mapper, printast), while adding not too much noise in the typechecker, and providing better error message when users use structure items in signatures or the opposite.)
(0012659)
drup (reporter)
2014-12-04 15:23

I agree that the syntax [%%foo sig ... end ] is not optimal.

My initial idea was to say "if the extension node is in a structure, then the payload is a structure, if it's in a signature, the payload is a signature" but Gabriel pointed out several issues with that, and I tend to agree that it's a bad solution too.

Your proposal for module signatures in the parsetree means that a ppx that wants to output a signature for it will have to transform the structure. It seems a bit ad hoc and unnecessary to me. I suppose it would probably be ok if the transformation is provided in compiler-libs.

I actually like the "more ambitious" approach. Would the core team be interested in such a change?
(0012661)
frisch (developer)
2014-12-04 15:49

> I actually like the "more ambitious" approach. Would the core team be interested by such a change ?

Warning: some heavy discussions to be expected!

Technically, it seems the most difficult part of merging signatures and structures at the syntactic level comes from the "include" statement. It would force to merge module_expr and module_type as well, although they are quite different.
(0012662)
lpw25 (developer)
2014-12-04 15:57

> Technically, it seems the most difficult part of merging signatures and structures at the syntactic level comes from the "include" statement. It would force to merge module_expr and module_type as well, although they are quite different.

How about a slightly less ambitious approach where structures and signatures share most of their items, but not everything? This could also include merging the syntactic descriptions of:

  let open ... in
  let module ... in
  (* and in the future
       let type ... in
       let exception ... in
       let class ... in
     etc. *)

with the structure and signature versions.
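
For readers less familiar with these local forms, a small sketch of how they look in expressions. The function names count_distinct and first_negative are made up for illustration, and the second function assumes OCaml 4.04 or later, where `let exception ... in` became available:

```ocaml
(* Local module and local open inside an expression. *)
let count_distinct words =
  let module S = Set.Make (String) in
  let open List in
  (length words, S.cardinal (S.of_list words))

(* Local exception, used here for an early exit from List.iter. *)
let first_negative xs =
  let exception Found of int in
  try
    List.iter (fun x -> if x < 0 then raise (Found x)) xs;
    None
  with Found x -> Some x
```

Each of these mirrors an item that can also appear at the top of a structure (`open`, `module`, `exception`), which is why sharing their syntactic description is plausible.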
(0012671)
frisch (developer)
2014-12-05 18:10

> How about a slightly less ambitious approach where structures and signatures share most of their items, but not everything?

I've given it a try in the merge_sig_str SVN branch: merge the definitions of signature/structure items in the Parsetree definition (which already allows sharing some boilerplate code), and be more liberal in the parser (which also removes some duplication). Actually, the only remaining differences in the parser between structures and signatures are:

  - top-level expressions are only allowed in structures
  - include is parsed differently in structures and in signatures

and the rest is treated by the type-checker (i.e. reject signature-only items in struct and structure-only items in signatures, with special care to reject exception/extension constructor rebinding in signatures and non-primitive value declarations in structure). Module definitions (module X = ME) and declarations (module X : MT) are represented differently (resp. with the Pstr_module and Psig_module constructors, which now belong to the same type), contrary to a previously mentioned hack. Module aliases in signatures (module X = Y) are represented as module definitions and interpreted as declaration when type-checking signatures (one should probably go one step further and remove the Pmty_alias constructor altogether, treating module X = Y directly in the type-checker).
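
As a rough illustration of the division of labour described above, here is a toy model, not the code from the branch. The types item and context and the function check are hypothetical and far simpler than the real Parsetree: a single item type is shared by both contexts, and a later check rejects items that do not belong.

```ocaml
(* One shared item type for structures and signatures (toy version). *)
type item =
  | Let_binding of string   (* let x = ...               *)
  | Toplevel_expr           (* a bare expression         *)
  | Val_decl of string      (* val x : ...               *)
  | External of string      (* external x : ... = "prim" *)
  | Type_decl of string     (* type t ...                *)

type context = In_structure | In_signature

(* Plays the role of the type-checker pass that filters items by context.
   In the branch, top-level expressions are already excluded from
   signatures by the parser; the toy model folds that in here. *)
let check (ctx : context) (item : item) : (unit, string) result =
  match ctx, item with
  | _, (Type_decl _ | External _) -> Ok ()
  | In_structure, (Let_binding _ | Toplevel_expr) -> Ok ()
  | In_signature, Val_decl _ -> Ok ()
  | In_structure, Val_decl x ->
    Error ("value declaration " ^ x ^ " is not allowed in a structure")
  | In_signature, Let_binding x ->
    Error ("value definition " ^ x ^ " is not allowed in a signature")
  | In_signature, Toplevel_expr ->
    Error "expressions are not allowed in a signature"
```

The point of the design is that a ppx sees `val x : ...` as the same node whether it was written in a structure payload or in a signature, and the rejection happens only afterwards.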

I don't know why, but merging class definitions and class declarations introduced some conflicts (on SHARP) in the grammar. I'm confident that someone more fluent than me with yacc could fix that.


Waiting for some feedback before daring to officially propose that for inclusion...
(0012691)
drup (reporter)
2014-12-07 00:43

I tracked down the reduce/reduce conflict. It's a combination of ... interesting things.
1) When reading a file with #use, it's authorized not to put ";;" at the end of lines *and* to use directives starting by "#".
2) In type expressions "#foo" (and "<type> #foo") is authorized and means "[< foo]" (resp "[< <type> foo]"). It was declared deprecated 13 years ago in this commit: https://github.com/ocaml/ocaml/commit/42d1811#diff-283538db7c320a41442d987f0c880fc5R166

The following line, when read using #use, is thus ambiguous in LR(1):
"class t : foo #bar -> ...."

The following line is equally ambiguous
"type t = foo #bar"
and is resolved in favor of considering (foo #bar) as the whole type.

It's not obvious to me how to remove the ambiguity without making the grammar more complex. Removing the deprecated feature solves the problem, though, but I don't know if it's acceptable or not.
(0012692)
lpw25 (developer)
2014-12-07 01:44

#foo is not just for variants (which is deprecated), but also for objects (which is very much not deprecated).
(0012693)
lpw25 (developer)
2014-12-07 01:51

The conflict would probably be best avoided by applying the same rules for directives in files as those used for toplevel expressions (i.e. they often must be preceded by ;;).

This is not backwards compatible but might be acceptable.
(0012694)
drup (reporter)
2014-12-07 02:04

Ah, indeed, I didn't know this one, thanks! If I understand correctly, it transforms a class into the relevant open object type.
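
A small sketch of that object-type use of `#`, with made-up class names point and point3d and a made-up function get_x:

```ocaml
class point = object
  method x = 0
end

class point3d = object
  inherit point
  method z = 1
end

(* #point stands for the open object type < x : int; .. >, so get_x
   accepts any object that has at least the methods of point. *)
let get_x (p : #point) = p#x

let () =
  assert (get_x (new point) = 0);
  assert (get_x (new point3d) = 0)
```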
(0012698)
frisch (developer)
2014-12-07 11:36

Do you guys understand why the conflict does not appear in the current OCaml grammar, but shows up after the merger of class definitions/declarations rules?
(0012699)
lpw25 (developer)
2014-12-07 12:40

The conflict is because when parsing a class type, the following two forms are allowed:

    class_type

    type -> class_type

which means we might be parsing a class type or we might be parsing a type. These two forms overlap in two ways:

   1. Either could be an identifier
   2. Either could be an extension node

Previously these two forms were disambiguated by what could follow them. Class types could only be followed by a signature item (`type`, `let`, `module` etc.), whereas types could not possibly be followed by a signature item.

With the merge of signature items and structure items, they could now both be followed by `#`. In the case of a class type this would be the start of a toplevel directive, in the case of a regular type this would be a `#foo` object type.

So it is now not possible (in LR(1)) to disambiguate:

    class foo : bar
    #directive

and

    class foo : bar #directive -> baz

at the point where we have shifted bar and have `#` as our lookahead token.
(0012700)
lpw25 (developer)
2014-12-07 12:45

> The conflict would probably be best avoided by applying the same rules for directives in files as those used for toplevel expressions (i.e. often must be preceeded by ;;).

I think this would be the best solution. Top-level directives are rarely mixed with normal definitions, and I conjecture that when they are, they are placed at the top of the file rather than mixed in amongst the normal definitions.

Perhaps a scan over OPAM and a survey on the caml-list could verify this conjecture?
(0012701)
drup (reporter)
2014-12-07 15:14

To add to what Leo said: having # as a lookahead token in this position is only possible with the parsing rules for #use, which could previously parse only structure items, not signature items.
(0012709)
frisch (developer)
2014-12-08 09:50

I just realized that "class type" definitions can only refer to class_signatures, not arbitrary class_types, which prevents the conflict on:

class type foo = bar #directive -> baz

Do you know if there is a deep reason to reject such definitions? If not, it would give an extra argument to drop the conflict on such items, which would also benefit "class" declarations.
(0012721)
lpw25 (developer)
2014-12-08 13:33

> Do you know if there is a deep reason to reject such definitions?

I don't know how deep the reason is, but it matches the semantics of class definitions.

If I define a class `foo`:

    class foo x = object method m = x + 1 end

then I can use `foo` as a class type:

    class bar : foo = object method m = 3 method private n = 4 end

This doesn't force `bar` to have the same full class type as `foo`, only to have the same class signature.

I think that supporting class types like:

    class type baz = int -> object method m : int end

would require us to change the above behaviour for consistency (which is obviously not backwards compatible).
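
Put together as one snippet, the two definitions above behave as described. This only restates the example; the calls at the end are added for illustration:

```ocaml
(* foo takes a parameter, so its full class type is int -> object ... end. *)
class foo x = object
  method m = x + 1
end

(* Used as a class type, foo only stands for the class signature
   object method m : int end: bar takes no parameter, and its private
   method n is hidden by the constraint. *)
class bar : foo = object
  method m = 3
  method private n = 4
end

let () =
  assert ((new foo 1)#m = 2);
  assert ((new bar)#m = 3)
```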
(0014582)
drup (reporter)
2015-10-19 15:12

Where are we on this one? Do we still have unresolved grammar issues?
(0014588)
frisch (developer)
2015-10-20 12:42

I don't think there has been any progress on this, so the conflict remains in the branch. Also, synchronizing with trunk would take some effort given the amount of changes/refactoring that went into the parser recently. More importantly, this hasn't really been discussed among maintainers (and the proposal is probably controversial).

If you're interested in pushing this forward, it might be worth creating a PR on Github, perhaps starting from my SVN branch, merging with trunk, and fixing the conflict.
(0014842)
drup (reporter)
2015-11-26 01:33

PR here: https://github.com/ocaml/ocaml/pull/312

frisch: The rebase was acrobatic, and I basically changed most of your code in the process, so review would be welcome.
(0015095)
frisch (developer)
2015-12-09 18:01

https://github.com/ocaml/ocaml/pull/326 has been merged.

- Issue History
Date Modified Username Field Change
2014-11-28 17:50 drup New Issue
2014-12-03 10:25 frisch Note Added: 0012623
2014-12-03 10:36 frisch Note Edited: 0012623
2014-12-03 10:51 frisch Relationship added related to 0006688
2014-12-03 16:39 drup Note Added: 0012626
2014-12-03 16:39 drup Note Edited: 0012626
2014-12-03 16:58 lpw25 Note Added: 0012631
2014-12-03 17:08 drup Note Added: 0012632
2014-12-03 17:58 frisch Note Added: 0012634
2014-12-03 18:41 frisch Note Added: 0012638
2014-12-03 18:54 lpw25 Note Added: 0012640
2014-12-03 19:12 drup Note Added: 0012643
2014-12-04 10:11 frisch Note Added: 0012647
2014-12-04 15:23 drup Note Added: 0012659
2014-12-04 15:49 frisch Note Added: 0012661
2014-12-04 15:57 lpw25 Note Added: 0012662
2014-12-05 18:10 frisch Note Added: 0012671
2014-12-07 00:43 drup Note Added: 0012691
2014-12-07 01:44 lpw25 Note Added: 0012692
2014-12-07 01:51 lpw25 Note Added: 0012693
2014-12-07 02:04 drup Note Added: 0012694
2014-12-07 11:36 frisch Note Added: 0012698
2014-12-07 12:40 lpw25 Note Added: 0012699
2014-12-07 12:45 lpw25 Note Added: 0012700
2014-12-07 15:14 drup Note Added: 0012701
2014-12-08 09:50 frisch Note Added: 0012709
2014-12-08 13:33 lpw25 Note Added: 0012721
2015-01-08 18:18 doligez Status new => acknowledged
2015-01-13 22:36 doligez Target Version => 4.02.3+dev
2015-07-10 17:48 doligez Target Version 4.02.3+dev => 4.03.0+dev / +beta1
2015-10-19 15:12 drup Note Added: 0014582
2015-10-20 12:42 frisch Note Added: 0014588
2015-11-26 01:33 drup Note Added: 0014842
2015-12-09 18:01 frisch Note Added: 0015095
2015-12-09 18:01 frisch Status acknowledged => resolved
2015-12-09 18:01 frisch Fixed in Version => 4.03.0+dev / +beta1
2015-12-09 18:01 frisch Resolution open => fixed
2015-12-09 18:01 frisch Assigned To => frisch
2017-02-16 15:18 xleroy Status resolved => closed
2017-02-23 16:36 doligez Category OCaml general => -OCaml general
2017-03-03 17:55 doligez Category -OCaml general => -(deprecated) general
2017-03-03 18:01 doligez Category -(deprecated) general => ~deprecated (was: OCaml general)
2017-03-06 17:04 doligez Category ~deprecated (was: OCaml general) => ~DO NOT USE (was: OCaml general)

