Mantis Bug Tracker

View Issue Details Jump to Notes ] Issue History ] Print ]
IDProjectCategoryView StatusDate SubmittedLast Update
0005584OCamlOCaml generalpublic2012-04-11 23:092014-05-05 18:10
Reporterlpw25 
Assigned Togarrigue 
PrioritynormalSeverityfeatureReproducibilityN/A
StatusresolvedResolutionfixed 
PlatformOSOS Version
Product Version 
Target VersionFixed in Version4.02.0+dev 
Summary0005584: Open Extensible Types
Description
OCaml already has one open extensible type (exn), it would be useful to extend this and allow abstract types to be declared open.

This could be particularly useful when combined with GADTs.

Example syntax:

   open type foo

   extend foo with Constr of int

   open type 'a bar

   extend _ bar with Gadt1: int bar

   extend char bar with Gadt2


I've attached a patch (tested with revision 12270) and more details can be found at:

https://sites.google.com/site/ocamlopen/ [^]

Note that this patch does not yet add these extensions to the camlp4 parser, and may break existing camlp4 code.
Tagspatch
Attached Files? file icon extension.patch-1.0.2 [^] (289,214 bytes) 2012-04-11 23:09
? file icon extension.patch-2.0.0 [^] (203,216 bytes) 2012-05-18 13:11
? file icon extension.patch-3.0.0 [^] (305,465 bytes) 2014-02-27 17:54
patch file icon shorter-transl-extension-constructor.patch [^] (3,811 bytes) 2014-03-19 12:20 [Show Content]
patch file icon merge-exceptions.patch [^] (131,788 bytes) 2014-04-11 15:33 [Show Content]
? file icon extension.patch-4.0.0 [^] (364,813 bytes) 2014-04-27 14:31
? file icon extension.patch-4.0.1 [^] (364,187 bytes) 2014-04-30 01:39
? file icon extension.patch-4.0.2 [^] (365,758 bytes) 2014-05-02 21:48

- Relationships
related to 0006219resolvedyallop Existential types for exceptions 

-  Notes
(0007346)
frisch (developer)
2012-04-12 20:04
edited on: 2012-04-12 20:06

I like the proposal very much, but I'm not so fond of the proposed syntax.

1. "open type foo" seems to indicate that one wants to open the type foo. For instance, if foo had been declared as a record type (e.g. in a different module, or with labels that have been shadowed by another declaration), this could mean that one wants to bring its labels in scope. Here, the intention is to declare a new datatype, so the syntax should start with "type foo" like for any other type declaration. For instance: "type foo = open" or "type foo = ..".

2. "extend foo with Constr of int" introduces a new keyword, which is rather common and is thus likely to break existing code. From my experience, there is little chance this can be accepted in OCaml. Another proposal: "type foo += Constr of int". This also has the advantage that one can add several cases:

  type foo += Foo of int | Bar of string

3. "type foo = A | .." could be a short-hand for "type foo = .. ;; type foo += A".

(0007350)
lpw25 (developer)
2012-04-13 13:40

I agree that the proposed syntax is not very good. Regarding the above suggestions:

1.
For type declaratons, I think that the "type 'a foo = .." syntax would be fine. Another alternative would be "type > 'a foo", borrowing the ">" from polymorphic variants.

Whichever syntax is used for type declarations, I think that it should definitely allow type abbreviations to be checked for their openness. For instance:

type foo = bar

should create a type abbreviation that is open if bar is open, but

type > foo = bar

should check that bar is open and create a type abbreviation.

2 and 3.
I like the "type foo +=" syntax, it is certainly preferable to a new keyword. The only downside is that it provides no obvious syntax for rebinding extensions, but I don't think that this is a particularly useful feature anyway.

I'm not sure about allowing multiple extensions to be declared at once, or allowing extensions to be declared within the type declaration, because:

type foo = A | B | ..

will be compatible with

type foo = B | A | ..

which could cause confusion.

I also think that if it is too easy to create an open type and extensions that are equivalent to an ordinary variant type, then people will create them unnecessarily. Essentially, the additional runtime cost of using extensions compared to ordinary variants should be reflected by the syntax in some way.
(0007351)
frisch (developer)
2012-04-13 13:48
edited on: 2012-04-13 13:53

1. I strongly prefer "type 'a foo = .." to "type > 'a foo". We are declaring a type, which happens to be of a new kind (previously, we had only "abstract", "variant", "record"; now there is "open"). Now similarly to other kinds of type declarations with a manifest type, we can write both:

  type bar = foo
  type bar = foo = ..

2. For rebinding, what about: "type foo += M.X = X of int * int"?

I agree with your remark about defining constructors directly within the type declaration. But I don't see the problem with multiple extensions:

  type foo = ..
  type foo += A of int | B of string

(0007353)
lpw25 (developer)
2012-04-13 16:33
edited on: 2012-04-13 16:34

For the rebinding it would be more like: "type foo += X = M.Y", which is probably fine.

I think that having all of

"type foo += A of int | B of string"

"type foo += B of string | A of int"

"type foo += A of int
 ...
 type foo += B of string"

as equivalent definitions may cause confusion, because there aren't really any other features in OCaml that allow similar duplication.

Such syntactic sugar might be better added using camlp4.

(0007354)
garrigue (manager)
2012-04-13 17:16

I have not yet tried the patch, but the idea seems fine.
Concerning the syntax, I would vote for

  type open foo = ...
  type foo += ...

Since open is already a keyword there is no problem.
By the way I believe that private should have been the same way, but wasn't for historical reasons.
Also, private was originally intended to apply to individual constructors, so it would make sense to use it with that meaning.
Last, I believe that you should allow

  type foo += A of int | B of bool

Since the syntax follows the one for variants, there is no reason have people write more than necessary.

For me the advantage of extensible types is double:
* allow extension of an already defined type (but this was already possible with exn)
* allow GADT constructors (which is difficult with polymorphic variants)

The real goal of polymorphic variants was to allow reusing the same constructor in different types, with fine grain pattern-matching, and this is orthogonal to this.
(0007443)
lpw25 (developer)
2012-05-18 13:28

I have attached a new version of the patch (tested with revision 12464).

The new version uses the syntax:

  type 'a foo = ..

to define open types. To match this the behaviour has been changed so that open types are a new kind of type, rather than a property of types.

The syntax for adding extensions has also been changed to:

  type 'a foo +=
      A of 'a
    | B: int foo

To match this syntax, these have been changed to behave more like extensions to a type's definition, rather than declarations of constructors. Intuitively, they behave like:

  type 'a foo_new = 'a foo + A of 'a | B: int foo_new

and can be used wherever the definition:

  type 'a foo_new = 'a foo = ..

would be allowed.

In the above definition, although A and B are defined in a single extension, they are separate members of a module signature so that the following definitions are equivalent to the one above:

  type 'a foo += A of 'a

  type 'a foo += B: int foo

Extensions can be made private using the expected syntax:

  type 'a foo += private C of int

They can also be rebound:

  type 'a foo += D = A

The patch also changes the implementation to use Alain's suggestion of using an object tag instead of needing a new tag for extensions.

I have added the extension to Ocamldoc and the other tools, but I did not add the new syntax to either camlp4 or the emacs mode, as I thought that was better left to someone who knows more about them.
(0007534)
hcarty (reporter)
2012-06-09 20:40

Is this patch still being considered for inclusion? I tried applying patch 2.0.0 against the trunk revision 12594 and it does not apply cleanly.
(0007535)
lpw25 (developer)
2012-06-09 21:37

I don't know whether it's being considered for inclusion, but patch 2.0.0 was made for revision 12464 and should work with that one. I wasn't planning on rebasing the patch on a newer revision until OCaml version 4.0.0 is released.
(0007937)
hcarty (reporter)
2012-08-10 03:53

Will there be an update to this patch now that 4.00.0 is out?
(0008730)
lpw25 (developer)
2013-01-09 19:05

The latest version of this patch (currently about 6 months behind trunk) can be found at https://github.com/lpw25/ocaml-open/. [^]

There is also a version built against 4.00.1 at https://github.com/lpw25/ocaml-open/archive/4.00.1+open_types.tar.gz [^]
(0010980)
lpw25 (developer)
2014-02-27 18:01

I've finally brought my patch up to date. The attached patch (extension.patch-3.0.0) was tested with revision 14443. To make review easier I've also added a pull request to the Github mirror (https://github.com/ocaml/ocaml/pull/18 [^]).
(0011053)
garrigue (manager)
2014-03-19 02:38

This is a large patch, but actually a big part of it is about updating ocamldoc to handle extensible types...
The approach of defining them as a variant of exception types makes most of the changes smooth, so from the point of view of typing I expect no big problem.
This raises the question of why not just replace exceptions with extensible types (they are a special case).

I've created a branch open_types in subversion, merging extension.patch-3.0.0.
Bootstrap went fine, and all the tests go through.
I'll still need to check in detail what happens in Typedecl.
In particular variance inference is easy to get wrong, and I'm slightly disturbed by the size
of the transl_extension_constructor function (couldn't duplication be avoided, will it be easy to
maintain?)
Out of that my only gripe is that some lines are more than 100 characters (coding rule is 80 characters per line)...
(0011057)
lpw25 (developer)
2014-03-19 12:24

> In particular variance inference is easy to get wrong, and I'm slightly disturbed by the size
of the transl_extension_constructor function (couldn't duplication be avoided, will it be easy to
maintain?)

I've attached a patch (shorter-transl-extension-constructor.patch) which makes transl_extension_constructor shorter by factoring out the record creation in each branch of the match. It is still quite a long function, but most of that is the logic for rebinding extensions which can't really be avoided.

> Out of that my only gripe is that some lines are more than 100 characters (coding rule is 80 characters per line)...

Damn, I thought I caught all of those (excluding ocamldoc whose code does not obey the 80 column rule at all).
(0011093)
garrigue (manager)
2014-03-25 09:06

Sorry for the slow reaction.
I merged the patch. It makes the function a bit shorter, but this is mostly cosmetic.
I would have hoped more sharing with normal constructors, but maybe this is difficult.

Also, what about the reason to keep exceptions as an independent mechanism?
Differences in internal representations?
Well, it could certainly make bootstrapping difficult...
(0011097)
lpw25 (developer)
2014-03-25 11:12

> I would have hoped more sharing with normal constructors, but maybe this is difficult.

Possibly `make_cstr` could be extracted from `transl_declaration` and made a bit more general. Then it could be used by both regular constructors and type extensions.

> Also, what about the reason to keep exceptions as an independent mechanism?

There is a slight difference in representations of exception declarations and type extensions. The exception declarations contain a string with the name of the exception. This allows the name to be printed if the exception is not caught. Type extensions do not include such a string.

So currently `exception Foo` and `type exn += Foo` are not considered equivalent in a module type (and the second one is currently not allowed -- which ensures that all `exn` constructors can have their names printed).

Note that the constructors themselves are already represented the same (using `Cstr_extension`).
(0011098)
frisch (developer)
2014-03-25 14:06

Wouldn't it be useful to keep the constructor name, for debugging purposes? I don't know precisely how this would be exported to the user, maybe with a special primitive that can be instantiated only on extensible types. I.e. the user would write:

  type 'a foo = ..

  val name: 'a foo -> string = "%caml_constructor_name"

(The compiler would ensure that the primitive is only used at the proper types.)
(0011104)
lpw25 (developer)
2014-03-26 16:20

> Wouldn't it be useful to keep the constructor name, for debugging purposes? I don't know precisely how this would be exported to the user, maybe with a special primitive that can be instantiated only on extensible types.

The simplest way would probably be via a function in `Obj`. A similar function could be provided to obtain the unique id of the extension/exception.
(0011105)
frisch (developer)
2014-03-26 16:26

> The simplest way would probably be via a function in `Obj`.

Yes, but I assume its signature cannot be restricted to accept only values of extensible types, hence my proposal. Another approach would be to define a built-in supertype of all extensible types, and expose a single function working on that type:

  val name: extensible -> string

  ...


  name (x :> extensible)
(0011107)
lpw25 (developer)
2014-03-26 16:40

> Yes, but I assume its signature cannot be restricted to accept only values of extensible types

As the use cases for this feature are just debugging, I'm not sure that it is worth the additional complexity. A run-time exception for misusing the function should be sufficient (and achievable since I don't think an object tagged block whose first parameter is a string appears elsewhere).
(0011261)
lpw25 (developer)
2014-04-11 15:39

I've uploaded another patch (to be applied on top of the last one). This patch does 4 things:

- Extracts `make_cstr` from `transl_declaration` and makes it a bit more general. This allows it to be used for both regular constructors and type extensions.

- Fixes a bug (forgot to check that extension arguments had closed types).

- Makes exceptions a special case of type extensions. This is basically done by making the run-time representations identical and then replacing all `Sig_exceptions` with the equivalent `Sig_typext`.

- Adds `Obj.extension_name` and `Obj.extension_id` which can be used to print the name/id of an extension (or exception).

These commits have also been added to the pull request, where they can be viewed as separate diffs.
(0011262)
lpw25 (developer)
2014-04-11 15:42
edited on: 2014-04-14 12:08

Note that a side-effect of the most recent patch is support for exceptions with GADT syntax:

    exception Foo : 'a -> exn

which addresses 0006219.

(0011269)
frisch (developer)
2014-04-14 17:32

Jacques: I think you forgot to add ocamldoc/odoc_extension.ml to the branch.
(0011270)
frisch (developer)
2014-04-14 17:44

I'm testing the branch, not the latest patches, so maybe this doesn't apply:

# exception A;;
exception A
# A;;
- : exn = A
# type foo = ..;;
type foo = ..
# type foo += A;;
type foo += A
# A;;
- : foo = <extension>


It could be useful to have the toplevel uses the same printing heuristics for extensible types as for exceptions.
(0011271)
lpw25 (developer)
2014-04-14 18:01

> I'm testing the branch, not the latest patches, so maybe this doesn't apply:
> ...
> It could be useful to have the toplevel uses the same printing heuristics for extensible types as for exceptions.

That is indeed added by the latest patch, since it depends on storing the name in the "slot". It is slightly less aggressive than the exception printer (which kind of breaks abstraction when it cannot find the correct exception definition), but it will now print `- : foo = A` for your example.
(0011272)
frisch (developer)
2014-04-14 18:03

The patch represents type parameters with "core_type" instead of "string loc option" in the Parsetree. This was discussed in April 2013 on the wg-camlp4 mailing list ("Changes to the parsetree" thread). My position is still that it's a good change, but only if the parser accepts arbitrary core types in position of type parameters, and have the type-checker reject forms which are not allowed (the argument is that it reduces the "syntactic invariants" required for a parsetree to be printed and parsed backed). Moreover, it could certainly be useful to allow attributes on type parameters. As far as I remember, allowing arbitrary type expressions introduced some conflicts in the grammar, though.

How is this change related to the current proposal? If there is no real need for it, I'd propose to dissociate this point from the patch.
(0011273)
lpw25 (developer)
2014-04-14 18:17
edited on: 2014-04-14 18:19

> How is this change related to the current proposal?

I found it useful to be able to obtain the `Types.type_expr` corresponding to a type parameter directly from a `Typedtree.type_extension`. This meant changing the parameter from a `string loc` to a `core_type`. For consistency I also changed it in `Parsetree.type_extension`. Then for even more consistency, I changed it everywhere else.

Note that, unlike a `Typedtree.type_declaration`, a `Typedtree.type_extension` does not have a corresponding `Types.type_extension` from which to extract this information.

I think that it is also a more sensible representation. The parameter is represented with (restricted) type expression syntax and has a corresponding `type_expr` in the `Types.type_decl`, so it is both syntactically and semantically a type.

Also note that, in the presence of `constraint`s, the `type_expr` contains information that is not available in the `string loc`.

> Moreover, it could certainly be useful to allow attributes on type parameters

That should certainly be easy after this change.

(0011275)
frisch (developer)
2014-04-14 19:25

Do you think it would be possible to tweak the grammar to accept all kinds of type expressions, then? It would be good that the type-checker checks for accepted forms anyway, to prevent weird behavior with badly buggy ppx rewriters for instance.
(0011276)
lpw25 (developer)
2014-04-14 19:43

> Do you think it would be possible to tweak the grammar to accept all kinds of type expressions, then?

I'm not sure, I'm assuming that there is some kind of conflict. But it should certainly be possible to extend the grammar to allow things like attributes and extensions.

> It would be good that the type-checker checks for accepted forms anyway, to prevent weird behavior with badly buggy ppx rewriters for instance.

This is a good point, although it raises some much wider issues: what invariants does the type checker assume about its input from the parser and are they all checked? What error should be raised for a case that is only possible with a faulty ppx rewriter? Should there be a sanity check that encodes all the invariants and can be run before type-checking?
(0011280)
garrigue (manager)
2014-04-15 12:14

I've merged merge-exceptions in the open_types branch, and added forgotten files.
I should probably have been more careful about where to base the branch, and I'm getting some conflicts...
(0011281)
garrigue (manager)
2014-04-15 12:44

I do not think it is a good idea to accept arbitrary type expressions as formal parameters in a declaration.
If I understand correctly, Leo's idea was just to use core_type rather than string loc in the parse tree to get a more uniform representation, without changing the parser itself.
However, as Alain points, this creates a problem with ppx rewriters, as they may create badly formed declarations.
Overall, I do not think this kind of relaxing of the parse tree is a good idea, as it creates a distance between the type declaration and allowed input.

The question of attributes is of course orthogonal to that.
However, allowing attributes on parameters in the type declaration does not seem very nice syntactically...
(0011282)
lpw25 (developer)
2014-04-15 13:39

> If I understand correctly, Leo's idea was just to use core_type rather than string loc in the parse tree to get a more uniform representation, without changing the parser itself.

That's the idea.

> However, as Alain points, this creates a problem with ppx rewriters, as they may create badly formed declarations.
> Overall, I do not think this kind of relaxing of the parse tree is a good idea, as it creates a distance between the type declaration and allowed input.

Without searching properly for an actual example, I assume that there are probably a few places where particular parsetree does not correspond to a valid output of the parser. Do you think we should try and locate these assumptions and ensure that they are checked properly during type-checking? Perhaps we should have a "Illegal Parse Tree" error for such cases.
(0011283)
frisch (developer)
2014-04-15 13:58

> I assume that there are probably a few places where particular parsetree does not correspond to a valid output of the parser

"Structural" invariants are now better expressed in the Parsetree. For instance, there are two different nodes for Pexp_fun and Pexp_function, while previously a kind of node could express forms that don't have a valid syntax.

There are still some constraints such as Ptyp_poly and Pexp_poly which can only appear in specific contexts (documented in the parsetree.mli) -- I don't remember if the type-checker behaves correctly when the constraint is not satisfied.

There are also a lot of side conditions, such as many lists being non empty, identifier being well-formed, etc.


> Do you think we should try and locate these assumptions and ensure that they are checked properly during type-checking?

This seems a good goal, and a first step is to document missing constraints in parsetree.mli.
(0011318)
lpw25 (developer)
2014-04-23 18:57

Is there anything else I should be doing to help this get merged? I would really like to see it in 4.02.

> This seems a good goal, and a first step is to document missing constraints in parsetree.mli.

I think I might work on a patch to give "illegal parse tree" errors instead of assertion failures in cases where parse tree invariants are not met, but I would prefer to keep that separate form this extensible variants patch.
(0011319)
garrigue (manager)
2014-04-24 07:57

I think it's ready for merging, but I don't have much time to do it.
Can you first send a mail to caml-devel?
(0011335)
lpw25 (developer)
2014-04-30 01:40

Patch made up-to-date again (extension.patch-4.0.1). This is against current trunk (r14708).
(0011344)
garrigue (manager)
2014-05-05 01:14

Merged extension.patch-4.0.2 in trunk at revision 14737.
(0011345)
gasche (developer)
2014-05-05 07:08

Congratulations! That was a lot of work, and we now have a nice feature.
(0011346)
frisch (developer)
2014-05-05 09:43

Indeed, this is great!

Leo: can you propose a patch to the manual as well?
(0011348)
lpw25 (developer)
2014-05-05 12:25
edited on: 2014-05-05 12:25

> Leo: can you propose a patch to the manual as well?

Sure. I'll have a look at that sometime in the next few weeks.

(0011357)
ybarnoy (reporter)
2014-05-05 18:10

Awesome job! This is shaping up to be a terrific release.

- Issue History
Date Modified Username Field Change
2012-04-11 23:09 lpw25 New Issue
2012-04-11 23:09 lpw25 File Added: extension.patch-1.0.2
2012-04-12 20:04 frisch Note Added: 0007346
2012-04-12 20:06 frisch Note Edited: 0007346 View Revisions
2012-04-13 13:40 lpw25 Note Added: 0007350
2012-04-13 13:48 frisch Note Added: 0007351
2012-04-13 13:53 frisch Note Edited: 0007351 View Revisions
2012-04-13 16:33 lpw25 Note Added: 0007353
2012-04-13 16:34 lpw25 Note Edited: 0007353 View Revisions
2012-04-13 17:16 garrigue Note Added: 0007354
2012-05-02 16:50 doligez Assigned To => doligez
2012-05-02 16:50 doligez Status new => acknowledged
2012-05-18 13:11 lpw25 File Added: extension.patch-2.0.0
2012-05-18 13:28 lpw25 Note Added: 0007443
2012-06-09 20:40 hcarty Note Added: 0007534
2012-06-09 21:37 lpw25 Note Added: 0007535
2012-08-09 08:46 doligez Assigned To doligez =>
2012-08-10 03:53 hcarty Note Added: 0007937
2013-01-09 19:05 lpw25 Note Added: 0008730
2013-12-05 15:44 doligez Tag Attached: patch
2014-02-27 17:54 lpw25 File Added: extension.patch-3.0.0
2014-02-27 18:01 lpw25 Note Added: 0010980
2014-03-19 02:38 garrigue Note Added: 0011053
2014-03-19 12:20 lpw25 File Added: shorter-transl-extension-constructor.patch
2014-03-19 12:24 lpw25 Note Added: 0011057
2014-03-25 09:06 garrigue Note Added: 0011093
2014-03-25 11:12 lpw25 Note Added: 0011097
2014-03-25 14:06 frisch Note Added: 0011098
2014-03-26 16:20 lpw25 Note Added: 0011104
2014-03-26 16:26 frisch Note Added: 0011105
2014-03-26 16:40 lpw25 Note Added: 0011107
2014-04-02 14:46 yallop Relationship added related to 0006219
2014-04-11 15:33 lpw25 File Added: merge-exceptions.patch
2014-04-11 15:39 lpw25 Note Added: 0011261
2014-04-11 15:42 lpw25 Note Added: 0011262
2014-04-14 12:08 lpw25 Note Edited: 0011262 View Revisions
2014-04-14 17:32 frisch Note Added: 0011269
2014-04-14 17:44 frisch Note Added: 0011270
2014-04-14 18:01 lpw25 Note Added: 0011271
2014-04-14 18:03 frisch Note Added: 0011272
2014-04-14 18:17 lpw25 Note Added: 0011273
2014-04-14 18:19 lpw25 Note Edited: 0011273 View Revisions
2014-04-14 19:25 frisch Note Added: 0011275
2014-04-14 19:43 lpw25 Note Added: 0011276
2014-04-15 12:14 garrigue Note Added: 0011280
2014-04-15 12:44 garrigue Note Added: 0011281
2014-04-15 13:39 lpw25 Note Added: 0011282
2014-04-15 13:58 frisch Note Added: 0011283
2014-04-23 18:57 lpw25 Note Added: 0011318
2014-04-24 07:57 garrigue Note Added: 0011319
2014-04-27 14:31 lpw25 File Added: extension.patch-4.0.0
2014-04-30 01:39 lpw25 File Added: extension.patch-4.0.1
2014-04-30 01:40 lpw25 Note Added: 0011335
2014-05-02 21:48 lpw25 File Added: extension.patch-4.0.2
2014-05-05 01:14 garrigue Note Added: 0011344
2014-05-05 01:14 garrigue Status acknowledged => resolved
2014-05-05 01:14 garrigue Fixed in Version => 4.02.0+dev
2014-05-05 01:14 garrigue Resolution open => fixed
2014-05-05 01:14 garrigue Assigned To => garrigue
2014-05-05 07:08 gasche Note Added: 0011345
2014-05-05 09:43 frisch Note Added: 0011346
2014-05-05 12:25 lpw25 Note Added: 0011348
2014-05-05 12:25 lpw25 Note Edited: 0011348 View Revisions
2014-05-05 18:10 ybarnoy Note Added: 0011357


Copyright © 2000 - 2011 MantisBT Group
Powered by Mantis Bugtracker